Neural Architecture Search (NAS) is an automatic technique that searches for well-performing architectures for a specific task. Although NAS surpasses human-designed architectures in many fields, the high computational cost of architecture evaluation hinders its development. A feasible solution is to evaluate metrics directly on the architecture at initialization, without any training. The NAS without training (WOT) score is such a metric: it estimates the final trained accuracy of an architecture from its ability to distinguish different inputs at the activation layers. However, the WOT score is not an atomic metric, meaning that it does not represent a fundamental indicator of the architecture. The contributions of this paper are threefold. First, we decouple WOT into two atomic metrics, which represent the distinguishing ability of the network and the number of activation units, and explore better combination rules, named the Distinguishing Activation Score (DAS). We prove the correctness of the decoupling theoretically and confirm the effectiveness of the rules experimentally. Second, to improve the prediction accuracy of DAS to meet practical search requirements, we propose a fast training strategy. When DAS is used in combination with the fast training strategy, it yields further improvements. Third, we propose a dataset called Darts-training-bench (DTB), which fills the gap left by existing datasets, none of which record the training states of architectures. Our proposed method achieves 1.04$\times$ to 1.56$\times$ improvements on NAS-Bench-101, Network Design Spaces, and the proposed DTB.
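To make the WOT score concrete, here is a minimal NumPy sketch of the commonly cited training-free formulation: binarize the ReLU activations of an untrained network on one mini-batch, build a kernel that counts the units on which two inputs agree, and take its log-determinant. The function name `wot_score` and the random one-layer "network" are assumptions for illustration. The sketch does not implement DAS, whose combination rule the abstract does not state.

```python
import numpy as np

def wot_score(pre_activations: np.ndarray) -> float:
    """Log-determinant score over binary ReLU activation codes.

    pre_activations: shape (batch, units), the pre-ReLU values of an
    untrained network for one mini-batch (hypothetical input format).
    """
    codes = (pre_activations > 0).astype(np.float64)  # 1 = unit is active
    # agreements[i, j] = number of units on which inputs i and j share a
    # state, i.e. (number of units) - (Hamming distance between codes).
    agreements = codes @ codes.T + (1.0 - codes) @ (1.0 - codes).T
    _, logdet = np.linalg.slogdet(agreements)
    return float(logdet)

# Illustrative usage with a random single-layer "network" (assumption).
rng = np.random.default_rng(0)
inputs = rng.normal(size=(8, 16))
weights = rng.normal(size=(16, 64))
print(wot_score(inputs @ weights))
```

Each diagonal entry of the kernel equals the number of units, so the raw score mixes the unit count with how well the inputs are separated. That is one way to see why the score is not atomic.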
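A key challenge in neural architecture search (NAS) is quickly inferring the predictive performance of a broad spectrum of networks in order to discover statistically accurate and computationally efficient ones. We refer to this task as model performance inference (MPI). The current practice for efficient MPI is gradient-based methods, which leverage the gradients of a network at initialization to infer its performance. However, existing gradient-based methods rely only on heuristic metrics and lack the theoretical foundations needed to consolidate their designs. We propose GradSign, an accurate, simple, and flexible metric for model performance inference with theoretical insights. The key idea behind GradSign is a quantity $\Psi$ for analyzing the optimization landscape of different networks at the granularity of individual training samples. Theoretically, we show that both the training loss and the true population loss of a network are proportionally upper-bounded by $\Psi$ under reasonable assumptions. In addition, we design GradSign as an accurate and simple approximation of $\Psi$ that uses the gradients of a network evaluated at a random initialization state. Evaluation on seven NAS benchmarks across three training datasets shows that GradSign generalizes well to real-world networks and consistently outperforms state-of-the-art gradient-based MPI methods, as measured by Spearman's $\rho$ and Kendall's Tau. Additionally, we integrate GradSign into four existing NAS algorithms and show that the GradSign-assisted NAS algorithms outperform their vanilla counterparts, improving the accuracy of the best-discovered networks by up to 0.3%, 1.1%, and 1.0% on three real-world tasks.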
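As a rough illustration of a sign-agreement score on per-sample gradients at initialization, the sketch below sums, over parameters, the absolute value of the summed gradient signs across samples. This follows our understanding of the published GradSign metric; the abstract itself does not give the formula, so treat the exact form as an assumption. The function name and the input layout (one row of flattened gradients per training sample) are hypothetical.

```python
import numpy as np

def sign_agreement_score(per_sample_grads) -> float:
    """Sum over parameters of |sum over samples of sign(gradient)|.

    per_sample_grads: shape (n_samples, n_params); row i is the flattened
    loss gradient for training sample i at random initialization.
    """
    signs = np.sign(np.asarray(per_sample_grads, dtype=float))
    # A parameter scores high when most samples push it in the same direction.
    return float(np.abs(signs.sum(axis=0)).sum())

grads = [[1.0, -2.0],
         [3.0, 4.0]]
print(sign_agreement_score(grads))  # → 2.0
```

In the example, both samples agree on the sign of the first parameter (contribution 2) and disagree on the second (contribution 0).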
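Neural architecture search (NAS) can automatically design architectures for deep neural networks (DNNs) and has become one of the hottest research topics in the machine learning community. However, NAS is often computationally expensive because a large number of DNNs must be trained during the search process. Performance predictors can greatly alleviate the prohibitive cost of NAS by directly predicting the performance of DNNs. However, building a satisfactory performance predictor depends heavily on having enough trained DNN architectures, which are difficult to obtain in most scenarios. To solve this critical issue, we propose an effective DNN architecture augmentation method named GIAug. Specifically, we first propose a mechanism based on graph isomorphism, which has the merit of efficiently generating $\boldsymbol{n}!$ (i.e., the factorial of $\boldsymbol{n}$) diverse annotated architectures from a single architecture with $\boldsymbol{n}$ nodes. In addition, we design a generic method to encode the architectures into a form suitable for most prediction models. As a result, GIAug can be flexibly used by various existing performance-predictor-based NAS algorithms. We perform extensive experiments on the CIFAR-10 and ImageNet benchmark datasets over small-, medium-, and large-scale search spaces. The experiments show that GIAug can significantly enhance the performance of most state-of-the-art peer predictors. In addition, compared with state-of-the-art NAS algorithms, GIAug can save up to three orders of magnitude of computation cost on ImageNet.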
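The graph-isomorphism idea can be sketched as follows: relabelling the nodes of an architecture graph yields a differently encoded but equivalent architecture, so a single annotated architecture with $n$ nodes gives up to $n!$ training samples that share one accuracy label. This is a minimal sketch under assumed conventions (an adjacency matrix plus a list of per-node operation names), not GIAug's actual encoding.

```python
from itertools import permutations

import numpy as np

def isomorphic_variants(adjacency, ops):
    """Yield (adjacency, ops) pairs for every relabelling of the nodes.

    adjacency[i][j] == 1 means an edge from node i to node j;
    ops[i] is the operation name at node i (hypothetical encoding).
    """
    adjacency = np.asarray(adjacency)
    for perm in permutations(range(len(ops))):
        p = list(perm)
        # New node k corresponds to old node p[k]; rows and columns of the
        # adjacency matrix are permuted together so edges are preserved.
        yield adjacency[np.ix_(p, p)], [ops[i] for i in p]

adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
ops = ["input", "conv3x3", "output"]
variants = list(isomorphic_variants(adj, ops))
print(len(variants))  # → 6, i.e. 3!
```

Every variant describes the same computation graph, so each can reuse the accuracy measured for the original architecture.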
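Although Differentiable Architecture Search (DARTS) has become the mainstream paradigm in Neural Architecture Search (NAS) due to its simplicity and efficiency, recent works have found that the performance of the searched architecture barely increases as DARTS optimization proceeds, and that the final magnitudes obtained by DARTS hardly indicate the importance of operations. These observations suggest that the supervision signal in DARTS may be a poor or unreliable indicator for architecture search, which motivates a fundamental and interesting question: can we measure the importance of operations without any training under the differentiable paradigm? We provide an affirmative answer by recasting NAS as a problem of network pruning at initialization. Leveraging the recently proposed synaptic saliency criteria from network pruning at initialization, we score the importance of candidate operations in differentiable NAS without any training, and accordingly propose a novel framework called training-free differentiable architecture search (FreeDARTS). We show that, without any training, FreeDARTS with different proxy metrics can outperform most NAS baselines in different search spaces. More importantly, FreeDARTS is extremely memory-efficient and computationally efficient because it abandons training in the architecture search phase, which makes it possible to search over a more flexible space and to eliminate the depth gap between architecture search and evaluation. We hope our work inspires more attempts to solve NAS from the perspective of pruning at initialization.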
Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures. In this work, in order to help ground the empirical results in this field, we propose new NAS baselines that build off the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search is a competitive baseline for hyperparameter optimization. Leveraging these observations, we evaluate both random search with early-stopping and a novel random search with weight-sharing algorithm on two standard NAS benchmarks: PTB and CIFAR-10. Our results show that random search with early-stopping is a competitive NAS baseline, e.g., it performs at least as well as ENAS [41], a leading NAS method, on both benchmarks. Additionally, random search with weight-sharing outperforms random search with early-stopping, achieving a state-of-the-art NAS result on PTB and a highly competitive result on CIFAR-10. Finally, we explore the existing reproducibility issues of published NAS results. We note the lack of source material needed to exactly reproduce these results, and further discuss the robustness of published results given the various sources of variability in NAS experimental setups. Relatedly, we provide all information (code, random seeds, documentation) needed to exactly reproduce our results, and report our random search with weight-sharing results for each benchmark on multiple runs.
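One of the key challenges in Neural Architecture Search (NAS) is to rank the performance of architectures efficiently. The mainstream assessment of performance rankers uses ranking correlations (e.g., Kendall's tau), which pay equal attention to the whole space. However, the optimization goal of NAS is to identify top architectures while paying less attention to the other architectures in the search space. In this paper, we show both empirically and theoretically that Normalized Discounted Cumulative Gain (NDCG) is a better metric for rankers. Subsequently, we propose a new algorithm, AceNAS, which directly optimizes NDCG with LambdaRank. It also leverages weak labels produced by weight-sharing NAS to pre-train the ranker, further reducing the search cost. Extensive experiments on 12 NAS benchmarks and a large-scale search space demonstrate that our approach consistently outperforms SOTA NAS methods, with up to 3.67% accuracy improvement and an 8x reduction in search cost.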
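A small worked example may help show why NDCG weights the top of the ranking more heavily than Kendall's tau does. The sketch below uses the true accuracy directly as the gain and a $1/\log_2(\text{rank}+1)$ discount; both are common conventions assumed here, not necessarily the exact definition used by AceNAS. The function name `ndcg_at_k` and the toy accuracies are illustrative.

```python
import numpy as np

def ndcg_at_k(true_acc, predicted_score, k: int) -> float:
    """NDCG@k with gain = true accuracy and discount = 1 / log2(rank + 1)."""
    true_acc = np.asarray(true_acc, dtype=float)
    predicted = np.asarray(predicted_score, dtype=float)
    order = np.argsort(-predicted)[:k]   # top-k according to the ranker
    ideal = np.argsort(-true_acc)[:k]    # top-k according to ground truth
    discounts = 1.0 / np.log2(np.arange(2, len(order) + 2))
    dcg = float((true_acc[order] * discounts).sum())
    idcg = float((true_acc[ideal] * discounts).sum())
    return dcg / idcg

true_acc = [0.9, 0.8, 0.7, 0.6]
swap_top = [3, 4, 2, 1]      # ranker confuses the two best architectures
swap_bottom = [4, 3, 1, 2]   # ranker confuses the two worst architectures
print(ndcg_at_k(true_acc, swap_top, 4), ndcg_at_k(true_acc, swap_bottom, 4))
```

Both rankers have exactly one discordant pair out of six, so their Kendall's tau is identical, yet the ranker that confuses the two best architectures receives the lower NDCG.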
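Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost through weight sharing and continuous relaxation. However, more recent works find that existing differentiable NAS techniques struggle to outperform naive baselines, yielding deteriorating architectures as the search proceeds. Rather than directly optimizing the architecture parameters, this paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions. By leveraging natural-gradient variational inference (NGVI), the architecture distribution can be easily optimized on top of existing codebases without incurring additional memory and computational cost. We demonstrate how differentiable NAS benefits from Bayesian principles, which enhance exploration and improve stability. Experimental results on the NAS-Bench-201 and NAS-Bench-1shot1 benchmark datasets confirm the significant improvements the proposed framework can make. In addition, instead of simply applying argmax to the learned parameters, we further leverage recently proposed training-free proxies in NAS to select the optimal architecture from a group of architectures drawn from the optimized distribution, achieving state-of-the-art results on the NAS-Bench-201 and NAS-Bench-1shot1 benchmarks. Our best architecture in the DARTS search space also obtains competitive test errors of 2.37%, 15.72%, and 24.2% on the CIFAR-10, CIFAR-100, and ImageNet datasets, respectively.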
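This work investigates the use of batch normalization in neural architecture search (NAS). Specifically, Frankle et al. find that training only BatchNorm can achieve nontrivial performance. Furthermore, Chen et al. claim that training only BatchNorm can speed up the training of the one-shot NAS supernet by more than ten times. Critically, there has been no effort to understand 1) why training only BatchNorm can find well-performing architectures with reduced supernet training time, and 2) what the difference is between the train-BN-only supernet and the standard-train supernet. We begin by showing that train-BN-only networks converge to the neural tangent kernel regime and theoretically obtain the same training dynamics as training all parameters. Our proof supports the claim that training only BatchNorm on the supernet requires less training time. Then, we empirically show that the train-BN-only supernet gives convolutions an advantage over other operators, causing unfair competition between architectures. This is because only the convolution operator is attached to BatchNorm. Through experiments, we show that this unfairness makes the search algorithm prone to selecting models with convolutions. To solve this issue, we introduce fairness into the search space by placing a BatchNorm layer on every operator. However, we observe that the performance predictor of Chen et al. is inapplicable to the new search space. To this end, we propose a novel composite performance indicator that evaluates networks from three perspectives derived from the theoretical properties of BatchNorm: expressivity, trainability, and uncertainty. We demonstrate the effectiveness of our approach on multiple NAS benchmarks (NAS-Bench-101, NAS-Bench-201) and search spaces (the DARTS search space and the MobileNet search space).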
Deep neural networks (DNNs) are found to be vulnerable to adversarial attacks, and various methods have been proposed for defense. Among these methods, adversarial training has been drawing increasing attention because of its simplicity and effectiveness. However, the performance of adversarial training is greatly limited by the architectures of the target DNNs, which often leaves the resulting DNNs with poor accuracy and unsatisfactory robustness. To address this problem, we propose DSARA to automatically search for neural architectures that are accurate and robust after adversarial training. In particular, we design a novel cell-based search space specifically for adversarial training, which improves the accuracy and the robustness upper bound of the searched architectures by carefully designing the placement of the cells and the proportional relationship of the filter numbers. Then we propose a two-stage search strategy to search for both accurate and robust neural architectures. In the first stage, the architecture parameters are optimized to minimize the adversarial loss, which makes full use of the effectiveness of adversarial training in enhancing robustness. In the second stage, the architecture parameters are optimized to minimize both the natural loss and the adversarial loss using the proposed multi-objective adversarial training method, so that the searched neural architectures are both accurate and robust. We evaluate the proposed algorithm on natural data and under various adversarial attacks, which reveals the superiority of the proposed method in terms of both accuracy and robustness. We also conclude that accurate and robust neural architectures tend to deploy very different structures near the input and the output, which has great practical significance for both the hand-crafting and the automatic design of accurate and robust neural architectures.
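Neural architecture search methods seek optimal candidates through efficient weight-sharing supernet training. However, recent studies indicate poor ranking consistency between the performance of stand-alone architectures and that of shared-weight networks. In this paper, we present Prior-Guided One-shot NAS (PGONAS) to strengthen the ranking correlation of supernets. Specifically, we first explore the effect of activation functions and propose a balanced sampling strategy based on the Sandwich Rule to alleviate weight coupling in the supernet. Then, FLOPs and Zen-Score are adopted to guide the training of the supernet with a ranking correlation loss. Our PGONAS ranks third in the supernet track of the CVPR2022 Second Lightweight NAS Challenge. Code is available at https://github.com/pprp/cvpr2022-nas?competition-track1-3th-solution.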
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
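Recently, a large number of deep learning based approaches have been successfully applied to various remote sensing image (RSI) recognition tasks. However, most existing advances of deep learning methods in the RSI field rely heavily on features extracted by manually designed backbone networks, which severely limits the potential of deep learning models given the complexity of RSI and the limitations of prior knowledge. In this paper, we study a new design paradigm for the backbone architecture in RSI recognition tasks, including scene classification, land-cover classification, and object detection. We propose a novel one-shot architecture search framework based on a weight-sharing strategy and an evolutionary algorithm, called RSBNet, which consists of three stages. First, a supernet constructed in a layer-wise search space is pre-trained on a self-assembled large-scale RSI dataset using an ensemble single-path training strategy. Next, the pre-trained supernet is equipped with different recognition heads through a switchable recognition module and fine-tuned separately on each target dataset to obtain task-specific supernets. Finally, we search for the optimal backbone architecture for each recognition task using the evolutionary algorithm, without any network training. Extensive experiments on five benchmark datasets for different recognition tasks show the effectiveness of the proposed search paradigm and demonstrate that the searched backbones can flexibly adapt to different RSI recognition tasks and achieve impressive performance.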
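Neural architecture search (NAS) is a powerful tool for automating the design of effective image-processing DNNs. Ranking has been advocated as a way to design efficient performance predictors for NAS. The previous contrastive method solves the ranking problem by comparing pairs of architectures and predicting their relative performance. However, it focuses only on the ranking between the two architectures involved and neglects the overall quality distribution of the search space, which may lead to generalization issues. To tackle the problems caused by this local perspective, we propose a predictor, the Neural Architecture Ranker (NAR), which concentrates on the global quality tier of a specific architecture. NAR explores the quality tiers of the search space globally and classifies each architecture into the tier it belongs to according to its global ranking. The predictor thus gains knowledge of the performance distribution of the search space, which helps it generalize its ranking ability to new datasets more easily. Meanwhile, the global quality distribution facilitates the search phase by allowing candidates to be sampled directly according to the statistics of the quality tiers. This requires no training of a search algorithm such as Reinforcement Learning (RL) or an Evolutionary Algorithm (EA), so it simplifies the NAS pipeline and saves computational overhead. The proposed NAR achieves better performance than state-of-the-art methods on two widely used datasets for NAS research. On the vast search space of NAS-Bench-101, NAR easily finds an architecture with top 0.01‰ performance. It also generalizes well to the different image datasets of NAS-Bench-201, i.e., CIFAR-10, CIFAR-100, and ImageNet-16-120, by identifying the optimal architecture for each of them.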
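Neural architecture search (NAS) has recently become increasingly popular in the deep learning community, mainly because it gives interested users without rich expertise an opportunity to benefit from the success of deep neural networks (DNNs). However, NAS is still laborious and time-consuming because a large number of performance estimations are required during the search process, and training DNNs is computationally intensive. To address this major limitation, improving efficiency is essential in the design of NAS. This paper begins with a brief introduction to the general framework of NAS. Then, methods for evaluating network candidates under proxy metrics are systematically discussed. This is followed by a description of surrogate-assisted NAS, which is divided into three categories: Bayesian optimization for NAS, surrogate-assisted evolutionary algorithms for NAS, and MOP for NAS. Finally, remaining challenges and open research questions are discussed, and promising research topics in this emerging field are suggested.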
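Neural architecture search (NAS) aims to automate the architecture design process and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures that require few computational resources. Ordinary NAS methods incur tremendous computational costs because model training is repeated many times, whereas one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search process, has been reported to lower the search cost. This study focuses on architecture-complexity-aware one-shot NAS, which optimizes an objective function composed of a weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search process must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims to reduce the search cost associated with finding multiple architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using the samples obtained from all of the distributions, based on importance sampling. This allows us to obtain multiple architectures with different complexities in a single architecture search, which reduces the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, the proposed method finds multiple architectures with varying complexities while requiring less computational effort.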
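Neural architecture search (NAS) has significantly improved productivity in the design and deployment of neural networks (NNs). Because NAS typically evaluates multiple models by training them partially or completely, the improved productivity comes at the cost of a significant carbon footprint. To alleviate this expensive training routine, zero-shot/zero-cost proxies analyze an NN at initialization to generate a score that correlates highly with its true accuracy. Zero-cost proxies are currently designed by experts who conduct multiple cycles of empirical testing over possible algorithms, datasets, and neural architecture design spaces. This lowers productivity and is an unsustainable approach to zero-cost proxy design as deep learning use cases diversify. Additionally, existing zero-cost proxies fail to generalize across neural architecture design spaces. In this paper, we propose a genetic programming framework to automate the discovery of zero-cost proxies for neural architecture scoring. Our methodology efficiently discovers an interpretable and generalizable zero-cost proxy that gives state-of-the-art score-accuracy correlation on all datasets and search spaces of NASBench-201 and Network Design Spaces (NDS). We believe that this research indicates a promising direction toward automatically discovering zero-cost proxies that work across network architecture design spaces, datasets, and tasks.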
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation. NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations. Recent NAS works start to explore indicators that can predict a network's performance without training. However, they either leveraged limited properties of deep networks, or the benefits of their training-free indicators are not applied to more extensive search methods. By rigorous correlation analysis, we present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks - Trainability, Expressivity, Generalization - all assessed in a training-free manner. The TEG indicators could be scaled up and integrated with various NAS search methods, including both supernet and single-path approaches. Extensive studies validate the effective and efficient guidance from our TEG-NAS framework, leading to both improved search accuracy and over 56% reduction in search time cost. Moreover, we visualize search trajectories on three landscapes of "TEG" characteristics, observing that while a good local minimum is easier to find on NAS-Bench-201 given its simple topology, balancing "TEG" characteristics is much harder on the DARTS search space due to its complex landscape geometry. Our code is available at https://github.com/VITA-Group/TEGNAS.
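Many existing neural architecture search (NAS) solutions rely on downstream training for architecture evaluation, which requires enormous computation. Considering that this computation brings a large carbon footprint, this paper aims to explore a green (i.e., environmentally friendly) NAS solution that evaluates architectures without training. Intuitively, the gradients induced by the architecture itself directly determine the convergence and generalization results. This motivates us to propose the gradient kernel hypothesis: gradients can be used as a coarse-grained proxy for downstream training to evaluate randomly initialized networks. To support the hypothesis, we conduct a theoretical analysis and find a practical gradient kernel that correlates well with training loss and validation performance. Based on this hypothesis, we propose a new kernel-based architecture search approach, KNAS. Experiments show that KNAS achieves competitive results orders of magnitude faster than "train-then-test" paradigms on image classification tasks. Furthermore, its extremely low search cost enables wide application. The searched network also outperforms the strong baseline RoBERTa-large on two text classification tasks. Code is available at https://github.com/jingjing-nlp/knas.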
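One simple way to turn per-sample gradients into a kernel-based score is to average the entries of their Gram matrix, as sketched below. We believe this is close in spirit to the practical gradient kernel the abstract mentions, but the abstract does not define it, so the exact statistic is an assumption. The function name and input layout are hypothetical.

```python
import numpy as np

def mean_gram_score(per_sample_grads) -> float:
    """Mean entry of the Gram matrix of per-sample gradients.

    per_sample_grads: shape (n_samples, n_params); row i is the flattened
    loss gradient for sample i at random initialization.
    """
    g = np.asarray(per_sample_grads, dtype=float)
    gram = g @ g.T  # gram[i, j] = <grad_i, grad_j>
    return float(gram.mean())

print(mean_gram_score([[1.0, 0.0], [0.0, 1.0]]))  # → 0.5
```

Aligned gradients (large positive inner products) raise the score, while orthogonal or conflicting gradients lower it.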
Deep Learning has enabled remarkable progress over the last years on a variety of tasks, such as image recognition, speech recognition, and machine translation. One crucial aspect of this progress is novel neural architectures. Currently employed architectures have mostly been developed manually by human experts, which is a time-consuming and error-prone process. Because of this, there is growing interest in automated neural architecture search methods. We provide an overview of existing work in this field of research and categorize it according to three dimensions: search space, search strategy, and performance estimation strategy.
Recent advances in neural architecture search (NAS) demand tremendous computational resources, which makes it difficult to reproduce experiments and imposes a barrier-to-entry to researchers without access to large-scale computation. We aim to ameliorate these problems by introducing NAS-Bench-101, the first public architecture dataset for NAS research. To build NAS-Bench-101, we carefully constructed a compact, yet expressive, search space, exploiting graph isomorphisms to identify 423k unique convolutional architectures. We trained and evaluated all of these architectures multiple times on CIFAR-10 and compiled the results into a large dataset of over 5 million trained models. This allows researchers to evaluate the quality of a diverse range of models in milliseconds by querying the precomputed dataset. We demonstrate its utility by analyzing the dataset as a whole and by benchmarking a range of architecture optimization algorithms.