Recent advances in neural architecture search (NAS) demand tremendous computational resources, which makes it difficult to reproduce experiments and imposes a barrier to entry for researchers without access to large-scale computation. We aim to ameliorate these problems by introducing NAS-Bench-101, the first public architecture dataset for NAS research. To build NAS-Bench-101, we carefully constructed a compact, yet expressive, search space, exploiting graph isomorphisms to identify 423k unique convolutional architectures. We trained and evaluated all of these architectures multiple times on CIFAR-10 and compiled the results into a large dataset of over 5 million trained models. This allows researchers to evaluate the quality of a diverse range of models in milliseconds by querying the precomputed dataset. We demonstrate its utility by analyzing the dataset as a whole and by benchmarking a range of architecture optimization algorithms.
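A minimal sketch of what querying such a tabular benchmark looks like, loosely following the interface of the released nasbench package (the exact argument names and returned fields are assumptions and may differ from the actual API):

```python
from nasbench import api  # assumed import path of the released package

# Load the precomputed table of trained models (file name is illustrative).
nasbench = api.NASBench('nasbench_only108.tfrecord')

# A cell is an upper-triangular adjacency matrix over 7 nodes plus an
# operation label per node; this example uses 9 edges, the allowed maximum.
cell = api.ModelSpec(
    matrix=[[0, 1, 1, 1, 0, 1, 0],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 1, 0, 0],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 0]],
    ops=['input', 'conv1x1-bn-relu', 'conv3x3-bn-relu', 'conv3x3-bn-relu',
         'conv3x3-bn-relu', 'maxpool3x3', 'output'])

# Querying returns precomputed metrics in milliseconds instead of hours of training.
metrics = nasbench.query(cell)
print(metrics['validation_accuracy'], metrics['training_time'])  # assumed field names
```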
The standard paradigm in neural architecture search (NAS) is to search for a fully deterministic architecture with specific operations and connections. In this work, we instead propose to search for the optimal operation distribution, which provides a stochastic, approximate solution that can be used to sample architectures of arbitrary length. We propose and show that, given an architecture cell, its performance largely depends on the ratio of the operations used rather than on any specific connection pattern in typical search spaces; that is, small changes in the ordering of operations are often irrelevant. This intuition is orthogonal to any particular search strategy and can be applied to a diverse set of NAS algorithms. Through extensive validation on 4 datasets and 4 NAS techniques (Bayesian optimization, differentiable search, local search, and random search), we show that the operation distribution (1) retains enough discriminative power to reliably identify good solutions and (2) is significantly easier to optimize than the traditional encoding, leading to large speed-ups at almost no cost in performance. In effect, this simple intuition significantly reduces the cost of current methods and may enable NAS to be used in a much broader range of applications.
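As an illustration of the idea (the operation set, node indexing, and function names below are hypothetical, not the authors' code), sampling architectures from an operation distribution rather than a fixed wiring could look like:

```python
import numpy as np

OPS = ["conv3x3", "conv1x1", "maxpool3x3", "skip"]

def sample_cell(op_distribution, num_nodes, rng=np.random.default_rng(0)):
    """Sample a cell of `num_nodes` operations from a categorical distribution over OPS."""
    ops = rng.choice(OPS, size=num_nodes, p=op_distribution)
    # Per the abstract, the exact connection pattern matters far less than the
    # operation ratios, so each node simply reads from a random earlier tensor:
    # inputs[i] is 0 for the cell input, or k for the output of node k (1-based).
    inputs = [int(rng.integers(0, i + 1)) for i in range(num_nodes)]
    return list(zip(ops.tolist(), inputs))

print(sample_cell(op_distribution=[0.4, 0.2, 0.2, 0.2], num_nodes=5))
```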
Modern deep learning methods are very sensitive to many hyperparameters, and, due to the long training times of state-of-the-art models, vanilla Bayesian hyperparameter optimization is typically computationally infeasible. On the other hand, bandit-based configuration evaluation approaches based on random search lack guidance and do not converge to the best configurations as quickly. Here, we propose to combine the benefits of both Bayesian optimization and bandit-based methods, in order to achieve the best of both worlds: strong anytime performance and fast convergence to optimal configurations. We propose a new practical state-of-the-art hyperparameter optimization method, which consistently outperforms both Bayesian optimization and Hyperband on a wide range of problem types, including high-dimensional toy functions, support vector machines, feed-forward neural networks, Bayesian neural networks, deep reinforcement learning, and convolutional neural networks. Our method is robust and versatile, while at the same time being conceptually simple and easy to implement.
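A rough sketch of how the two ingredients might fit together, under stated simplifications: the actual method fits a TPE-style kernel density estimator over observed configurations, whereas the toy proposal below merely perturbs a previously good configuration, and configurations are assumed to be vectors in the unit hypercube. All names are illustrative, not the authors' implementation.

```python
import numpy as np

def propose(history, sample_uniform, top_frac=0.15, rng=np.random.default_rng(0)):
    """Model-based proposal: sample near one of the best configurations seen so far."""
    if len(history) < 10:                  # fall back to random search early on
        return sample_uniform(rng)
    ranked = sorted(history, key=lambda h: h[1])                 # (config, loss), best first
    good = [cfg for cfg, _ in ranked[: max(1, int(top_frac * len(ranked)))]]
    centre = good[rng.integers(len(good))]
    return np.clip(centre + 0.1 * rng.normal(size=centre.shape), 0.0, 1.0)

def successive_halving(evaluate, sample_uniform, history,
                       n=27, min_budget=1, eta=3, rounds=3):
    """One Hyperband-style bracket whose candidates come from the model-based proposal."""
    configs = [propose(history, sample_uniform) for _ in range(n)]
    budget = min_budget
    for _ in range(rounds):
        losses = [evaluate(c, budget) for c in configs]           # cheap, partial-budget evaluations
        history.extend(zip(configs, losses))
        keep = np.argsort(losses)[: max(1, len(configs) // eta)]  # advance the best 1/eta
        configs = [configs[i] for i in keep]
        budget *= eta
    return configs[0]
```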
Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures. In this work, in order to help ground the empirical results in this field, we propose new NAS baselines that build off the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search is a competitive baseline for hyperparameter optimization. Leveraging these observations, we evaluate both random search with early-stopping and a novel random search with weight-sharing algorithm on two standard NAS benchmarks: PTB and CIFAR-10. Our results show that random search with early-stopping is a competitive NAS baseline, e.g., it performs at least as well as ENAS [41], a leading NAS method, on both benchmarks. Additionally, random search with weight-sharing outperforms random search with early-stopping, achieving a state-of-the-art NAS result on PTB and a highly competitive result on CIFAR-10. Finally, we explore the existing reproducibility issues of published NAS results. We note the lack of source material needed to exactly reproduce these results, and further discuss the robustness of published results given the various sources of variability in NAS experimental setups. Relatedly, we provide all information (code, random seeds, documentation) needed to exactly reproduce our results, and report our random search with weight-sharing results for each benchmark over multiple runs.
Deep Learning has enabled remarkable progress over the last years on a variety of tasks, such as image recognition, speech recognition, and machine translation. One crucial aspect of this progress is the development of novel neural architectures. Currently employed architectures have mostly been developed manually by human experts, which is a time-consuming and error-prone process. Because of this, there is growing interest in automated neural architecture search methods. We provide an overview of existing work in this field of research and categorize it according to three dimensions: search space, search strategy, and performance estimation strategy.
The effort devoted to hand-crafting neural network image classifiers has motivated the use of architecture search to discover them automatically. Although evolutionary algorithms have been repeatedly applied to neural network topologies, the image classifiers thus discovered have remained inferior to human-crafted ones. Here, we evolve an image classifier, AmoebaNet-A, that surpasses hand-designs for the first time. To do this, we modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes. At matching size, AmoebaNet-A has comparable accuracy to current state-of-the-art ImageNet models discovered with more complex architecture-search methods. Scaled to larger size, AmoebaNet-A sets a new state-of-the-art 83.9% top-1 / 96.6% top-5 ImageNet accuracy. In a controlled comparison against a well-known reinforcement learning algorithm, we give evidence that evolution can obtain results faster with the same hardware, especially at the earlier stages of the search. This is relevant when fewer compute resources are available. Evolution is, thus, a simple method to effectively discover high-quality architectures. Related work: review papers provide informative surveys of earlier [18, 49] and more recent [15] literature on image classifier architecture search, including successful RL studies [2, 6, 29, 52, 53, 54] and evolutionary studies like those mentioned in Section 1. After our submission, a recent preprint further scaled up and retrained AmoebaNet-A to reach 84.3% top-1 / 97.0% top-5 ImageNet accuracy [25].
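The aging-evolution loop itself is simple enough to sketch; the following mirrors the high-level procedure described in the abstract, with random_architecture, mutate, and train_and_eval left as search-space-specific placeholders (train_and_eval is assumed to return a validation accuracy, higher is better):

```python
import collections
import random

def aging_evolution(random_architecture, mutate, train_and_eval,
                    population_size=100, sample_size=25, cycles=1000):
    population = collections.deque()           # oldest genotype sits on the left
    history = []
    while len(population) < population_size:   # seed the population with random models
        arch = random_architecture()
        model = (arch, train_and_eval(arch))
        population.append(model)
        history.append(model)
    while len(history) < cycles:               # evolve
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda m: m[1])     # winner of the tournament
        child_arch = mutate(parent[0])
        child = (child_arch, train_and_eval(child_arch))
        population.append(child)
        population.popleft()                    # age-based removal: discard the oldest, not the worst
        history.append(child)
    return max(history, key=lambda m: m[1])
```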
To achieve peak predictive performance, hyperparameter optimization (HPO) is a crucial component of machine learning and its applications. Over the last years, the number of efficient algorithms and tools for HPO has grown substantially. At the same time, the community still lacks realistic, diverse, computationally cheap, and standardized benchmarks; this is especially the case for multi-fidelity HPO methods. To close this gap, we propose HPOBench, which includes 7 existing and 5 new benchmark families with a total of more than 100 multi-fidelity benchmark problems. HPOBench allows this extendable set of multi-fidelity HPO benchmarks to be run in a reproducible way by isolating and packaging the individual benchmarks in containers. It also provides surrogate and tabular benchmarks for computationally affordable yet statistically sound evaluations. To demonstrate HPOBench's broad compatibility with various optimization tools, as well as its usefulness, we conduct an exemplary large-scale study of 13 optimizers from 6 optimization tools. HPOBench is available at https://github.com/automl/hpobench.
The automated machine learning (AutoML) field has become increasingly relevant in recent years. These algorithms can develop models without the need for expert knowledge, facilitating the application of machine learning techniques in industry. Neural Architecture Search (NAS) exploits deep learning techniques to autonomously produce neural network architectures whose results rival the state-of-the-art models hand-crafted by AI experts. However, this approach requires significant computational resources and hardware investments, making it less appealing for real-world applications. This article presents the third version of Pareto-Optimal Progressive Neural Architecture Search (POPNASv3), a new sequential model-based optimization NAS algorithm targeting different hardware environments and multiple classification tasks. Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks. The algorithm employs Pareto optimality to reduce the number of architectures sampled during the search, drastically improving time efficiency without loss in accuracy. The experiments performed on image and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
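A minimal sketch of the Pareto-optimality filter mentioned above; the two objectives (predicted accuracy, higher is better, and predicted training time, lower is better) and the field names are illustrative:

```python
def pareto_front(candidates):
    """candidates: list of dicts with 'accuracy' and 'time' keys; returns the non-dominated ones."""
    front = []
    for c in candidates:
        dominated = any(
            o["accuracy"] >= c["accuracy"] and o["time"] <= c["time"]
            and (o["accuracy"] > c["accuracy"] or o["time"] < c["time"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

print(pareto_front([
    {"accuracy": 0.92, "time": 120},
    {"accuracy": 0.90, "time": 60},
    {"accuracy": 0.91, "time": 200},   # dominated by the first candidate, so dropped
]))
```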
While early research on neural architecture search (NAS) required extreme computational resources, the recent release of tabular and surrogate benchmarks has greatly increased the speed and reproducibility of NAS research. However, two of the most popular benchmarks do not provide the full training information for each architecture. As a result, on these benchmarks it is not possible to run many types of multi-fidelity techniques, such as learning curve extrapolation, that require evaluating architectures at arbitrary epochs. In this work, we present a method using singular value decomposition and noise modeling to create surrogate benchmarks, NAS-Bench-111, NAS-Bench-311, and NAS-Bench-NLP11, that output the full training information for each architecture rather than only the final validation accuracy. We demonstrate the power of using the full training information by introducing a learning curve extrapolation framework to modify single-fidelity algorithms, showing that it leads to improvements over popular single-fidelity algorithms that claimed state-of-the-art performance upon release. Our code and pretrained models are available at https://github.com/automl/nas-bench-x11.
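A rough sketch of the surrogate construction under stated assumptions: the noise model is omitted, the regressor is a generic stand-in rather than the models used in the paper, and the architecture encoding is left abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor  # placeholder surrogate model

def fit_curve_surrogate(arch_features, curves, rank=4):
    """arch_features: (num_archs, num_features); curves: (num_archs, num_epochs) accuracies."""
    mean = curves.mean(axis=0)
    # Compress the full learning curves with a truncated SVD ...
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    basis = vt[:rank]                               # (rank, num_epochs)
    coeffs = (curves - mean) @ basis.T              # low-dimensional per-architecture coefficients
    # ... and learn to predict only those coefficients from architecture features.
    model = RandomForestRegressor(n_estimators=100).fit(arch_features, coeffs)

    def predict_curve(x):
        # Reconstruct the whole training curve for a new architecture.
        return model.predict(np.atleast_2d(x)) @ basis + mean

    return predict_curve
```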
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
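A compressed sketch of the sequential model-based loop: cells grow one block at a time, a surrogate scores all expansions, and only the most promising ones are actually trained. The surrogate here is a generic regressor stand-in, and expand, encode, and train_and_eval are placeholders for the cell-expansion, encoding, and training routines.

```python
import numpy as np
from sklearn.linear_model import Ridge  # stand-in for the paper's learned surrogate

def progressive_search(seed_cells, expand, encode, train_and_eval, max_blocks=5, k=8):
    """Grow cells block by block, letting a surrogate choose which candidates to train."""
    surrogate = Ridge()
    cells = seed_cells
    scores = [train_and_eval(c) for c in cells]                   # train all simplest cells
    for _ in range(1, max_blocks):
        surrogate.fit(np.array([encode(c) for c in cells]), scores)
        candidates = [c2 for c in cells for c2 in expand(c)]      # add one more block
        predicted = surrogate.predict(np.array([encode(c) for c in candidates]))
        top = np.argsort(predicted)[::-1][:k]                     # keep the K most promising
        cells = [candidates[i] for i in top]
        scores = [train_and_eval(c) for c in cells]               # train only those
    best = int(np.argmax(scores))
    return cells[best], scores[best]
```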
There is growing interest in automating neural network architecture design. Existing architecture search methods can be computationally expensive, requiring thousands of different architectures to be trained from scratch. Recent work has explored weight sharing across models to amortize the cost of training. Although previous methods reduced the cost of architecture search by orders of magnitude, they remain complex, requiring hypernetworks or reinforcement learning controllers. We aim to understand weight sharing for one-shot architecture search. With careful experimental analysis, we show that it is possible to efficiently identify promising architectures from a complex search space without either hypernetworks or RL.
Neural architecture search (NAS) has been studied extensively and has grown into a research field with substantial impact. While classical single-objective NAS searches for the architecture with the best performance, multi-objective NAS considers multiple objectives that should be optimized simultaneously, e.g., minimizing resource usage alongside validation error. Although considerable progress has been made in the field of multi-objective NAS, we argue that there is some discrepancy between the actual optimization problem of practical interest and the optimization problem that multi-objective NAS tries to solve. We resolve this discrepancy by formulating the multi-objective NAS problem as a quality-diversity optimization (QDO) problem and introduce three quality-diversity NAS optimizers (two of them belonging to the group of multi-fidelity optimizers) that seek high-performing yet diverse architectures for application-specific niches, e.g., hardware constraints. By comparing these optimizers to their multi-objective counterparts, we demonstrate that quality-diversity optimization generally outperforms multi-objective NAS with respect to both the quality of the solutions and efficiency. We further show how applications and future NAS research can thrive on QDO.
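One way to picture the quality-diversity formulation is a MAP-Elites-style archive with one slot per application-specific niche; the sketch below uses latency bins as hypothetical niches and keeps the most accurate architecture found for each bin (all names and thresholds are illustrative, not the authors' optimizers):

```python
def update_archive(archive, arch, accuracy, latency_ms, bins=(10, 20, 50, 100)):
    """archive: dict mapping niche index -> (arch, accuracy); niches are latency bins."""
    niche = next((i for i, b in enumerate(bins) if latency_ms <= b), len(bins))
    if niche not in archive or accuracy > archive[niche][1]:
        archive[niche] = (arch, accuracy)   # keep only the best architecture per niche
    return archive

archive = {}
update_archive(archive, "cell-A", 0.93, 42.0)   # lands in the <=50 ms niche
update_archive(archive, "cell-B", 0.91, 46.0)   # same niche, lower accuracy: ignored
print(archive)
```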
Neural architecture search (NAS) can automatically design architectures for deep neural networks (DNNs) and has become one of the hottest research topics in the current machine learning community. However, NAS is often computationally expensive because a large number of DNNs need to be trained during the search process. Performance predictors can greatly alleviate the prohibitive cost of NAS by directly predicting the performance of DNNs. However, building a satisfactory performance predictor depends heavily on a sufficient number of trained DNN architectures, which are difficult to obtain in most cases. To address this critical issue, we propose an effective DNN architecture augmentation method named GIAug in this paper. Specifically, we first propose a mechanism based on graph isomorphism, which has the merit of efficiently generating a factorial number (i.e., $n!$) of diverse annotated architectures from a single architecture with $n$ nodes. In addition, we design a generic method to encode architectures into a form suitable for most prediction models. As a result, GIAug can be flexibly utilized by various performance-predictor-based NAS algorithms. We perform extensive experiments on the CIFAR-10 and ImageNet benchmark datasets with small-, medium-, and large-scale search spaces. The experiments show that GIAug can significantly enhance the performance of most state-of-the-art peer predictors. In addition, GIAug can save up to three orders of magnitude of computation cost on ImageNet compared with state-of-the-art NAS algorithms.
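A small sketch of the graph-isomorphism augmentation idea: relabeling the nodes of a cell yields many adjacency-matrix/operation-list pairs that all describe the same architecture and can therefore share its measured accuracy as extra training data for a predictor. The adjacency-plus-operations encoding is assumed, and the input and output nodes are kept fixed while the inner nodes are permuted.

```python
import itertools
import numpy as np

def isomorphic_variants(matrix, ops, accuracy):
    """Yield (permuted_matrix, permuted_ops, accuracy) for every relabeling of the inner nodes."""
    n = len(ops)
    inner = range(1, n - 1)                    # keep input (0) and output (n-1) in place
    for perm in itertools.permutations(inner):
        order = [0, *perm, n - 1]
        p = np.eye(n, dtype=int)[order]        # permutation matrix for this relabeling
        yield p @ matrix @ p.T, [ops[i] for i in order], accuracy

# Usage: one annotated cell with 3 inner nodes expands into 3! = 6 labeled training examples.
```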
Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone. Our goal is to minimize human participation, so we employ evolutionary algorithms to discover such networks automatically. Despite significant computational requirements, we show that it is now possible to evolve models with accuracies within the range of those published in the last year. Specifically, we employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions and reaching accuracies of 94.6% (95.6% for ensemble) and 77.0%, respectively. To do this, we use novel and intuitive mutation operators that navigate large search spaces; we stress that no human participation is required once evolution starts and that the output is a fully-trained model. Throughout this work, we place special emphasis on the repeatability of results, the variability in the outcomes and the computational requirements.
Most machine learning algorithms are configured by one or more hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and irreproducible manual process of trial and error to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods can be employed, e.g., based on resampling error estimation for supervised machine learning. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband, and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate the concepts from this work as supplementary files.
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
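The core relaxation is easy to sketch in PyTorch: every edge computes a softmax-weighted mixture of candidate operations, and the mixing weights are learnable architecture parameters updated by gradient descent. The operation set below is illustrative, not the paper's exact one, and the bilevel training schedule is omitted.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """One edge of the relaxed cell: a weighted sum of all candidate operations."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 1),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))   # architecture parameters

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# After the search, each edge is discretized to the single operation with the largest weight.
```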
Automated hyperparameter optimization (HPO) has gained great popularity and is an important ingredient of most automated machine learning frameworks. Nevertheless, the process of designing HPO algorithms is still an unsystematic and manual one: limitations of existing work are identified, and the proposed improvements, even if guided by expert knowledge, remain somewhat arbitrary. This rarely allows for a comprehensive understanding of which algorithmic components drive performance, and carries the risk of overlooking good algorithmic design choices. We present a principled approach to automated benchmark-driven algorithm design applied to multi-fidelity HPO (MF-HPO): first, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common HPO algorithms, and then present a configurable framework covering this space. To find the best candidate automatically and systematically, we follow a programming-by-optimization approach and search over the space of algorithm candidates via Bayesian optimization. We challenge whether the found design choices are necessary or could be replaced by more naive and simpler ones by performing an ablation analysis. We observe that a relatively simple configuration, in some ways simpler than established methods, performs very well as long as certain critical configuration parameters have the right values.
Neural architecture search is a promising research field dedicated to automating the design of neural network models. The field is growing rapidly, with a surge of methods ranging from Bayesian optimization to neuro-evolution, and applications in diverse scenarios. Yet, despite the tremendous progress, few studies have offered insight into the difficulty of the problem itself, and thus the success (or failure) of these methods remains unexplained. In this sense, the field of optimization has developed methods that highlight key aspects to characterize optimization problems; fitness landscape analysis stands out when it comes to characterizing them reliably and quantitatively. In this paper, we propose to use fitness landscape analysis to study the neural architecture search problem. In particular, we introduce the fitness landscape footprint, an aggregation of eight (8) general-purpose metrics to synthesize the landscape of an architecture search problem. We study two problems, the classical image classification benchmark CIFAR-10 and the remote-sensing problem So2Sat LCZ42. The results present a quantitative appraisal of the problems, allowing their relative difficulty and other characteristics, such as ruggedness or persistence, to be characterized, which helps tailor a search strategy to the problem. In addition, the footprint is a tool that enables the comparison of multiple problems.
One of the key challenges in neural architecture search (NAS) is to efficiently rank the performance of architectures. The mainstream assessment of performance rankers uses ranking correlations (e.g., Kendall's tau), which pay equal attention to the whole search space. However, the optimization goal of NAS is to identify top architectures, with much less attention paid to the other architectures in the search space. In this paper, we show, both empirically and theoretically, that Normalized Discounted Cumulative Gain (NDCG) is a better metric for rankers. Subsequently, we propose a new algorithm, AceNAS, which directly optimizes NDCG with LambdaRank. It also leverages the weak labels produced by weight-sharing NAS to pre-train the ranker, so as to further reduce the search cost. Extensive experiments on 12 NAS benchmarks and a large-scale search space demonstrate that our approach consistently outperforms SOTA NAS methods, with up to a 3.67% accuracy improvement and an 8x reduction in search cost.
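For concreteness, a linear-gain variant of NDCG over an architecture ranking can be computed as follows; this illustrates the metric itself (ranker scores vs. ground-truth accuracies), not the authors' LambdaRank training code:

```python
import numpy as np

def ndcg(true_scores, predicted_scores, k=None):
    """NDCG of the ranking induced by `predicted_scores`, judged against `true_scores`."""
    true_scores = np.asarray(true_scores, dtype=float)
    order = np.argsort(predicted_scores)[::-1]                 # best predicted first
    gains = true_scores[order][:k]                             # linear gains (ground-truth accuracy)
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))    # positions 1, 2, ... discounted
    dcg = float(np.sum(gains * discounts))
    ideal = np.sort(true_scores)[::-1][:k]                     # best possible ordering
    idcg = float(np.sum(ideal * discounts[: len(ideal)]))
    return dcg / idcg if idcg > 0 else 0.0

# A ranker that places the truly best architectures near the top scores close to 1,
# even if it scrambles the ordering of mediocre architectures further down.
```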
Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks such as image classification on CIFAR or ImageNet. This makes the performance of NAS methods in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite for evaluating methods on domains beyond those traditionally studied in architecture search, and use it to address the following question: do state-of-the-art NAS methods perform well on diverse tasks? To construct the benchmark, we curate ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. Each task is carefully chosen to interoperate with modern CNN-based search methods while possibly being far afield from their original development domain. To speed up and reduce the cost of NAS research, for two of the tasks we release the precomputed performance of 15,625 architectures comprising a standard CNN search space. Experimentally, we show the need for the more robust NAS evaluation that NAS-Bench-360 enables, demonstrating that several modern NAS procedures perform inconsistently across the ten tasks, with many catastrophically poor results. We also demonstrate how NAS-Bench-360 and its associated precomputed results will enable future scientific discoveries by testing several recent hypotheses promoted in the NAS literature. NAS-Bench-360 is hosted at https://nb360.ml.cmu.edu.