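In this paper, we present MENAS, an efficient multi-trial evolution-based NAS method that requires little human intervention. Specifically, we propose an enlarged search space (MobileNet3-MT) for ImageNet-1K and improve search efficiency in two respects. First, MENAS jointly explores architectures and their optimal pruned candidates (lottery tickets), gradually slimming the average model in the population. Each model is trained and then replaced by its lottery ticket, instead of first searching for a cumbersome network and pruning it afterwards. Second, we introduce individual weight sharing, which is dedicated to multi-trial NAS and aims to amortize training costs by sharing weights between parent and child networks. Compared with weight sharing in a supernet, individual weight sharing attains more reliable rank consistency and is easy to implement because it avoids sophisticated supernet training. Moreover, to keep the evolutionary process from being trapped in small models, we preserve a small ratio of the largest models when forming the parent population, which proves beneficial for model performance. Extensive experimental results demonstrate the superiority of MENAS. On the ImageNet-1K database, MENAS achieves 80.5% top-1 accuracy without knowledge distillation or a larger image resolution. Code and models will be made available.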
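The parent-population rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the MENAS implementation: `Individual`, `select_parents`, and the `keep_largest_ratio` parameter are hypothetical names, and its default value is an arbitrary placeholder, since the abstract only says that "a small ratio of the largest models" is preserved.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Individual:
    name: str
    accuracy: float  # validation accuracy (higher is better)
    size: float      # model size, e.g. parameters in millions


def select_parents(population, num_parents, keep_largest_ratio=0.1):
    """Pick parents by accuracy, but reserve a few slots for the largest models.

    keep_largest_ratio is a hypothetical knob, not a value from the paper.
    """
    num_large = min(num_parents, math.ceil(num_parents * keep_largest_ratio))
    # Reserve slots for the largest models so evolution is not trapped in small ones.
    largest = sorted(population, key=lambda ind: ind.size, reverse=True)[:num_large]
    reserved = set(largest)
    # Fill the remaining slots with the most accurate of the other models.
    rest = [ind for ind in population if ind not in reserved]
    best = sorted(rest, key=lambda ind: ind.accuracy, reverse=True)
    return largest + best[: num_parents - num_large]
```

With a ratio of 0.1 and three parents, one slot goes to the largest model regardless of its accuracy, and the other two go to the most accurate remaining models.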
translated by 谷歌翻译
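Neural architecture search (NAS) has attracted growing interest. To reduce search cost, recent work has explored weight sharing across models and made major progress in one-shot NAS. However, it has been observed that a model with higher one-shot accuracy does not necessarily perform better when trained stand-alone. To address this issue, this paper proposes Progressive Automatic Design of the search space, named PAD-NAS. Unlike previous approaches, in which all layers of the supernet share the same operation search space, we formulate a progressive search strategy based on operation pruning and build a layer-wise operation search space. In this way, PAD-NAS automatically designs the operations for each layer and achieves a trade-off between search space quality and model diversity. During the search, we also take hardware platform constraints into account for efficient deployment of neural network models. Extensive experiments on ImageNet show that our method achieves state-of-the-art performance.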
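One way to picture a layer-wise search space that shrinks through operation pruning is sketched below. The abstract does not say how operations are scored or how many are removed per step, so the `scores` dictionary, the `min_ops` floor, and the one-operation-per-layer schedule are all assumptions made for illustration.

```python
def prune_operations(layer_ops, scores, min_ops=1):
    """Drop the lowest-scoring operation from every layer that can still shrink.

    layer_ops: list of lists of operation names, one list per layer.
    scores: dict mapping (layer_index, op_name) to a hypothetical quality score.
    """
    pruned = []
    for layer_index, ops in enumerate(layer_ops):
        if len(ops) <= min_ops:
            # This layer has already reached the minimum size; keep it as is.
            pruned.append(list(ops))
            continue
        worst = min(ops, key=lambda op: scores[(layer_index, op)])
        pruned.append([op for op in ops if op != worst])
    return pruned
```

Applying the function repeatedly gives each layer its own set of surviving operations, so layers no longer share one common operation list.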
We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its advantages over existing NAS approaches. Existing one-shot methods, however, are hard to train and not yet effective on large-scale datasets like ImageNet. This work proposes a Single Path One-Shot model to address the training challenge. Our central idea is to construct a simplified supernet in which all architectures are single paths, so that the weight co-adaptation problem is alleviated. Training is performed by uniform path sampling. All architectures (and their weights) are trained fully and equally. Comprehensive experiments verify that our approach is flexible and effective. It is easy to train and fast to search. It effortlessly supports complex search spaces (e.g., building blocks, channels, mixed-precision quantization) and different search constraints (e.g., FLOPs, latency). It is thus convenient to use for various needs. It achieves state-of-the-art performance on the large dataset ImageNet. Equal contribution. This work was done when Haoyuan Mu and Zechun Liu were interns at MEGVII Technology.
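Uniform path sampling, as described above, amounts to choosing one candidate operation per layer with equal probability at each training step. The sketch below shows only the sampling part; the three-layer `search_space` and its operation names are made-up placeholders, and the supernet weights and training loop are omitted.

```python
import random


def sample_single_path(search_space, rng):
    """Uniformly pick one candidate operation per layer."""
    return [rng.choice(candidates) for candidates in search_space]


# Hypothetical 3-layer search space; the operation names are placeholders.
search_space = [
    ["conv3x3", "conv5x5", "skip"],
    ["conv3x3", "conv5x5", "skip"],
    ["conv3x3", "conv7x7"],
]

rng = random.Random(0)  # seeded only to make the sketch reproducible
path = sample_single_path(search_space, rng)
```

In a training loop, a fresh path would be drawn at every step and only the weights on that path would be updated, which is what gives every architecture an equal chance of being trained.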
In this paper, we propose a novel meta-learning approach for automatic channel pruning of very deep neural networks. We first train a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure given the target network. We use a simple stochastic structure sampling method for training the PruningNet. Then, we apply an evolutionary procedure to search for well-performing pruned networks. The search is highly efficient because the weights are directly generated by the trained PruningNet and we do not need any fine-tuning at search time. With a single PruningNet trained for the target network, we can search for various pruned networks under different constraints with little human participation. Compared to state-of-the-art pruning methods, we demonstrate superior performance on MobileNet V1/V2 and ResNet. Codes are available at https://github.com/liuzechun/MetaPruning. This work was done when Zechun Liu and Haoyuan Mu were interns at Megvii Technology.
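Deep learning technologies have demonstrated remarkable effectiveness in a wide range of tasks, and deep learning holds the potential to advance a multitude of applications, including edge computing, where deep models are deployed on edge devices to enable instant data processing and response. A key challenge is that while applying deep models often incurs substantial memory and computational costs, edge devices typically offer only very limited storage and computational capabilities, which may vary substantially across devices. These characteristics make it difficult to build deep learning solutions that unleash the potential of edge devices while complying with their constraints. A promising approach to this challenge is to automate the design of effective deep learning models that are lightweight, require little storage, and incur low computational overhead. This survey offers comprehensive coverage of design automation techniques for deep learning models targeting edge computing. It provides an overview and comparison of the key metrics commonly used to quantify models in terms of effectiveness, lightness, and computational cost. The survey then covers three categories of state-of-the-art deep model design automation techniques: automated neural architecture search, automated model compression, and joint automated design and compression. Finally, the survey covers open issues and directions for future research.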
Pure transformers have recently shown great potential for vision tasks. However, their accuracy on small or medium datasets is not satisfactory. Although some existing methods introduce a CNN as a teacher to guide the training process by distillation, the gap between teacher and student networks leads to sub-optimal performance. In this work, we propose a new One-shot Vision transformer search framework with Online distillation, namely OVO. OVO samples sub-nets for both teacher and student networks for better distillation results. Benefiting from the online distillation, thousands of subnets in the supernet are well trained without extra fine-tuning or retraining. In experiments, OVO-Ti achieves 73.32% top-1 accuracy on ImageNet and 75.2% on CIFAR-100.
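This paper explores the feasibility of neural architecture search (NAS) given only a pre-trained model, without using any of the original training data. This is an important setting for privacy protection, bias avoidance, and similar concerns. To achieve this, we first synthesize usable data by recovering knowledge from a pre-trained deep neural network. We then use the synthesized data and their predicted soft labels to guide the architecture search. We identify that the NAS task requires synthesized data (we target the image domain here) with sufficient semantics, diversity, and a minimal domain gap from natural images. For semantics, we propose recursive label calibration to produce more informative outputs. For diversity, we propose a regional update strategy to generate more diverse and semantically enriched synthetic data. For a minimal domain gap, we use input-level and feature-level regularization to mimic the original data distribution in latent space. We instantiate our proposed framework with three popular NAS algorithms: DARTS, ProxylessNAS, and SPOS. Surprisingly, our results show that architectures discovered by searching with our synthetic data achieve accuracy comparable to that of architectures discovered by searching with the original data. This leads, for the first time, to the conclusion that NAS can be done effectively without access to the original (natural) data, provided that the synthesis method is well designed. Our code will be publicly available.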
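Recently, a large number of deep-learning-based approaches have been successfully applied to various remote sensing image (RSI) recognition tasks. However, most existing advances of deep learning methods in the RSI field rely heavily on features extracted by manually designed backbone networks, which severely limits the potential of deep learning models because of the complexity of RSI and the limitations of prior knowledge. In this paper, we study a new design paradigm for backbone architectures in RSI recognition tasks, including scene classification, land-cover classification, and object detection. We propose a one-shot architecture search framework based on a weight-sharing strategy and an evolutionary algorithm, called RSBNet, which consists of three stages. First, a supernet constructed in a layer-wise search space is pre-trained on a self-assembled large-scale RSI dataset using an ensemble single-path training strategy. Next, the pre-trained supernet is equipped with different recognition heads through a switchable recognition module and fine-tuned separately on each target dataset to obtain task-specific supernets. Finally, we search for the optimal backbone architecture for each recognition task with the evolutionary algorithm, without any network training. Extensive experiments on five benchmark datasets for different recognition tasks show the effectiveness of the proposed search paradigm and demonstrate that the searched backbones adapt flexibly to different RSI recognition tasks and achieve impressive performance.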
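Structural re-parameterization (Rep) methods have achieved significant performance improvements on traditional convolutional networks. Most current Rep methods rely on prior knowledge to select the re-parameterization operations. However, the performance of the resulting architecture is limited by the types of operations and by that prior knowledge. To break this restriction, this work designs an improved re-parameterization search space that includes more types of re-parameterization operations. Concretely, the search space can further improve the performance of convolutional networks. To explore this search space effectively, we design an automatic re-parameterization enhancement strategy based on neural architecture search (NAS), which can find excellent re-parameterization architectures. In addition, we visualize the output features of the architecture to analyze why the re-parameterization architecture takes the form it does. We achieve better results on public datasets. Under the same training conditions as ResNet, we improve the accuracy of ResNet-50 by 1.82% on ImageNet-1K.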
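The COVID-19 pandemic has threatened global health. Many studies have applied deep convolutional neural networks (CNNs) to recognize COVID-19 from chest 3D computed tomography (CT). Recent work shows that no model generalizes well across CT datasets from different countries, and that designing models for specific datasets requires expertise. Neural architecture search (NAS), which aims to search for models automatically, has therefore become an attractive solution. To reduce the search cost on large 3D CT datasets, most NAS-based work uses a weight-sharing (WS) strategy that makes all models share weights within a supernet. However, WS inevitably incurs search instability, leading to inaccurate model estimation. In this work, we propose an efficient Evolutionary Multi-objective ARchitecture Search (EMARS) framework. We propose a new objective, namely potential, which helps exploit promising models and thereby indirectly reduces the number of models involved in weight training, alleviating search instability. We demonstrate that, under the objectives of accuracy and potential, EMARS can balance exploitation and exploration, i.e., reduce search time and find better models. Our searched models are small and outperform prior work on three public COVID-19 3D CT datasets.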
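In the past few years, significant improvements have been made in the field of neural architecture search (NAS). However, searching for efficient networks remains challenging because of the gap between the constraint used during search and the real inference time. To find high-performance networks with low inference time, several previous works impose a computational complexity constraint on the search algorithm. However, many factors affect inference speed (e.g., FLOPs, MACs), and the correlation between any single indicator and latency is not strong. Recently, some re-parameterization (Rep) techniques have been proposed to convert multi-branch architectures into inference-friendly single-path architectures. Nevertheless, the multi-branch architectures are still human-defined and inefficient. In this work, we propose a new search space suited to structural re-parameterization techniques. RepNAS, a one-stage NAS approach, efficiently searches for the optimal diverse branch block (ODBB) for each layer under a branch-number constraint. Our experimental results show that the searched ODBB easily surpasses the manually designed diverse branch block (DBB) with efficient training. Code and models will be available soon.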
Recently, neural architecture search (NAS) has achieved great success on classification tasks for mobile devices. The backbone network for object detection is usually obtained from the image classification task. However, an architecture searched on the classification task is sub-optimal for detection because of the gap between image classification and object detection. Meanwhile, work on backbone architecture search for mobile object detection is limited, mainly because the backbone always requires expensive ImageNet pre-training. Accordingly, it is necessary to study network architecture search for mobile object detection without expensive pre-training. In this work, we propose a backbone architecture search algorithm for mobile object detection, an evolutionary optimization method based on non-dominated sorting for NAS scenarios. It can quickly find backbone architectures that satisfy given constraints, and it better addresses the sub-optimality of linearly combining accuracy and computational cost into a single objective. The proposed approach can search backbone networks with different depths, widths, or expansion sizes via a weight-mapping technique, making NAS for mobile detection tasks considerably more efficient. In our experiments, we verify the effectiveness of the proposed approach on YoloX-Lite, a lightweight version of the object detection framework. Under similar computational complexity, the searched backbone achieves 2.0% higher mAP than MobileDet. Our improved backbone network reduces computational effort while improving the accuracy of the object detection network. To demonstrate its effectiveness, we carry out a series of ablation studies and analyze the working mechanism in detail.
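Non-dominated sorting, mentioned above as the basis of the evolutionary search, ranks candidates by Pareto fronts instead of by a single weighted score. The sketch below is a plain quadratic-time version over `(accuracy, cost)` pairs; the candidate values used with it are illustrative placeholders, and the paper's actual algorithm may differ in its details.

```python
def dominates(a, b):
    """a dominates b if it is no worse on both objectives and better on at least one.

    Each candidate is (accuracy, cost): accuracy is maximized, cost is minimized.
    """
    acc_a, cost_a = a
    acc_b, cost_b = b
    no_worse = acc_a >= acc_b and cost_a <= cost_b
    better = acc_a > acc_b or cost_a < cost_b
    return no_worse and better


def non_dominated_sort(candidates):
    """Return a list of fronts; each front is a sorted list of indices into candidates."""
    remaining = set(range(len(candidates)))
    fronts = []
    while remaining:
        # A candidate joins the current front if no remaining candidate dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(candidates[j], candidates[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts
```

Because candidates in the same front are mutually incomparable, no weighting between accuracy and cost has to be chosen in advance, which is the drawback of a linear combination that the abstract points to.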
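Conditional generative adversarial networks (cGANs) have enabled controllable image synthesis for many vision and graphics applications. However, recent cGANs are 1-2 orders of magnitude more compute-intensive than modern recognition CNNs. For example, GauGAN consumes 281G MACs per image, compared to 0.44G MACs for MobileNet-V3, which makes interactive deployment difficult. In this work, we propose a general-purpose compression framework for reducing the inference time and model size of the generator in cGANs. Directly applying existing compression methods yields poor performance because of the difficulty of GAN training and the differences in generator architectures. We address these challenges in two ways. First, to stabilize GAN training, we transfer knowledge from multiple intermediate representations of the original model to its compressed model, unifying unpaired and paired learning. Second, instead of reusing existing CNN designs, our method finds efficient architectures via neural architecture search. To accelerate the search process, we decouple model training and search via weight sharing. Experiments demonstrate the effectiveness of our method across different supervision settings, network architectures, and learning methods. Without loss of image quality, we substantially reduce the computation of CycleGAN, and reduce that of Pix2pix by 12x, MUNIT by 29x, and GauGAN by 9x, paving the way for interactive image synthesis.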
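Recently, the community has paid increasing attention to model scaling and has contributed to developing model families that span a wide spectrum of scales. Current methods either simply resort to one-shot NAS to construct a non-structural and non-scalable model family, or rely on a manual, fixed scaling strategy to scale a base model that is not necessarily the best. In this paper, we bridge these two components and propose ScaleNet to jointly search for the base model and the scaling strategy, so that the scaled large models can achieve more promising performance. Specifically, we design a super-supernet to embody models across a spectrum of sizes (e.g., FLOPs). The scaling strategy can then be learned interactively with the base model via a Markov-chain-based evolutionary algorithm and generalized to develop even larger models. To obtain a decent super-supernet, we design a hierarchical sampling strategy to enhance its training sufficiency and alleviate disturbance. Experimental results show that our scaled networks enjoy significant performance advantages at various FLOPs budgets, with at least a 2.53x reduction in search cost. Code is available at https://github.com/luminolx/scalenet.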
Slimmable Neural Networks (S-Net) is a novel network that can dynamically select one of several predefined proportions of channels (sub-networks) depending on the computational resources currently available. The accuracy of each sub-network of S-Net, however, is inferior to that of an individually trained network of the same size because of the difficulty of optimizing the different sub-networks simultaneously. In this paper, we propose Slimmable Pruned Neural Networks (SP-Net), whose sub-network structures are learned by pruning instead of using the same proportion of channels in each layer (width multiplier) as S-Net does. We also propose a new pruning procedure, multi-base pruning, in place of one-shot or iterative pruning, to achieve high accuracy and large savings in training time. In addition, we introduce slimmable channel sorting (scs) to make computation as fast as in S-Net, and zero padding match (zpm) pruning to prune residual structures more efficiently. SP-Net can be combined with any kind of channel pruning method and does not require complicated processing or time-consuming architecture search as NAS models do. Compared with each sub-network of the same FLOPs on S-Net, SP-Net improves accuracy by 1.2-1.5% for ResNet-50, 0.9-4.4% for VGGNet, 1.3-2.7% for MobileNetV1, and 1.4-3.1% for MobileNetV2 on ImageNet. Furthermore, our methods outperform other SOTA pruning methods and are on par with various NAS models according to our experimental results on ImageNet. The code is available at https://github.com/hideakikuratsu/SP-Net.
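The run-time choice among predefined sub-networks described above can be illustrated with a small selection routine. This is only a sketch of the dispatch idea, assuming each sub-network has a known cost; the function name, the sub-network labels, and any cost figures passed in are hypothetical and not taken from S-Net or SP-Net.

```python
def select_subnetwork(subnetworks, available_flops):
    """Return the name of the largest predefined sub-network that fits the budget.

    subnetworks maps a hypothetical sub-network name to its FLOPs cost.
    Returns None when no sub-network fits.
    """
    feasible = {name: cost for name, cost in subnetworks.items()
                if cost <= available_flops}
    if not feasible:
        return None
    # Among the feasible options, prefer the most expensive (largest) one.
    return max(feasible, key=feasible.get)
```

For example, with placeholder costs `{"0.25x": 100, "0.5x": 200, "0.75x": 300, "1.0x": 400}` and a budget of 250, the routine picks `"0.5x"`.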
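In object detection models, the detection backbone consumes more than half of the overall inference cost. Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of neural architecture search (NAS). However, existing NAS methods for object detection require hundreds to thousands of GPU hours of searching, making them impractical in fast-paced research and development. In this work, we propose a novel zero-shot NAS method to address this issue. The proposed method, named ZenDet, automatically designs efficient detection backbones without training network parameters, reducing the architecture design cost to nearly zero while delivering state-of-the-art (SOTA) performance. Under the hood, ZenDet maximizes the differential entropy of detection backbones, leading to a better feature extractor for object detection under the same computational budget. After merely one GPU day of fully automatic design, ZenDet produces SOTA detection backbones on multiple detection benchmark datasets with little human intervention. Compared with a ResNet-50 backbone, ZenDet is 2.0% better in mAP when using the same amount of FLOPs/parameters and is 1.54 times faster on an NVIDIA V100 at the same mAP. Code and pre-trained models will be released later.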
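Mixed-precision deep neural networks achieve the energy efficiency and throughput needed for hardware deployment, particularly when resources are limited, without sacrificing accuracy. However, the optimal per-layer bit precision that preserves accuracy is not easy to find, especially given the abundance of models, datasets, and quantization techniques, which creates an enormous search space. To tackle this difficulty, a body of literature has emerged recently, and several frameworks that achieve promising accuracy have been proposed. In this paper, we first summarize the quantization techniques commonly used in the literature. We then present a thorough survey of mixed-precision frameworks, categorized by their optimization techniques, such as reinforcement learning, and by their quantization techniques, such as deterministic rounding. Furthermore, we discuss the advantages and shortcomings of each framework and compare them side by side. Finally, we give guidelines for future mixed-precision frameworks.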
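To make the per-layer bit-precision trade-off concrete, the sketch below applies symmetric uniform quantization with deterministic round-to-nearest and measures the resulting error at different bit widths. It is a generic textbook-style illustration, not a method from any surveyed framework; the function names and the per-tensor scaling choice are assumptions.

```python
def quantize_uniform(values, num_bits):
    """Symmetric uniform quantization with deterministic round-to-nearest."""
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for a signed grid")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    levels = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    scale = max_abs / levels          # one scale for the whole tensor (an assumption)
    # round() is deterministic: the same input always maps to the same grid point.
    return [round(v / scale) * scale for v in values]


def quantization_error(values, num_bits):
    """Mean squared error introduced by quantizing values to num_bits."""
    quantized = quantize_uniform(values, num_bits)
    return sum((v - q) ** 2 for v, q in zip(values, quantized)) / len(values)
```

A mixed-precision search would evaluate a measure like `quantization_error`, or the downstream accuracy, for each layer and bit width, and then assign fewer bits to the layers that tolerate them.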
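The structural design of neural networks is crucial to the success of deep learning. While most prior work in evolutionary learning aims to search directly for the structure of a network, few attempts have been made on another promising track, channel pruning, which has recently made major headway in designing efficient deep learning models. In fact, prior pruning methods adopt human-made pruning functions to score the importance of a channel, which requires domain knowledge and can be sub-optimal. To this end, we pioneer the use of genetic programming (GP) to discover strong pruning metrics automatically. Specifically, we craft a novel design space to express high-quality and transferable pruning functions, which ensures an end-to-end evolution process in which the evolved functions need no manual modification to be transferable after evolution. Unlike prior methods, our approach provides both compact pruned networks for efficient inference and novel closed-form pruning metrics that are mathematically explainable and thus generalizable to different pruning tasks. Although the evolution is conducted on small datasets, our functions show promising results when applied to more challenging datasets that differ from those used in the evolution process. For example, on ILSVRC-2012, an evolved function achieves state-of-the-art pruning results.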
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely. Given the limited hardware resources, existing 3D perception models are not able to recognize small instances (e.g., pedestrians, cyclists) very well due to the low-resolution voxelization and aggressive downsampling. To this end, we propose Sparse Point-Voxel Convolution (SPVConv), a lightweight 3D module that equips the vanilla Sparse Convolution with a high-resolution point-based branch. With negligible overhead, this point-based branch is able to preserve fine details even in large outdoor scenes. To explore the spectrum of efficient 3D models, we first define a flexible architecture design space based on SPVConv, and we then present 3D Neural Architecture Search (3D-NAS) to search for the optimal network architecture over this diverse design space efficiently and effectively. Experimental results validate that the resulting SPVNAS model is fast and accurate: it outperforms the state-of-the-art MinkowskiNet by 3.3%, ranking 1st on the competitive SemanticKITTI leaderboard upon publication. It also achieves 8× computation reduction and 3× measured speedup over MinkowskiNet while retaining higher accuracy. Finally, we transfer our method to 3D object detection, and it achieves consistent improvements over the one-stage detection baseline on KITTI.