近年来,深神经网络(DNN)应用的流行和成功促使对DNN压缩的研究,例如修剪和量化。这些技术加速了模型推断,减少功耗,并降低运行DNN所需的硬件的大小和复杂性,而准确性几乎没有损失。但是,由于DNN容易受到对抗输入的影响,因此重要的是要考虑压缩和对抗性鲁棒性之间的关系。在这项工作中,我们研究了几种不规则修剪方案和8位量化产生的模型的对抗性鲁棒性。此外,尽管常规修剪消除了DNN中最不重要的参数,但我们研究了一种非常规修剪方法的效果:根据对抗输入的梯度去除最重要的模型参数。我们称这种方法称贪婪的对抗修剪(GAP),我们发现这种修剪方法会导致模型可抵抗从其未压缩的对应物转移攻击的模型。
translated by 谷歌翻译
深度神经网络近似高度复杂功能的能力是其成功的关键。但是,这种好处是以巨大的模型大小为代价的,这挑战了其在资源受限环境中的部署。修剪是一种用于限制此问题的有效技术,但通常以降低准确性和对抗性鲁棒性为代价。本文解决了这些缺点,并引入了Deadwooding,这是一种新型的全球修剪技术,它利用了Lagrangian双重方法来鼓励模型稀疏性,同时保持准确性并确保鲁棒性。所得模型显示出在鲁棒性和准确性度量方面的最先进研究大大优于最先进的模型。
translated by 谷歌翻译
The compute-intensive nature of neural networks (NNs) limits their deployment in resource-constrained environments such as cell phones, drones, autonomous robots, etc. Hence, developing robust sparse models fit for safety-critical applications has been an issue of longstanding interest. Though adversarial training with model sparsification has been combined to attain the goal, conventional adversarial training approaches provide no formal guarantee that the models would be robust against any rogue samples in a restricted space around a benign sample. Recently proposed verified local robustness techniques provide such a guarantee. This is the first paper that combines the ideas from verified local robustness and dynamic sparse training to develop `SparseVLR'-- a novel framework to search verified locally robust sparse networks. Obtained sparse models exhibit accuracy and robustness comparable to their dense counterparts at sparsity as high as 99%. Furthermore, unlike most conventional sparsification techniques, SparseVLR does not require a pre-trained dense model, reducing the training time by 50%. We exhaustively investigated SparseVLR's efficacy and generalizability by evaluating various benchmark and application-specific datasets across several models.
translated by 谷歌翻译
最近,后门攻击已成为对深神经网络(DNN)模型安全性的新兴威胁。迄今为止,大多数现有研究都集中于对未压缩模型的后门攻击。尽管在实际应用中广泛使用的压缩DNN的脆弱性尚未得到利用。在本文中,我们建议研究和发展针对紧凑型DNN模型(RIBAC)的强大和不可感知的后门攻击。通过对重要设计旋钮进行系统分析和探索,我们提出了一个框架,该框架可以有效地学习适当的触发模式,模型参数和修剪口罩。从而同时达到高触发隐形性,高攻击成功率和高模型效率。跨不同数据集的广泛评估,包括针对最先进的防御机制的测试,证明了RIBAC的高鲁棒性,隐身性和模型效率。代码可从https://github.com/huyvnphan/eccv2022-ribac获得
translated by 谷歌翻译
Recent works on Lottery Ticket Hypothesis have shown that pre-trained language models (PLMs) contain smaller matching subnetworks(winning tickets) which are capable of reaching accuracy comparable to the original models. However, these tickets are proved to be notrobust to adversarial examples, and even worse than their PLM counterparts. To address this problem, we propose a novel method based on learning binary weight masks to identify robust tickets hidden in the original PLMs. Since the loss is not differentiable for the binary mask, we assign the hard concrete distribution to the masks and encourage their sparsity using a smoothing approximation of L0 regularization.Furthermore, we design an adversarial loss objective to guide the search for robust tickets and ensure that the tickets perform well bothin accuracy and robustness. Experimental results show the significant improvement of the proposed method over previous work on adversarial robustness evaluation.
translated by 谷歌翻译
在部署之前,保护DNN模型的知识产权是至关重要的。到目前为止,提出的方法要么需要更改内部模型参数或机器学习管道,要么无法满足安全性和鲁棒性要求。本文提出了一种轻巧,健壮且安全的黑盒DNN水印协议,该协议利用了加密单向功能以及在训练过程中注入任务钥匙标签 - 标签对。这些对后来用于在测试过程中证明DNN模型所有权。主要功能是证明及其安全性的价值是可衡量的。广泛的实验为各种数据集的图像分类模型以及将它们暴露于各种攻击中,表明它提供了保护的同时,同时保持了足够的安全性和鲁棒性。
translated by 谷歌翻译
边缘设备上卷积神经网络(CNN)的部署受到性能要求和可用处理能力之间的巨大差距的阻碍。尽管最近的研究在开发网络修剪方法以减少CNN的计算开销方面取得了长足的进步,但仍然存在相当大的准确性损失,尤其是在高修剪比率下。质疑为非封闭网络设计的架构可能对修剪网络没有效,我们建议通过定义新的搜索空间和新颖的搜索目标来搜索架构修剪方法。为了改善修剪网络的概括,我们提出了两个新型的原始孔和prunedlinearaare操作。具体而言,这些操作通过正规化修剪网络的目标函数来缓解不稳定梯度的问题。提出的搜索目标使我们能够培训有关修剪权重元素的体系结构参数。定量分析表明,我们的搜索架构优于在CIFAR-10和Imagenet上最先进的修剪网络中使用的体系结构。就硬件效率而言,PR-DARTS将Mobilenet-V2的准确性从73.44%提高到81.35%(+7.91%提高),并且运行3.87 $ \ times $的速度更快。
translated by 谷歌翻译
已知深神经网络(DNN)容易受到对抗性攻击的影响,即对输入的不可察觉的扰动可以误导DNN在清洁图像上培训,以制造错误的预测。为了解决这一目标,对抗性训练是目前最有效的防御方法,通过增强速度设定的训练,在飞行中产生的对抗样本。有趣的是,我们首次发现,在随机初始化的网络中,在没有任何模型训练的随机初始化网络中,第一次发现具有天生稳健性,匹配或超越对抗训练网络的强大准确性的鲁棒准确性,表明对模型权重的对抗训练不是对抗性鲁棒性不可或缺。我们命名为强大的临时票故障票(RST),也是自然效率的那种。不同于流行的彩票假设,既不需要培训原始密集的网络也不需要训练。为了验证和理解这种迷人的发现,我们进一步开展了广泛的实验,以研究不同模型,数据集,稀疏模式和攻击下RST的存在性和性质,绘制关于DNNS鲁棒性与其初始化/过度分辨率之间的关系的洞察。此外,我们确定从同一随机初始化的密集网络绘制的不同稀疏比率的RST之间的差的对抗性转移性,并提出了一种随机切换不同RST之间的随机切换的随机性,作为基于顶部的新型防御方法第一次。我们相信我们对RST的调查结果已经开辟了一个新的视角,以研究模型稳健性并扩大彩票假设。
translated by 谷歌翻译
修剪是一种众所周知的机制,用于降低深度卷积网络的计算成本。然而,研究表明,作为正规化形式修剪的可能性,这减少了过度拟合并改善了泛化。我们证明,这种战略系列提供了额外的益处,超出了计算绩效和泛化。我们的分析表明,来自卷积网络的修剪结构(滤波器和/或层)不仅增加了泛化,而且增加了对抗性图像的鲁棒性(具有内容修改的自然图像)。由于修剪降低了网络容量并提供了正规化,因此可以获得对抗对抗图像的有效工具。与需要对对抗性图像和仔细正规化的培训需要培训的有希望的防御机制,我们表明修剪仅考虑自然图像(例如,标准和低成本训练)。我们在几种对抗攻击和架构上确认这些结果;因此,暗示了作为对抗对抗性图像的新型防御机制修剪的潜力。
translated by 谷歌翻译
Recent increases in the computational demands of deep neural networks (DNNs) have sparked interest in efficient deep learning mechanisms, e.g., quantization or pruning. These mechanisms enable the construction of a small, efficient version of commercial-scale models with comparable accuracy, accelerating their deployment to resource-constrained devices. In this paper, we study the security considerations of publishing on-device variants of large-scale models. We first show that an adversary can exploit on-device models to make attacking the large models easier. In evaluations across 19 DNNs, by exploiting the published on-device models as a transfer prior, the adversarial vulnerability of the original commercial-scale models increases by up to 100x. We then show that the vulnerability increases as the similarity between a full-scale and its efficient model increase. Based on the insights, we propose a defense, $similarity$-$unpairing$, that fine-tunes on-device models with the objective of reducing the similarity. We evaluated our defense on all the 19 DNNs and found that it reduces the transferability up to 90% and the number of queries required by a factor of 10-100x. Our results suggest that further research is needed on the security (or even privacy) threats caused by publishing those efficient siblings.
translated by 谷歌翻译
Model quantization enables the deployment of deep neural networks under resource-constrained devices. Vector quantization aims at reducing the model size by indexing model weights with full-precision embeddings, i.e., codewords, while the index needs to be restored to 32-bit during computation. Binary and other low-precision quantization methods can reduce the model size up to 32$\times$, however, at the cost of a considerable accuracy drop. In this paper, we propose an efficient framework for ternary quantization to produce smaller and more accurate compressed models. By integrating hyperspherical learning, pruning and reinitialization, our proposed Hyperspherical Quantization (HQ) method reduces the cosine distance between the full-precision and ternary weights, thus reducing the bias of the straight-through gradient estimator during ternary quantization. Compared with existing work at similar compression levels ($\sim$30$\times$, $\sim$40$\times$), our method significantly improves the test accuracy and reduces the model size.
translated by 谷歌翻译
近似计算以其在提高深神经网络(DNN)加速器的能量效率下以轻微精度损耗的成本而闻名。最近,近似组件的不精确性质,例如近似乘数的性质也已经成功地捍卫对DNN模型的对抗攻击。由于近似误差通过DNN层被屏蔽或取消屏蔽,因此这提出了一个关键的研究问题 - 可以近似计算始终为DNN中的对抗发生攻击提供防御,即,他们普遍防御?对此,我们使用最先进的近似乘法器呈现对不同近似DNN加速器(AXDNNS)的广泛的对抗鲁棒性分析。特别是,我们使用MNIST和CIFAR-10数据集评估十对不同AXDNN上的十个对抗的攻击的影响。我们的结果表明,对AXDNN的对抗攻击可能导致53%的精度损失,而相同的攻击可能导致精确的DNN中几乎没有准确性损失(低至0.06%)。因此,近似计算不能被称为对抗对抗攻击的普遍防御策略。
translated by 谷歌翻译
Pruning refers to the elimination of trivial weights from neural networks. The sub-networks within an overparameterized model produced after pruning are often called Lottery tickets. This research aims to generate winning lottery tickets from a set of lottery tickets that can achieve similar accuracy to the original unpruned network. We introduce a novel winning ticket called Cyclic Overlapping Lottery Ticket (COLT) by data splitting and cyclic retraining of the pruned network from scratch. We apply a cyclic pruning algorithm that keeps only the overlapping weights of different pruned models trained on different data segments. Our results demonstrate that COLT can achieve similar accuracies (obtained by the unpruned model) while maintaining high sparsities. We show that the accuracy of COLT is on par with the winning tickets of Lottery Ticket Hypothesis (LTH) and, at times, is better. Moreover, COLTs can be generated using fewer iterations than tickets generated by the popular Iterative Magnitude Pruning (IMP) method. In addition, we also notice COLTs generated on large datasets can be transferred to small ones without compromising performance, demonstrating its generalizing capability. We conduct all our experiments on Cifar-10, Cifar-100 & TinyImageNet datasets and report superior performance than the state-of-the-art methods.
translated by 谷歌翻译
随着移动设备的扩散和物联网,深入学习模型越来越多地部署在具有有限的计算资源和记忆的设备上,并且暴露于对抗性噪声的威胁。这些设备必须使用轻质和稳健性学习深层模型。然而,当前的深度学习解决方案难以学习具有在不降低一个或另一个的情况下拥有这两个性质的模型。众所周知,全连接层有助于卷积神经网络的大部分参数。我们执行完全连接层的可分离结构变换以减少参数,其中全连接层的大规模重量矩阵由几个可分离的小矩阵的张量产品分离。注意,在馈送到完全连接的层之前,诸如图像的数据(例如图像)不再需要扁平化,保留数据的有价值的空间几何信息。此外,为了进一步提高轻质和稳健性,我们提出了稀疏性和可分性条件数的关节约束,其施加在这些可分离基质上。我们评估了MLP,VGG-16和视觉变压器上提出的方法。数据集的实验结果如想象成,SVHN,CiFar-100和CiFAR10,表明我们成功将网络参数量减少了90%,而稳健的精度损耗小于1.5%,这比基于的SOTA方法更好原始完全连接的图层。有趣的是,即使以高压缩率,例如,它也可以实现压倒性的优势,例如,200次。
translated by 谷歌翻译
Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs). However, this approach is typically more expensive than traditional fine-tuning because of the necessity to generate adversarial examples via gradient descent. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically $0.15\sim0.3$ epochs), far before parameters converge. Inspired by this finding, we dig out robust early-bird tickets (i.e., subnetworks) to develop an efficient adversarial training method: (1) searching for robust tickets with structured sparsity in the early stage; (2) fine-tuning robust tickets in the remaining time. To extract the robust tickets as early as possible, we design a ticket convergence metric to automatically terminate the searching process. Experiments show that the proposed efficient adversarial training method can achieve up to $7\times \sim 13 \times$ training speedups while maintaining comparable or even better robustness compared to the most competitive state-of-the-art adversarial training methods.
translated by 谷歌翻译
深度神经网络已用于多种成功的应用中。但是,由于包含数百万个参数,它们的高度复杂性质导致在延迟需求低的管道中部署期间有问题。结果,更希望获得在推理期间具有相同性能的轻型神经网络。在这项工作中,我们提出了一种基于重量的修剪方法,其中权重根据以前的迭代势头逐渐修剪。神经网络的每个层都根据其相对稀疏性分配了一个重要性值,然后在先前迭代中的重量幅度分配。我们在Alexnet,VGG16和Resnet50等网络上评估了我们的方法,其中包括图像分类数据集,例如CIFAR-10和CIFAR-100。我们发现,在准确性和压缩比方面,结果优于先前的方法。我们的方法能够在两个数据集上获得同一降解的相同降解的15%压缩。
translated by 谷歌翻译
Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance.We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that-when trained in isolationreach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. We consistently find winning tickets that are less than 10-20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy.
translated by 谷歌翻译
野外的深度学习(DL)的成功采用需要模型:(1)紧凑,(2)准确,(3)强大的分布换档。不幸的是,同时满足这些要求的努力主要是不成功的。这提出了一个重要问题:无法创建紧凑,准确,强大的深神经网络(卡)基础?为了回答这个问题,我们对流行的模型压缩技术进行了大规模分析,该技术揭示了几种有趣模式。值得注意的是,与传统的修剪方法相比(例如,微调和逐渐修剪),我们发现“彩票式风格”方法令人惊讶地用于生产卡,包括二进制牌。具体而言,我们能够创建极其紧凑的卡,与其较大的对应物相比,具有类似的测试精度和匹配(或更好)的稳健性 - 仅通过修剪和(可选)量化。利用卡的紧凑性,我们开发了一种简单的域 - 自适应测试时间合并方法(卡片 - 甲板),它使用门控模块根据与测试样本的光谱相似性动态地选择相应的卡片。该拟议的方法建立了一个“赢得胜利”的卡片,即在CiFar-10-C精度(即96.8%标准和92.75%的鲁棒)和CiFar-100- C精度(80.6%标准和71.3%的稳健性),内存使用率比非压缩基线(Https://github.com/robustbench/robustbench提供的预制卡和卡片 - 甲板)。最后,我们为我们的理论支持提供了理论支持经验研究结果。
translated by 谷歌翻译
对抗性训练(AT)是针对对抗分类系统的对抗性攻击的简单而有效的防御,这是基于增强训练设置的攻击,从而最大程度地提高了损失。但是,AT作为视频分类的辩护的有效性尚未得到彻底研究。我们的第一个贡献是表明,为视频生成最佳攻击需要仔细调整攻击参数,尤其是步骤大小。值得注意的是,我们证明最佳步长随攻击预算线性变化。我们的第二个贡献是表明,在训练时间使用较小(次优的)攻击预算会导致测试时的性能更加强大。根据这些发现,我们提出了三个防御攻击预算的攻击的防御。自适应AT的第一个技术是一种技术,该技术是从随着训练迭代进行的。第二个课程是一项技术,随着训练的迭代进行,攻击预算的增加。第三个生成的AT,与deno的生成对抗网络一起,以提高稳健的性能。 UCF101数据集上的实验表明,所提出的方法改善了针对多种攻击类型的对抗性鲁棒性。
translated by 谷歌翻译
本文研究了静态稀疏对训练有素网络对扰动,数据腐败和对抗性示例的鲁棒性的影响。我们表明,通过增加网络宽度和深度,同时保持网络容量固定,稀疏网络始终匹配,并且通常优于其最初密集的版本,从而达到了一定的稀疏性。由于网络层之间的连通性松动而导致非常高的稀疏性同时下降。我们的发现表明,文献中观察到的网络压缩引起的快速鲁棒性下降是由于网络容量降低而不是稀疏性。
translated by 谷歌翻译