当前的深度神经网络(DNN)容易受到对抗性攻击的影响,在这种攻击中,对输入的对抗扰动可以改变或操纵分类。为了防御此类攻击,已证明一种有效而流行的方法,称为对抗性训练(AT),可通过一种最小的最大强大的训练方法来减轻对抗攻击的负面影响。尽管有效,但尚不清楚它是否可以成功地适应分布式学习环境。分布式优化对多台机器的功能使我们能够扩展大型型号和数据集的强大训练。我们提出了这一点,我们提出了分布式的对抗训练(DAT),这是在多台机器上实施的大批量对抗训练框架。我们证明DAT是一般的,它支持对标记和未标记的数据,多种类型的攻击生成方法以及梯度压缩操作的培训。从理论上讲,我们在优化理论中的标准条件下提供了DAT与一般非凸面设置中一阶固定点的收敛速率。从经验上讲,我们证明DAT要么匹配或胜过最先进的稳健精度,并实现了优美的训练速度(例如,在ImageNet下的Resnet-50上)。代码可在https://github.com/dat-2022/dat上找到。
translated by 谷歌翻译
对抗性培训(AT)已成为一种广泛认可的防御机制,以提高深度神经网络对抗对抗攻击的鲁棒性。它解决了最小的最大优化问题,其中最小化器(即,后卫)寻求稳健的模型,以最小化由最大化器(即,攻击者)制成的对抗示例存在的最坏情况训练损失。然而,Min-Max的性质在计算密集并因此难以扩展。同时,快速算法,实际上,许多最近改进的算法,通过替换基于简单的单次梯度标志的攻击生成步骤来简化基于最大化步骤的最小值。虽然易于实施,快速缺乏理论保证,其实际表现可能是不令人满意的,患有强大的对手训练时的鲁棒性灾难性过度。在本文中,我们从双级优化(BLO)的角度来看,旨在快速设计。首先,首先进行关键观察,即快速at的最常用的算法规范等同于使用一些梯度下降型算法来解决涉及符号操作的双级问题。然而,标志操作的离散性使得难以理解算法的性能。基于上述观察,我们提出了一种新的遗传性双层优化问题,设计和分析了一组新的算法(快速蝙蝠)。 FAST-BAT能够捍卫基于符号的投影梯度下降(PGD)攻击,而无需调用任何渐变标志方法和明确的鲁棒正则化。此外,我们经验证明,通过在不诱导鲁棒性灾难性过度的情况下实现卓越的模型稳健性,或患有任何标准精度损失的稳健性,我们的方法优于最先进的快速基线。
translated by 谷歌翻译
Adversarial training, in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high cost of generating strong adversarial examples makes standard adversarial training impractical on large-scale problems like ImageNet. We present an algorithm that eliminates the overhead cost of generating adversarial examples by recycling the gradient information computed when updating model parameters.Our "free" adversarial training algorithm achieves comparable robustness to PGD adversarial training on the CIFAR-10 and CIFAR-100 datasets at negligible additional cost compared to natural training, and can be 7 to 30 times faster than other strong adversarial training methods. Using a single workstation with 4 P100 GPUs and 2 days of runtime, we can train a robust model for the large-scale ImageNet classification task that maintains 40% accuracy against PGD attacks. The code is available at https://github.com/ashafahi/free_adv_train.
translated by 谷歌翻译
现代深度学习模型通常在分布式机器集合中并行培训,以减少训练时间。在这种情况下,机器之间模型更新的通信变成了一个重要的性能瓶颈,并且已经提出了各种有损的压缩技术来减轻此问题。在这项工作中,我们介绍了一种新的,简单但理论上和实践上有效的压缩技术:自然压缩(NC)。我们的技术分别应用于要进行压缩的更新向量的所有条目,并通过随机舍入到两个的(负或正)两种功能,可以通过忽略Mantissa来以“自然”方式计算。我们表明,与没有压缩相比,NC将压缩向量的第二刻增加不超过微小因子$ \ frac {9} {8} $,这意味着NC对流行训练算法的收敛速度的影响,例如分布式SGD,可以忽略不计。但是,NC启用的通信节省是可观的,导致$ 3 $ - $ 4 \ times $ $改善整体理论运行时间。对于需要更具侵略性压缩的应用,我们将NC推广到自然抖动,我们证明这比常见的随机抖动技术要好得多。我们的压缩操作员可以自行使用,也可以与现有操作员结合使用,从而产生更具侵略性的结合效果,并在理论和实践中提供新的最先进。
translated by 谷歌翻译
Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient decent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with = 8/255 in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at = 2/255 in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds. Finally, we identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail. All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial.
translated by 谷歌翻译
深度神经网络很容易被称为对抗攻击的小扰动都愚弄。对抗性培训(AT)是一种近似解决了稳健的优化问题,以最大限度地减少最坏情况损失,并且被广泛认为是对这种攻击的最有效的防御。由于产生了强大的对抗性示例的高计算时间,已经提出了单步方法来减少培训时间。然而,这些方法遭受灾难性的过度装备,在训练期间侵犯准确度下降。虽然提出了改进,但它们增加了培训时间和稳健性远非多步骤。我们为FW优化(FW-AT)开发了对抗的对抗培训的理论框架,揭示了损失景观与$ \ ell_2 $失真之间的几何连接。我们分析地表明FW攻击的高变形相当于沿攻击路径的小梯度变化。然后在各种深度神经网络架构上进行实验证明,$ \ ell \ infty $攻击对抗强大的模型实现近乎最大的$ \ ell_2 $失真,而标准网络具有较低的失真。此外,实验表明,灾难性的过度拟合与FW攻击的低变形强烈相关。为了展示我们理论框架的效用,我们开发FW-AT-Adap,这是一种新的逆势训练算法,它使用简单的失真度量来调整攻击步骤的数量,以提高效率而不会影响鲁棒性。 FW-AT-Adapt提供培训时间以单步快速分配方法,并改善了在白色盒子和黑匣子设置中的普发内精度的最小损失和多步PGD之间的差距。
translated by 谷歌翻译
最大限度的训练原则,最大限度地减少最大的对抗性损失,也称为对抗性培训(AT),已被证明是一种提高对抗性鲁棒性的最先进的方法。尽管如此,超出了在对抗环境中尚未经过严格探索的最小最大优化。在本文中,我们展示了如何利用多个领域的最小最大优化的一般框架,以推进不同类型的对抗性攻击的设计。特别是,给定一组风险源,最小化最坏情况攻击损失可以通过引入在域集的概率单纯x上最大化的域权重来重新重整为最小最大问题。我们在三次攻击生成问题中展示了这个统一的框架 - 攻击模型集合,在多个输入下设计了通用扰动,并制作攻击对数据转换的弹性。广泛的实验表明,我们的方法导致对现有的启发式策略以及对培训的最先进的防御方法而言,鲁棒性改善,培训对多种扰动类型具有稳健。此外,我们发现,从我们的MIN-MAX框架中学到的自调整域权重可以提供整体工具来解释跨域攻击难度的攻击水平。代码可在https://github.com/wangjksjtu/minmaxsod中获得。
translated by 谷歌翻译
到目前为止对抗训练是抵御对抗例子的最有效的策略。然而,由于每个训练步骤中的迭代对抗性攻击,它遭受了高的计算成本。最近的研究表明,通过随机初始化执行单步攻击,可以实现快速的对抗训练。然而,这种方法仍然落后于稳定性和模型稳健性的最先进的对手训练算法。在这项工作中,我们通过观察随机平滑的随机初始化来更好地优化内部最大化问题,对快速对抗培训进行新的理解。在这种新的视角之后,我们还提出了一种新的初始化策略,向后平滑,进一步提高单步强大培训方法的稳定性和模型稳健性。多个基准测试的实验表明,我们的方法在使用更少的训练时间(使用相同的培训计划时,使用更少的培训时间($ \ sim $ 3x改进)时,我们的方法达到了类似的模型稳健性。
translated by 谷歌翻译
改善深度神经网络(DNN)对抗对抗示例的鲁棒性是安全深度学习的重要而挑战性问题。跨越现有的防御技术,具有预计梯度体面(PGD)的对抗培训是最有效的。对手训练通过最大化分类丢失,通过最大限度地减少从内在最大化生成的逆势示例的丢失来解决\ excepitient {内部最大化}生成侵略性示例的初始最大优化问题。 。因此,衡量内部最大化的衡量标准是如何对对抗性培训至关重要的。在本文中,我们提出了这种标准,即限制优化(FOSC)的一阶静止条件,以定量评估内部最大化中发现的对抗性实例的收敛质量。通过FOSC,我们发现,为了确保更好的稳健性,必须在培训的\ Texit {稍后的阶段}中具有更好的收敛质量的对抗性示例。然而,在早期阶段,高收敛质量的对抗例子不是必需的,甚至可能导致稳健性差。基于这些观察,我们提出了一种\ Texit {动态}培训策略,逐步提高产生的对抗性实例的收敛质量,这显着提高了对抗性培训的鲁棒性。我们的理论和经验结果表明了该方法的有效性。
translated by 谷歌翻译
虽然多步逆势培训被广泛流行作为对抗强烈的对抗攻击的有效防御方法,但其计算成本与标准培训相比,其计算成本是众所周知的。已经提出了几种单步侵权培训方法来减轻上述开销费用;但是,根据优化设置,它们的性能并不能充分可靠。为了克服这些限制,我们偏离了现有的基于输入空间的对抗性培训制度,并提出了一种单步潜在培训方法(SLAT),其利用潜在的代表梯度作为潜在的对抗扰动。我们证明,与所采用的潜伏扰动,恢复局部线性度并确保与现有的单步逆势训练方法相比,恢复局部线性度并确保可靠性的特征梯度的L1规范。因为潜伏的扰动基于可以在输入梯度计算过程中免费获得的潜在表示的梯度,所以所提出的方法与快速梯度标志方法相当成本。实验结果表明,尽管其结构简单,但优于最先进的加速的对抗训练方法。
translated by 谷歌翻译
深度学习在广泛的AI应用方面取得了有希望的结果。较大的数据集和模型一致地产生更好的性能。但是,我们一般花费更长的培训时间,以更多的计算和沟通。在本调查中,我们的目标是在模型精度和模型效率方面提供关于大规模深度学习优化的清晰草图。我们调查最常用于优化的算法,详细阐述了大批量培训中出现的泛化差距的可辩论主题,并审查了解决通信开销并减少内存足迹的SOTA策略。
translated by 谷歌翻译
神经网络容易受到对抗性攻击的攻击:在其输入中添加精心设计,不可察觉的扰动可以改变其输出。对抗训练是针对此类攻击的训练强大模型的最有效方法之一。不幸的是,这种方法比神经网络的香草培训要慢得多,因为它需要在每次迭代时为整个培训数据构建对抗性示例。通过利用核心选择理论,我们展示了如何选择一小部分训练数据提供了一种原则性的方法来降低健壮训练的时间复杂性。为此,我们首先为对抗核心选择提供收敛保证。特别是,我们表明收敛界限直接与我们的核心在整个训练数据中计算出的梯度的距离如何。在我们的理论分析的激励下,我们建议使用此梯度近似误差作为对抗核心选择目标,以有效地减少训练集大小。建造后,我们在培训数据的这一子集上进行对抗训练。与现有方法不同,我们的方法可以适应各种培训目标,包括交易,$ \ ell_p $ -pgd和感知性对手培训。我们进行了广泛的实验,以证明我们的进近可以使对抗性训练加快2-3次,同时在清洁和稳健的精度中略有降解。
translated by 谷歌翻译
It is common practice in deep learning to use overparameterized networks and train for as long as possible; there are numerous studies that show, both theoretically and empirically, that such practices surprisingly do not unduly harm the generalization performance of the classifier. In this paper, we empirically study this phenomenon in the setting of adversarially trained deep networks, which are trained to minimize the loss under worst-case adversarial perturbations. We find that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training across multiple datasets (SVHN, CIFAR-10, CIFAR-100, and ImageNet) and perturbation models ( ∞ and 2 ). Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping. We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting. Finally, we study several classical and modern deep learning remedies for overfitting, including regularization and data augmentation, and find that no approach in isolation improves significantly upon the gains achieved by early stopping. All code for reproducing the experiments as well as pretrained model weights and training logs can be found at https://github.com/ locuslab/robust_overfitting.
translated by 谷歌翻译
The study on improving the robustness of deep neural networks against adversarial examples grows rapidly in recent years. Among them, adversarial training is the most promising one, which flattens the input loss landscape (loss change with respect to input) via training on adversarially perturbed examples. However, how the widely used weight loss landscape (loss change with respect to weight) performs in adversarial training is rarely explored. In this paper, we investigate the weight loss landscape from a new perspective, and identify a clear correlation between the flatness of weight loss landscape and robust generalization gap. Several well-recognized adversarial training improvements, such as early stopping, designing new objective functions, or leveraging unlabeled data, all implicitly flatten the weight loss landscape. Based on these observations, we propose a simple yet effective Adversarial Weight Perturbation (AWP) to explicitly regularize the flatness of weight loss landscape, forming a double-perturbation mechanism in the adversarial training framework that adversarially perturbs both inputs and weights. Extensive experiments demonstrate that AWP indeed brings flatter weight loss landscape and can be easily incorporated into various existing adversarial training methods to further boost their adversarial robustness.
translated by 谷歌翻译
对比度学习(CL)可以通过在其顶部的线性分类器上学习更广泛的特征表示并实现下游任务的最先进的性能。然而,由于对抗性稳健性在图像分类中变得至关重要,但仍然不清楚CL是否能够为下游任务保留鲁棒性。主要挑战是,在自我监督的预押率+监督的FineTuning范式中,由于学习任务不匹配从预先追溯到Fineetuning,对抗性鲁棒性很容易被遗忘。我们称之为挑战“跨任务稳健性转移性”。为了解决上述问题,在本文中,我们通过稳健性增强的镜头重新审视并提前CL原理。我们展示了(1)对比视图的设计事项:图像的高频分量有利于提高模型鲁棒性; (2)使用伪监督刺激(例如,诉诸特征聚类)增强CL,有助于保持稳健性而不会忘记。配备了我们的新设计,我们提出了一种新的对抗对比预制框架的advcl。我们表明Advcl能够增强跨任务稳健性转移性,而不会损失模型精度和芬降效率。通过彻底的实验研究,我们展示了Advcl优于跨多个数据集(CiFar-10,CiFar-100和STL-10)和FineTuning方案的最先进的自我监督的自我监督学习方法(线性评估和满模型fineetuning)。
translated by 谷歌翻译
我们理论上和经验地证明,对抗性鲁棒性可以显着受益于半体验学习。从理论上讲,我们重新审视了Schmidt等人的简单高斯模型。这显示了标准和稳健分类之间的示例复杂性差距。我们证明了未标记的数据桥接这种差距:简单的半体验学习程序(自我训练)使用相同数量的达到高标准精度所需的标签实现高的强大精度。经验上,我们增强了CiFar-10,使用50万微小的图像,使用了8000万微小的图像,并使用强大的自我训练来优于最先进的鲁棒精度(i)$ \ ell_ infty $鲁棒性通过对抗培训和(ii)认证$ \ ell_2 $和$ \ ell_ \ infty $鲁棒性通过随机平滑的几个强大的攻击。在SVHN上,添加DataSet自己的额外训练集,删除的标签提供了4到10个点的增益,在使用额外标签的1点之内。
translated by 谷歌翻译
Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. SIGNSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative 1 / 2 geometry of gradients, noise and curvature informs whether SIGNSGD or SGD is theoretically better suited to a particular problem. On the practical side we find that the momentum counterpart of SIGNSGD is able to match the accuracy and convergence speed of ADAM on deep Imagenet models. We extend our theory to the distributed setting, where the parameter server uses majority vote to aggregate gradient signs from each worker enabling 1-bit compression of worker-server communication in both directions. Using a theorem by Gauss (1823) we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD. Thus, there is great promise for sign-based optimisation schemes to achieve fast communication and fast convergence. Code to reproduce experiments is to be found at https://github.com/jxbz/signSGD.
translated by 谷歌翻译
我们表明,当考虑到图像域$ [0,1] ^ D $时,已建立$ L_1 $ -Projected梯度下降(PGD)攻击是次优,因为它们不认为有效的威胁模型是交叉点$ l_1 $ -ball和$ [0,1] ^ d $。我们研究了这种有效威胁模型的最陡渐进步骤的预期稀疏性,并表明该组上的确切投影是计算可行的,并且产生更好的性能。此外,我们提出了一种自适应形式的PGD,即使具有小的迭代预算,这也是非常有效的。我们的结果$ l_1 $ -apgd是一个强大的白盒攻击,表明先前的作品高估了他们的$ l_1 $ -trobustness。使用$ l_1 $ -apgd for vercersarial培训,我们获得一个强大的分类器,具有sota $ l_1 $ -trobustness。最后,我们将$ l_1 $ -apgd和平方攻击的适应组合到$ l_1 $ to $ l_1 $ -autoattack,这是一个攻击的集合,可靠地评估$ l_1 $ -ball与$的威胁模型的对抗鲁棒性进行对抗[ 0,1] ^ d $。
translated by 谷歌翻译
尽管深层神经网络在各种任务中取得了巨大的成功,但它们对不可察觉的对抗性扰动的脆弱性阻碍了他们在现实世界中的部署。最近,与随机合奏的作品相对于经过最小的计算开销的标准对手训练(AT)模型,对对抗性训练(AT)模型的对抗性鲁棒性有了显着改善,这使它们成为安全临界资源限制应用程序的有前途解决方案。但是,这种令人印象深刻的表现提出了一个问题:这些稳健性是由随机合奏提供的吗?在这项工作中,我们从理论和经验上都解决了这个问题。从理论上讲,我们首先确定通常采用的鲁棒性评估方法(例如自适应PGD)在这种情况下提供了错误的安全感。随后,我们提出了一种理论上有效的对抗攻击算法(ARC),即使在自适应PGD无法做到这一点的情况下,也能妥协随机合奏。我们在各种网络体系结构,培训方案,数据集和规范上进行全面的实验,以支持我们的主张,并经验证明,随机合奏实际上比在模型上更容易受到$ \ ell_p $结合的对抗性扰动的影响。我们的代码可以在https://github.com/hsndbk4/arc上找到。
translated by 谷歌翻译
Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples-inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. 1
translated by 谷歌翻译