Shift Invariance是CNN的关键属性,可提高分类性能。然而,我们表明,与循环偏移的不变性也可能导致对对抗性攻击的更大敏感性。我们首先在使用换档不变线性分类器时表征类之间的余量。我们表明边际只能依赖于信号的DC分量。然后,使用关于无限宽网络的结果,我们显示在一些简单的情况下,完全连接和换档不变神经网络产生线性决策边界。使用这一点,我们证明了神经网络中的换档不变性为两个类的简单情况产生了对手示例,每个案例由灰色背景上的黑色或白点组成的单个图像。这不仅仅是一种好奇心;我们凭经验显示,使用真实的数据集和现实的架构,换档不变性降低了对抗性的鲁棒性。最后,我们描述了使用合成数据来探测这种连接源的初始实验。
translated by 谷歌翻译
对抗性攻击对神经网络来说仍然是一个重大挑战。最近的工作表明,对抗性扰动通常包含高频特征,但这种现象的根本原因仍然未知。灵感来自于线性全宽卷积模型的理论工作,我们假设当前神经网络中常用的本地(即界宽)卷积操作被隐含地偏置以学习高频特征,并且这是根本原因之一高频对抗例。为了测试这一假设,我们在空间和频率域中分析了线性和非线性架构对学习特征和对抗扰动的隐含偏差的影响。我们发现高频对抗性扰动批判性地取决于卷积操作,因为本地互联网的空间有限的性质引起了对高频特征的隐含偏差。后者的解释涉及傅立叶不确定性原理:空间限制(空间域中的本地)滤波器也不能是频率限制(频域中的本地)。此外,使用较大的卷积核尺寸或避免卷曲(例如,通过使用视觉变压器架构)显着降低了这种高频偏差,但不是对攻击的总体易感性。期待着,我们的工作强烈建议了解和控制架构的隐含偏差对于实现对抗性鲁棒性至关重要。
translated by 谷歌翻译
对抗性训练及其变体已成为使用神经网络实现对抗性稳健分类的普遍方法。但是,它的计算成本增加,以及标准性能和稳健性能之间的显着差距阻碍了进步,并提出了我们是否可以做得更好的问题。在这项工作中,我们退后一步,问:模型可以通过适当优化的集合通过标准培训来实现鲁棒性吗?为此,我们设计了一种用于鲁棒分类的元学习方法,该方法以原则性的方式在部署之前优化了数据集,并旨在有效地删除数据的非稳定部分。我们将优化方法作为内核回归的多步PGD程序进行了,其中一类核描述了无限宽的神经网(神经切线核-NTKS)。 MNIST和CIFAR-10的实验表明,当在内核回归分类器和神经网络中部署时,我们生成的数据集对PGD攻击都非常鲁棒性。但是,这种鲁棒性有些谬误,因为替代性攻击设法欺骗了模型,我们发现文献中以前的类似作品也是如此。我们讨论了这一点的潜在原因,并概述了进一步的研究途径。
translated by 谷歌翻译
尽管通常认为在高维度中学习受到维度的诅咒,但现代的机器学习方法通​​常具有惊人的力量,可以解决广泛的挑战性现实世界学习问题而无需使用大量数据。这些方法如何打破这种诅咒仍然是深度学习理论中的一个基本开放问题。尽管以前的努力通过研究数据(D),模型(M)和推理算法(i)作为独立模块来研究了这个问题,但在本文中,我们将三胞胎(D,M,I)分析为集成系统和确定有助于减轻维度诅咒的重要协同作用。我们首先研究了与各种学习算法(M,i)相关的基本对称性,重点是深度学习中的四个原型体系结构:完全连接的网络(FCN),本地连接的网络(LCN)和卷积网络,而无需合并(有和没有合并)( GAP/VEC)。我们发现,当这些对称性与数据分布的对称性兼容时,学习是最有效的,并且当(d,m,i)三重态的任何成员不一致或次优时,性能会显着恶化。
translated by 谷歌翻译
Machine learning models are often susceptible to adversarial perturbations of their inputs. Even small perturbations can cause state-of-the-art classifiers with high "standard" accuracy to produce an incorrect prediction with high confidence. To better understand this phenomenon, we study adversarially robust learning from the viewpoint of generalization. We show that already in a simple natural data model, the sample complexity of robust learning can be significantly larger than that of "standard" learning. This gap is information theoretic and holds irrespective of the training algorithm or the model family. We complement our theoretical results with experiments on popular image classification datasets and show that a similar gap exists here as well. We postulate that the difficulty of training robust classifiers stems, at least partially, from this inherently larger sample complexity.
translated by 谷歌翻译
Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples-inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. 1
translated by 谷歌翻译
对抗性的鲁棒性已经成为深度学习的核心目标,无论是在理论和实践中。然而,成功的方法来改善对抗的鲁棒性(如逆势训练)在不受干扰的数据上大大伤害了泛化性能。这可能会对对抗性鲁棒性如何影响现实世界系统的影响(即,如果它可以提高未受干扰的数据的准确性),许多人可能选择放弃鲁棒性)。我们提出内插对抗培训,该培训最近雇用了在对抗培训框架内基于插值的基于插值的培训方法。在CiFar -10上,对抗性训练增加了标准测试错误(当没有对手时)从4.43%到12.32%,而我们的内插对抗培训我们保留了对抗性的鲁棒性,同时实现了仅6.45%的标准测试误差。通过我们的技术,强大模型标准误差的相对增加从178.1%降至仅为45.5%。此外,我们提供内插对抗性培训的数学分析,以确认其效率,并在鲁棒性和泛化方面展示其优势。
translated by 谷歌翻译
在过去的几年中,卷积神经网络(CNN)一直是广泛的计算机视觉任务中的主导神经架构。从图像和信号处理的角度来看,这一成功可能会令人惊讶,因为大多数CNN的固有空间金字塔设计显然违反了基本的信号处理法,即在其下采样操作中对定理进行采样。但是,由于不良的采样似乎不影响模型的准确性,因此在模型鲁棒性开始受到更多关注之前,该问题已被广泛忽略。最近的工作[17]在对抗性攻击和分布变化的背景下,毕竟表明,CNN的脆弱性与不良下降采样操作引起的混叠伪像之间存在很强的相关性。本文以这些发现为基础,并引入了一个可混合的免费下采样操作,可以轻松地插入任何CNN体系结构:频lowcut池。我们的实验表明,结合简单而快速的FGSM对抗训练,我们的超参数无操作员显着提高了模型的鲁棒性,并避免了灾难性的过度拟合。
translated by 谷歌翻译
We propose a method to learn deep ReLU-based classifiers that are provably robust against normbounded adversarial perturbations on the training data. For previously unseen examples, the approach is guaranteed to detect all adversarial examples, though it may flag some non-adversarial examples as well. The basic idea is to consider a convex outer approximation of the set of activations reachable through a norm-bounded perturbation, and we develop a robust optimization procedure that minimizes the worst case loss over this outer region (via a linear program). Crucially, we show that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss. The end result is that by executing a few more forward and backward passes through a slightly modified version of the original network (though possibly with much larger batch sizes), we can learn a classifier that is provably robust to any norm-bounded adversarial attack. We illustrate the approach on a number of tasks to train classifiers with robust adversarial guarantees (e.g. for MNIST, we produce a convolutional classifier that provably has less than 5.8% test error for any adversarial attack with bounded ∞ norm less than = 0.1), and code for all experiments is available at http://github.com/ locuslab/convex_adversarial.
translated by 谷歌翻译
最近的工作表明,不同体系结构的卷积神经网络学会按照相同的顺序对图像进行分类。为了理解这种现象,我们重新审视了过度参数的深度线性网络模型。我们的分析表明,当隐藏层足够宽时,该模型参数的收敛速率沿数据的较大主组件的方向呈指数级数,该方向由由相应的奇异值控制的速率。我们称这种收敛模式主成分偏差(PC偏置)。从经验上讲,我们展示了PC偏差如何简化线性和非线性网络的学习顺序,在学习的早期阶段更为突出。然后,我们将结果与简单性偏见进行比较,表明可以独立看到这两个偏见,并以不同的方式影响学习顺序。最后,我们讨论了PC偏差如何解释早期停止及其与PCA的联系的一些好处,以及为什么深网与随机标签更慢地收敛。
translated by 谷歌翻译
由明确的反对派制作的对抗例子在机器学习中引起了重要的关注。然而,潜在虚假朋友带来的安全风险基本上被忽视了。在本文中,我们揭示了虚伪的例子的威胁 - 最初被错误分类但是虚假朋友扰乱的投入,以强迫正确的预测。虽然这种扰动的例子似乎是无害的,但我们首次指出,它们可能是恶意地用来隐瞒评估期间不合格(即,不如所需)模型的错误。一旦部署者信任虚伪的性能并在真实应用程序中应用“良好的”模型,即使在良性环境中也可能发生意外的失败。更严重的是,这种安全风险似乎是普遍存在的:我们发现许多类型的不合标准模型易受多个数据集的虚伪示例。此外,我们提供了第一次尝试,以称为虚伪风险的公制表征威胁,并试图通过一些对策来规避它。结果表明对策的有效性,即使在自适应稳健的培训之后,风险仍然是不可忽视的。
translated by 谷歌翻译
Deep neural networks may easily memorize noisy labels present in real-world data, which degrades their ability to generalize. It is therefore important to track and evaluate the robustness of models against noisy label memorization. We propose a metric, called susceptibility, to gauge such memorization for neural networks. Susceptibility is simple and easy to compute during training. Moreover, it does not require access to ground-truth labels and it only uses unlabeled data. We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets and provide theoretical insights into the design of the susceptibility metric. Finally, we show through extensive experiments on datasets with synthetic and real-world label noise that one can utilize susceptibility and the overall training accuracy to distinguish models that maintain a low memorization on the training set and generalize well to unseen clean data.
translated by 谷歌翻译
现有针对对抗性示例(例如对抗训练)的防御能力通常假设对手将符合特定或已知的威胁模型,例如固定预算内的$ \ ell_p $扰动。在本文中,我们关注的是在训练过程中辩方假设的威胁模型中存在不匹配的情况,以及在测试时对手的实际功能。我们问一个问题:学习者是否会针对特定的“源”威胁模型进行训练,我们什么时候可以期望鲁棒性在测试时间期间概括为更强大的未知“目标”威胁模型?我们的主要贡献是通过不可预见的对手正式定义学习和概括的问题,这有助于我们从常规的对手的传统角度来理解对抗风险的增加。应用我们的框架,我们得出了将源和目标威胁模型之间的概括差距与特征提取器变化相关联的概括,该限制衡量了在给定威胁模型中提取的特征之间的预期最大差异。基于我们的概括结合,我们提出了具有变化正则化(AT-VR)的对抗训练,该训练在训练过程中降低了特征提取器在源威胁模型中的变化。我们从经验上证明,与标准的对抗训练相比,AT-VR可以改善测试时间内的概括,从而无法预见。此外,我们将变异正则化与感知对抗训练相结合[Laidlaw等。 2021]以实现不可预见的攻击的最新鲁棒性。我们的代码可在https://github.com/inspire-group/variation-regularization上公开获取。
translated by 谷歌翻译
Modern convolutional networks are not shiftinvariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and averagepooling, ignore the sampling theorem. The wellknown signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks degrades performance; as a result, it is seldomly used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling and strided-convolution. We observe increased accuracy in ImageNet classification, across several commonly-used architectures, such as ResNet, DenseNet, and MobileNet, indicating effective regularization. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.
translated by 谷歌翻译
尽管使用对抗性训练捍卫深度学习模型免受对抗性扰动的经验成功,但到目前为止,仍然不清楚对抗性扰动的存在背后的原则是什么,而对抗性培训对神经网络进行了什么来消除它们。在本文中,我们提出了一个称为特征纯化的原则,在其中,我们表明存在对抗性示例的原因之一是在神经网络的训练过程中,在隐藏的重量中积累了某些小型密集混合物;更重要的是,对抗训练的目标之一是去除此类混合物以净化隐藏的重量。我们介绍了CIFAR-10数据集上的两个实验,以说明这一原理,并且一个理论上的结果证明,对于某些自然分类任务,使用随机初始初始化的梯度下降训练具有RELU激活的两层神经网络确实满足了这一原理。从技术上讲,我们给出了我们最大程度的了解,第一个结果证明,以下两个可以同时保持使用RELU激活的神经网络。 (1)对原始数据的训练确实对某些半径的小对抗扰动确实不舒适。 (2)即使使用经验性扰动算法(例如FGM),实际上也可以证明对对抗相同半径的任何扰动也可以证明具有强大的良好性。最后,我们还证明了复杂性的下限,表明该网络的低复杂性模型,例如线性分类器,低度多项式或什至是神经切线核,无论使用哪种算法,都无法防御相同半径的扰动训练他们。
translated by 谷歌翻译
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples. This includes distribution shifts, outliers, and adversarial examples. To address these issues, we propose Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations. Manifold Mixup leverages semantic interpolations as additional training signal, obtaining neural networks with smoother decision boundaries at multiple levels of representation. As a result, neural networks trained with Manifold Mixup learn class-representations with fewer directions of variance. We prove theory on why this flattening happens under ideal conditions, validate it on practical situations, and connect it to previous works on information theory and generalization. In spite of incurring no significant computation and being implemented in a few lines of code, Manifold Mixup improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
translated by 谷歌翻译
We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
translated by 谷歌翻译
The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the l p -norms for p ∈ {1, 2, ∞} aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high quality results, minimizes the size of the perturbation (so that it returns the robust accuracy at every threshold with a single run). It performs better or similar to stateof-the-art attacks which are partially specialized to one l p -norm, and is robust to the phenomenon of gradient masking.
translated by 谷歌翻译
理解为什么深网络可以在大尺寸中对数据进行分类仍然是一个挑战。已经提出了它们通过变得稳定的差异术,但现有的经验测量值得支持它通常不是这种情况。我们通过定义弥散术的最大熵分布来重新审视这个问题,这允许研究给定规范的典型的扩散术。我们确认对基准数据集的稳定性与基准数据集的性能没有强烈关联。相比之下,我们发现,对于普通转换的稳定性,R_F $的稳定性与测试错误$ \ epsilon_t $相比。在初始化时,它是初始化的统一,但在最先进的架构培训期间减少了几十年。对于CiFar10和15名已知的架构,我们发现$ \ epsilon_t \约0.2 \ sqrt {r_f} $,表明获得小$ r_f $非常重要,无法实现良好的性能。我们研究R_F $如何取决于培训集的大小,并将其与简单的不变学习模型进行比较。
translated by 谷歌翻译
小组卷积神经网络(G-CNN)是卷积神经网络(CNN)的概括,通过在其体系结构中明确编码旋转和排列,在广泛的技术应用中脱颖而出。尽管G-CNN的成功是由它们的\ emph {emplapicit}对称偏见驱动的,但最近的一项工作表明,\ emph {隐式}对特定体系结构的偏差是理解过度参数化神经网的概​​括的关键。在这种情况下,我们表明,通过梯度下降训练了二进制分类的$ L $ layer全宽线性G-CNN,将二进制分类收敛到具有低级别傅立叶矩阵系数的解决方案,并由$ 2/l $ -schatten矩阵规范正规化。我们的工作严格概括了先前对线性CNN的隐性偏差对线性G-CNN的隐性分析,包括所有有限组,包括非交换组的挑战性设置(例如排列),以及无限组的频段限制G-CNN 。我们通过在各个组上实验验证定理,并在经验上探索更现实的非线性网络,该网络在局部捕获了相似的正则化模式。最后,我们通过不确定性原理提供了对傅立叶空间隐式正则化的直观解释。
translated by 谷歌翻译