Machine learning models are often susceptible to adversarial perturbations of their inputs. Even small perturbations can cause state-of-the-art classifiers with high "standard" accuracy to produce an incorrect prediction with high confidence. To better understand this phenomenon, we study adversarially robust learning from the viewpoint of generalization. We show that already in a simple natural data model, the sample complexity of robust learning can be significantly larger than that of "standard" learning. This gap is information theoretic and holds irrespective of the training algorithm or the model family. We complement our theoretical results with experiments on popular image classification datasets and show that a similar gap exists here as well. We postulate that the difficulty of training robust classifiers stems, at least partially, from this inherently larger sample complexity.
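As a rough illustration of the gap described above, the following sketch simulates a two-class Gaussian model and compares the standard and $\ell_\infty$-robust accuracy of a simple averaging classifier as the number of training samples grows. The dimension, noise level, perturbation radius, and all function names are illustrative assumptions, not values or code from the paper. For a linear classifier $x \mapsto \mathrm{sign}(\langle w, x\rangle)$, the worst-case $\ell_\infty$ perturbation of radius $\epsilon$ lowers the margin by exactly $\epsilon\|w\|_1$, which is what the robust check uses.

```python
import random


def sample(theta, sigma, n, rng):
    """Draw n points: y uniform in {-1, +1}, x ~ N(y * theta, sigma^2 I)."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = [y * t + rng.gauss(0.0, sigma) for t in theta]
        data.append((x, y))
    return data


def fit_mean_classifier(train):
    """Linear classifier whose weight vector is the average of y_i * x_i."""
    d = len(train[0][0])
    w = [0.0] * d
    for x, y in train:
        for j in range(d):
            w[j] += y * x[j] / len(train)
    return w


def accuracies(w, test, eps):
    """Return (standard, l_inf-robust) accuracy of x -> sign(<w, x>)."""
    l1 = sum(abs(v) for v in w)
    std = rob = 0
    for x, y in test:
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        std += margin > 0          # correct on the clean input
        rob += margin > eps * l1   # correct for every perturbation in the eps-ball
    return std / len(test), rob / len(test)


def run(n_train, d=200, sigma=3.0, eps=0.25, n_test=500, seed=0):
    # d, sigma, eps are illustrative assumptions, not the paper's parameters.
    rng = random.Random(seed)
    theta = [1.0] * d
    w = fit_mean_classifier(sample(theta, sigma, n_train, rng))
    return accuracies(w, sample(theta, sigma, n_test, rng), eps)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        std, rob = run(n)
        print(f"n={n:3d}  standard={std:.3f}  robust={rob:.3f}")
```

Robust accuracy can never exceed standard accuracy in this setup, since the robust condition is strictly harder to satisfy; how quickly each one improves with `n_train` depends on the assumed parameters.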
We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features (derived from patterns in the data distribution) that are highly predictive, yet brittle and (thus) incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.
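In supervised learning, it has been shown that label noise in the data can, in many circumstances, be interpolated without penalty on test accuracy. We show that interpolating label noise induces adversarial vulnerability, and we prove the first theorem showing how the relationship between label noise and adversarial risk depends on the data distribution. Our results are almost sharp without accounting for the inductive bias of the learning algorithm. We also show that inductive bias makes the effect of label noise much stronger.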
We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models. In the first model, we are given $n$ i.i.d. samples from the distribution $\mathcal{N}\left(\theta,I_d\right)$ (with unknown $\theta$), of which a small fraction has been arbitrarily corrupted. Under the promise that $\|\theta\|_0\le s$, we want to correctly distinguish whether $\|\theta\|_2=0$ or $\|\theta\|_2>\gamma$, for some input parameter $\gamma>0$. We show that any algorithm for this task requires $n=\Omega\left(s\log\frac{ed}{s}\right)$ samples, which is tight up to logarithmic factors. We also extend our results to other common notions of sparsity, namely, $\|\theta\|_q\le s$ for any $0 < q < 2$. In the second observation model that we consider, the data is generated according to a sparse linear regression model, where the covariates are i.i.d. Gaussian and the regression coefficient (signal) is known to be $s$-sparse. Here too we assume that an $\epsilon$-fraction of the data is arbitrarily corrupted. We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n=\Omega\left(\min(s\log d,{1}/{\gamma^4})\right)$ samples. Our results show that the complexity of testing in these two settings significantly increases under robustness constraints. This is in line with the recent observations made in robust mean testing and robust covariance testing.
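We consider the sample complexity of learning with adversarial robustness. Most existing theoretical results for this problem consider a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity tells an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $\Omega(\frac{d}{n})$, whereas the max-margin algorithm has expected standard loss $O(\frac{1}{n})$. This shows a gap between the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss $O(\frac{1}{n})$. This shows that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable, which is not the case otherwise. Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).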
State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set and are known to fit well even random labels. This tendency to memorize the labels of the training data is not explained by existing theoretical analyses. Memorization of the training data also presents significant privacy risks when the training data contains sensitive personal information, and thus it is important to understand whether such memorization is necessary for accurate learning. We provide the first conceptual explanation and a theoretical model for this phenomenon. Specifically, we demonstrate that for natural data distributions memorization of labels is necessary for achieving close-to-optimal generalization error. Crucially, even labels of outliers and noisy labels need to be memorized. The model is motivated and supported by the results of several recent empirical works. In our model, data is sampled from a mixture of subpopulations, and our results show that memorization is necessary whenever the distribution of subpopulation frequencies is long-tailed. Image and text data are known to be long-tailed, and therefore our results establish a formal link between these empirical phenomena. Our results allow us to quantify the cost of limiting memorization in learning and explain the disparate effects that privacy and model compression have on different subgroups.
Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples: inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models.
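The inner maximization of the robust optimization view can be sketched with a projected-gradient first-order adversary. The example below is a minimal sketch for a linear logistic model rather than a neural network, and it starts from the clean input instead of a random point; the function names, step size, and iteration count are assumptions made for illustration. Each step moves the input along the sign of the loss gradient and projects back onto the $\ell_\infty$ ball of radius `eps` around the original input.

```python
import math


def logistic_loss(w, x, y):
    """Numerically stable log(1 + exp(-y * <w, x>))."""
    z = -y * sum(wj * xj for wj, xj in zip(w, x))
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def input_gradient(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    z = -y * sum(wj * xj for wj, xj in zip(w, x))
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))      # sigmoid(z), stable for z >= 0
    else:
        e = math.exp(z)
        s = e / (1.0 + e)                   # sigmoid(z), stable for z < 0
    return [-y * s * wj for wj in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, x, y, eps, step, iters):
    """Projected sign-gradient ascent on the loss inside the l_inf ball around x."""
    x_adv = list(x)
    for _ in range(iters):
        g = input_gradient(w, x_adv, y)
        for j in range(len(x_adv)):
            moved = x_adv[j] + step * sign(g[j])
            # Projection: clip each coordinate to [x_j - eps, x_j + eps].
            x_adv[j] = min(max(moved, x[j] - eps), x[j] + eps)
    return x_adv


if __name__ == "__main__":
    w, x, y = [0.5, -1.0, 2.0], [1.0, 0.2, -0.3], 1
    x_adv = pgd_linf(w, x, y, eps=0.1, step=0.025, iters=10)
    print("clean loss:", logistic_loss(w, x, y))
    print("adversarial loss:", logistic_loss(w, x_adv, y))
```

For a linear model the attack reaches the closed-form worst case `x - eps * y * sign(w)`; for a neural network the same loop would need gradients from an automatic differentiation library and carries no such guarantee.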
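We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semi-supervised learning. Theoretically, we revisit the simple Gaussian model of Schmidt et al. that shows a sample complexity gap between standard and robust classification. We prove that unlabeled data bridges this gap: a simple semi-supervised learning procedure (self-training) achieves high robust accuracy using the same number of labels required for achieving high standard accuracy. Empirically, we augment CIFAR-10 with 500K unlabeled images sourced from 80 Million Tiny Images and use robust self-training to outperform state-of-the-art robust accuracies in (i) $\ell_\infty$ robustness against several strong attacks via adversarial training and (ii) certified $\ell_2$ and $\ell_\infty$ robustness via randomized smoothing. On SVHN, adding the dataset's own extra training set with the labels removed provides gains of 4 to 10 points, within 1 point of the gain from using the extra labels.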
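The self-training procedure can be sketched in a two-class Gaussian model as follows. This is a minimal sketch under stated assumptions: the averaging estimator, the parameter values, and the function names are illustrative choices and are not taken from the paper, and the final step here is a plain average rather than a robust training procedure. An intermediate classifier is fit on the few labeled points, it pseudo-labels the unlabeled points, and a final classifier is fit on the pseudo-labeled data.

```python
import random


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def sample(theta, sigma, n, rng):
    """Draw n points: y uniform in {-1, +1}, x ~ N(y * theta, sigma^2 I)."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.choice((-1, 1))
        xs.append([y * t + rng.gauss(0.0, sigma) for t in theta])
        ys.append(y)
    return xs, ys


def mean_estimator(xs, ys):
    """Weight vector equal to the average of y_i * x_i."""
    d = len(xs[0])
    w = [0.0] * d
    for x, y in zip(xs, ys):
        for j in range(d):
            w[j] += y * x[j] / len(xs)
    return w


def self_train(labeled_x, labeled_y, unlabeled_x):
    """Fit on labeled data, pseudo-label the unlabeled data, then refit."""
    w_int = mean_estimator(labeled_x, labeled_y)
    pseudo = [1 if dot(w_int, x) >= 0 else -1 for x in unlabeled_x]
    w_final = mean_estimator(unlabeled_x, pseudo)
    return w_int, pseudo, w_final


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [1.0] * 100                                   # assumed class mean
    lx, ly = sample(theta, 3.0, 5, rng)                   # few labeled points
    ux, _ = sample(theta, 3.0, 500, rng)                  # labels discarded
    w_int, pseudo, w_final = self_train(lx, ly, ux)
    print("pseudo-labeled points:", len(pseudo))
```

The final estimator averages far more points than the intermediate one, which is the intuition for why unlabeled data can help; whether it helps in a given run depends on how accurate the pseudo-labels are.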
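Despite the empirical success of using adversarial training to defend deep learning models against adversarial perturbations, it remains rather unclear what principles lie behind the existence of adversarial perturbations, and what adversarial training does to the neural network to remove them. In this paper, we present a principle that we call feature purification, where we show that one of the causes of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network; more importantly, one of the goals of adversarial training is to remove such mixtures to purify the hidden weights. We present experiments on the CIFAR-10 dataset to illustrate this principle, together with a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle. Technically, we give, to the best of our knowledge, the first result proving that the following two can hold simultaneously for training a neural network with ReLU activation. (1) Training over the original data is indeed non-robust to small adversarial perturbations of some radius. (2) Adversarial training, even with an empirical perturbation algorithm such as FGM, can in fact be provably robust against any perturbations of the same radius. Finally, we also prove a complexity lower bound, showing that low-complexity models such as linear classifiers, low-degree polynomials, or even the neural tangent kernel for this network cannot defend against perturbations of this same radius, no matter what algorithms are used to train them.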
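Successful deep learning models often involve training neural network architectures that contain more parameters than the number of training samples. Such overparametrized models have been extensively studied in recent years, and the virtues of overparametrization have been established from both the statistical perspective, via the double-descent phenomenon, and the computational perspective, via the structural properties of the optimization landscape. Despite the remarkable success of deep learning architectures in the overparametrized regime, it is also well known that these models are highly vulnerable to small adversarial perturbations in their inputs. Even when adversarially trained, their performance on perturbed inputs (robust generalization) is considerably worse than their best attainable performance on benign inputs (standard generalization). It is thus imperative to understand how overparametrization fundamentally affects robustness. In this paper, we provide a precise characterization of the role of overparametrization on robustness by focusing on random features regression models (two-layer neural networks with random first-layer weights). We consider a regime where the sample size, the input dimension, and the number of parameters grow in proportion to each other, and derive an asymptotically exact formula for the robust generalization error when the model is adversarially trained. Our developed theory reveals the nontrivial effect of overparametrization on robustness and indicates that for adversarially trained random features models, high overparametrization can hurt robust generalization.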
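We study the problem of PAC learning halfspaces with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0, 1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT}$ is the error of the target halfspace. This problem had previously been studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter $\eta$ is strictly smaller than $1/2$. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following. For $\eta < 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_{\eta}(\log(1/\gamma))} \mathrm{poly}(1/\epsilon)$, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ is the bias of the target halfspace $f$. Prior efficient algorithms could only handle the special case of $\gamma = 1/2$. Interestingly, we establish a qualitatively matching lower bound of $d^{\Omega(\log(1/\gamma))}$ on the complexity of any Statistical Query (SQ) algorithm. For $\eta = 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $O_\epsilon(1)\, d^{O(\log(1/\epsilon))}$. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for $\eta = 1/2$. We complement our upper bound with a nearly matching SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$, which holds even for the special case of homogeneous halfspaces.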
Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing regularizers are ineffective in the over-parameterized regime, in which classifiers perfectly fit (i.e., interpolate) the training data. This suggests that the phenomenon of "benign overfitting," in which models generalize well despite interpolating, might not favorably extend to settings in which robustness or fairness are desirable. In this work we provide a theoretical justification for these observations. We prove that, even in the simplest of settings, any interpolating learning rule (with arbitrarily small margin) will not satisfy these invariance properties. We then propose and analyze an algorithm that, in the same setting, successfully learns a non-interpolating classifier that is provably invariant. We validate our theoretical observations on simulated data and the Waterbirds dataset.
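One fundamental goal of high-dimensional statistics is to detect or recover planted structure (such as a low-rank matrix) hidden in noisy data. A growing body of work studies low-degree polynomials as a restricted model of computation for such problems: in various settings, low-degree polynomials of the data can match the statistical performance of the best known polynomial-time algorithms. Prior work has studied the power of low-degree polynomials for the task of detecting the presence of hidden structures. In this work, we extend these methods to address problems of estimation and recovery (instead of detection). For a large class of "signal plus noise" problems, we give a user-friendly lower bound for the best possible mean squared error. To our knowledge, these are the first results to establish low-degree hardness of recovery problems for which the associated detection problem is easy. As applications, we give a tight characterization of the low-degree minimum mean squared error for the planted submatrix and planted dense subgraph problems, resolving (in the low-degree framework) open problems about the computational complexity of recovery in both cases.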
We establish a simple connection between robust and differentially-private algorithms: private mechanisms which perform well with very high probability are automatically robust in the sense that they retain accuracy even if a constant fraction of the samples they receive are adversarially corrupted. Since optimal mechanisms typically achieve these high success probabilities, our results imply that optimal private mechanisms for many basic statistics problems are robust. We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems. Assuming the Brennan-Bresler secret-leakage planted clique conjecture, we demonstrate a fundamental tradeoff between computational efficiency, privacy leakage, and success probability for sparse mean estimation. Private algorithms which match this tradeoff are not yet known -- we achieve that (up to polylogarithmic factors) in a polynomially-large range of parameters via the Sum-of-Squares method. To establish an information-computation gap for private sparse mean estimation, we also design new (exponential-time) mechanisms using fewer samples than efficient algorithms must use. Finally, we give evidence for privacy-induced information-computation gaps for several other statistics and learning problems, including PAC learning parity functions and estimation of the mean of a multivariate Gaussian.
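We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{\pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $\eta(x) \leq \eta \leq 1/2$, the goal is to compute a hypothesis with small misclassification error. The best known $\mathrm{poly}(d, 1/\epsilon)$-time algorithms for this problem achieve error of $\eta + \epsilon$, which can be far from the optimal bound of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT} = \mathbf{E}_{x \sim D_x}[\eta(x)]$. While it is known that achieving $\mathrm{OPT} + o(1)$ error requires super-polynomial time in the Statistical Query model, a large gap remains between the known upper and lower bounds. In this work, we essentially characterize the efficient learnability of Massart halfspaces in the Statistical Query (SQ) model. Specifically, we show that no efficient SQ algorithm for learning Massart halfspaces on $\mathbb{R}^d$ can achieve error better than $\Omega(\eta)$, even if $\mathrm{OPT} = 2^{-\log^{c}(d)}$, for any universal constant $c \in (0, 1)$. Furthermore, when the noise upper bound $\eta$ is close to $1/2$, our error lower bound becomes $\eta - o_{\eta}(1)$, where the $o_{\eta}(1)$ term goes to $0$ as $\eta$ approaches $1/2$. Our results provide strong evidence that known learning algorithms for Massart halfspaces are nearly best possible, thereby resolving a longstanding open problem in learning theory.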
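"Benign overfitting," where classifiers memorize noisy training data yet still achieve good generalization performance, has drawn great attention in the machine learning community. To explain this surprising phenomenon, a series of works have provided theoretical justification in over-parameterized linear regression, classification, and kernel methods. However, it is not clear whether benign overfitting still occurs in the presence of adversarial examples, i.e., examples with tiny and intentional perturbations designed to fool the classifiers. In this paper, we show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples. In detail, we prove risk bounds for the adversarially trained linear classifier on a mixture of sub-Gaussian data under $\ell_p$ adversarial perturbations. Our result suggests that under moderate perturbations, adversarially trained linear classifiers can achieve near-optimal standard and adversarial risks, despite overfitting the noisy training data. Numerical experiments validate our theoretical findings.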
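The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory. With variants ranging from classical settings like PAC learning and regression to recent trends such as adversarially robust and private learning, we still lack a unified theory; traditional proofs of the equivalence tend to be disparate and rely on strong model-specific assumptions like uniform convergence and sample compression. In this work, we give the first model-independent framework explaining the equivalence of realizable and agnostic learnability: a three-line black-box reduction that simplifies, unifies, and extends our understanding across a wide variety of settings. This includes models with no known characterization of learnability, such as learning with arbitrary distributional assumptions or general loss, as well as a host of other popular settings such as robust learning, partial learning, fair learning, and the statistical query model. More generally, we argue that the equivalence of realizable and agnostic learning is actually a special case of a broader phenomenon we call property generalization: any desirable property of a learning algorithm (e.g., noise tolerance, privacy, stability) that can be satisfied over finite hypothesis classes extends (possibly in some variation) to any learnable hypothesis class.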
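We study the problem of high-dimensional sparse mean estimation in the presence of an $\epsilon$-fraction of adversarial outliers. Prior work obtained sample- and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on $\mathbb{R}^d$ with "certifiably bounded" $t$-th moments and sufficiently light tails, our algorithm achieves error of $O(\epsilon^{1-1/t})$ with sample complexity $m = (k \log(d))^{O(t)}/\epsilon^{2-2/t}$. For the special case of the Gaussian distribution, our algorithm achieves near-optimal error of $\tilde{O}(\epsilon)$ with sample complexity $m = O(k^4 \mathrm{polylog}(d))/\epsilon^2$. Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach. We complement our upper bounds with Statistical Query and low-degree polynomial testing lower bounds, providing evidence that the sample-time-error tradeoffs achieved by our algorithms are qualitatively the best possible.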
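A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set. The watermark does not impact the test-time performance of the model on typical data; however, the model reliably errs on watermarked examples. To gain a better foundational understanding of backdoor data poisoning attacks, we present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems. We then use this framework to analyze important statistical and computational issues surrounding these attacks. On the statistical front, we identify a parameter we call the memorization capacity that captures the intrinsic vulnerability of a learning problem to a backdoor attack. This allows us to argue about the robustness of several natural learning problems to backdoor attacks. Our results favoring the attacker involve presenting explicit constructions of backdoor attacks, and our robustness results show that some natural problem settings cannot yield successful backdoor attacks. From a computational standpoint, we show that under certain assumptions, adversarial training can detect the presence of backdoors in a training set. We then show that under similar assumptions, two closely related problems we call backdoor filtering and robust generalization are nearly equivalent. This implies that it is both asymptotically necessary and sufficient to design algorithms that can identify watermarked examples in the training set in order to obtain a learning algorithm that both generalizes well to unseen data and is robust to backdoors.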