Adversarial examples can easily degrade the classification performance of neural networks. Empirical methods for promoting robustness to such examples have been proposed, but often lack analytical insights and formal guarantees. Recently, some robustness certificates based on system-theoretic notions have appeared in the literature. This work proposes an incremental-dissipativity-based robustness certificate for neural networks, in the form of a linear matrix inequality for each layer. We also propose an equivalent spectral-norm criterion for this certificate, which is scalable to neural networks with multiple layers. We demonstrate the performance of the certificate against adversarial attacks on feedforward neural networks trained on MNIST and on AlexNet trained on CIFAR-10.
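To make the layerwise LMI concrete, the following is a minimal sketch of such a certificate, assuming a one-hidden-layer ReLU network and a LipSDP-style formulation rather than the paper's exact incremental-dissipativity LMI; all dimensions are placeholders.

```python
# A minimal sketch, assuming f(x) = W1 @ relu(W0 @ x); ReLU is
# slope-restricted on [0, 1], which yields the incremental quadratic
# constraint encoded by the diagonal multiplier T below.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 3
W0 = rng.standard_normal((n_hid, n_in))
W1 = rng.standard_normal((n_out, n_hid))

rho = cp.Variable(nonneg=True)          # rho = L^2, squared Lipschitz bound
lam = cp.Variable(n_hid, nonneg=True)   # multipliers for the activation
T = cp.diag(lam)

# Layerwise LMI: [[-rho*I, W0^T T], [T W0, -2T + W1^T W1]] <= 0
M = cp.bmat([
    [-rho * np.eye(n_in), W0.T @ T],
    [T @ W0, -2 * T + W1.T @ W1],
])
prob = cp.Problem(cp.Minimize(rho), [(M + M.T) / 2 << 0])
prob.solve(solver=cp.SCS)
print("certified Lipschitz bound:", np.sqrt(rho.value))
```

Feasibility of the LMI certifies that $\sqrt{\rho}$ bounds the incremental gain of the network, and minimizing $\rho$ gives the tightest certificate in this family.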
Implicit neural networks are a general class of learning models that replace the layers in traditional feedforward models with implicit algebraic equations. Compared to traditional learning models, implicit networks offer competitive performance and reduced memory consumption, yet they can remain vulnerable to adversarial input perturbations. This paper proposes a theoretical and computational framework for the robustness verification of implicit neural networks; our framework blends together mixed monotone systems theory and contraction theory. First, given an implicit neural network, we introduce a related embedded network and show that, given an $\ell_\infty$-norm box constraint on the input, the embedded network provides an $\ell_\infty$-norm box over-approximation of the output of the given network. Second, using $\ell_\infty$-matrix measures, we propose sufficient conditions for well-posedness of both the original and the embedded system, and we design an iterative algorithm to compute the $\ell_\infty$-norm box robustness margins for reachability and classification problems. Third, and of independent value, we propose a novel relative classifier variable that leads to tighter bounds on the certified adversarial robustness in classification problems. Finally, we perform numerical simulations on a Non-Euclidean Monotone Operator Network (NEMON) trained on the MNIST dataset. In these simulations, we compare the accuracy and run time of our mixed monotone contractive approach with existing robustness verification approaches in the literature for estimating certified adversarial robustness.
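To make the embedding idea concrete, here is a minimal sketch, assuming a ReLU implicit network $x = \phi(Ax + Bu + b)$ with $\|A\|_\infty < 1$ for well-posedness (a stronger, simpler condition than the matrix-measure one used in the paper); dimensions and the input box are placeholders.

```python
# A minimal sketch: positive/negative splittings of A and B make the
# embedded dynamics monotone, so the iterates bracket the true state.
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def embedded_box(A, B, b, u_lo, u_hi, iters=200):
    Ap, An = np.clip(A, 0, None), np.clip(A, None, 0)
    Bp, Bn = np.clip(B, 0, None), np.clip(B, None, 0)
    x_lo = np.zeros(A.shape[0])
    x_hi = np.zeros(A.shape[0])
    for _ in range(iters):
        new_hi = relu(Ap @ x_hi + An @ x_lo + Bp @ u_hi + Bn @ u_lo + b)
        new_lo = relu(Ap @ x_lo + An @ x_hi + Bp @ u_lo + Bn @ u_hi + b)
        x_hi, x_lo = new_hi, new_lo
    return x_lo, x_hi          # elementwise box containing the network state

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A *= 0.9 / np.abs(A).sum(axis=1).max()       # enforce ||A||_inf < 1
B, b = rng.standard_normal((5, 3)), np.zeros(5)
lo, hi = embedded_box(A, B, b, -0.1 * np.ones(3), 0.1 * np.ones(3))
print(lo, hi)
```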
In this work, we propose a dissipativity-based method for Lipschitz constant estimation of 1D convolutional neural networks (CNNs). In particular, we analyze the dissipativity properties of convolutional, pooling, and fully connected layers, making use of incremental quadratic constraints for nonlinear activation functions and pooling operations. The Lipschitz constant of the concatenation of these mappings is then estimated by solving a semidefinite program which we derive from dissipativity theory. To make our method as efficient as possible, we take the structure of convolutional layers into account, realizing these finite impulse response filters as causal dynamical systems in state space and carrying out the dissipativity analysis for the state space realizations. The examples we provide show that our Lipschitz bounds are advantageous in terms of accuracy and scalability.
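The state-space viewpoint is the standard shift-register realization of an FIR filter; a hedged single-channel sketch (kernel values are placeholders) is given below and checked against direct convolution.

```python
# A minimal sketch: realize y[t] = sum_i h[i] * u[t-i] with the state
# holding the k-1 most recent past inputs.
import numpy as np

def fir_state_space(h):
    k = len(h)
    n = k - 1                                          # state dimension
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)                         # shift the register
    B = np.zeros((n, 1))
    B[0, 0] = 1.0                                      # newest input enters
    C = np.asarray(h[1:], dtype=float).reshape(1, n)   # taps on past inputs
    D = np.array([[float(h[0])]])                      # direct feedthrough
    return A, B, C, D

h = np.array([0.5, -0.2, 0.1])
A, B, C, D = fir_state_space(h)
u = np.random.default_rng(0).standard_normal(10)
x, y = np.zeros((A.shape[0], 1)), []
for ut in u:
    y.append((C @ x + D * ut).item())
    x = A @ x + B * ut
assert np.allclose(y, np.convolve(u, h)[:len(u)])      # matches the convolution
```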
This paper proposes a theoretical and computational framework for the training and robustness verification of implicit neural networks based on non-Euclidean contraction theory. The basic idea is to cast the robustness analysis of a neural network as a reachability problem, and to use (i) the $\ell_\infty$-norm input-output Lipschitz constant and (ii) the tight inclusion function of the network to over-approximate its reachable sets. First, for a given implicit neural network, we use $\ell_\infty$-matrix measures to propose sufficient conditions for its well-posedness, design an iterative algorithm to compute its fixed points, and provide upper bounds for its $\ell_\infty$-norm input-output Lipschitz constant. Second, we introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network. Moreover, we use the embedded network to design an iterative algorithm for computing upper bounds of the original system's tight inclusion function. Third, we use the upper bounds of the Lipschitz constants and the upper bounds of the tight inclusion functions to design two algorithms for the training and robustness verification of implicit neural networks. Finally, we apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models against models trained via existing approaches in the literature.
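Two ingredients above are compact enough to sketch: the $\ell_\infty$-matrix measure used in the well-posedness condition and an averaged fixed-point iteration. This is a hedged simplification with a ReLU activation; the step size rule is my reading of the non-Euclidean monotone operator literature, not a quote of the paper's algorithm.

```python
# A minimal sketch, assuming the implicit layer x = relu(A @ x + B @ u).
import numpy as np

def mu_inf(A):
    """l_inf matrix measure: max_i ( A_ii + sum_{j != i} |A_ij| )."""
    off_diag = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    return np.max(np.diag(A) + off_diag)

def fixed_point(A, B, u, iters=5000, tol=1e-10):
    # Averaged iteration; for mu_inf(A) < 1 this step size makes the
    # map an l_inf contraction (assumption: ReLU slopes lie in [0, 1]).
    alpha = min(1.0, 1.0 / (1.0 - np.diag(A).min()))
    x = np.zeros(A.shape[0])
    for _ in range(iters):
        x_next = (1 - alpha) * x + alpha * np.maximum(A @ x + B @ u, 0.0)
        if np.max(np.abs(x_next - x)) < tol:
            break
        x = x_next
    return x_next

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
A -= (mu_inf(A) - 0.9) * np.eye(6)       # shift diagonal so mu_inf(A) = 0.9
B, u = rng.standard_normal((6, 2)), np.array([0.3, -0.2])
print("mu_inf(A) =", mu_inf(A))
print("fixed point:", fixed_point(A, B, u))
```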
Many state-of-the-art adversarial training methods leverage upper bounds of the adversarial loss to provide security guarantees. Yet, these methods require computations at each training step that cannot be incorporated into the gradient for backpropagation. We introduce a new, more principled approach to adversarial training based on a closed-form solution of an upper bound of the adversarial loss, which can be trained effectively with backpropagation. This bound is facilitated by state-of-the-art tools from robust optimization. We derive two new methods with our approach. The first method (Approximated Robust Upper Bound, or aRUB) uses the first-order approximation of the network as well as basic tools from linear robust optimization to obtain an approximate upper bound of the adversarial loss that can be easily implemented. The second method (Robust Upper Bound, or RUB) computes an exact upper bound of the adversarial loss. Across a variety of tabular and vision data sets, we demonstrate the effectiveness of our more principled approach: RUB is substantially more robust than state-of-the-art methods for larger perturbations, while aRUB matches the performance of state-of-the-art methods for small perturbations. Moreover, both RUB and aRUB run faster than standard adversarial training (at the expense of an increase in memory). All code to reproduce the results can be found at https://github.com/kimvc7/trobustness.
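The first-order idea behind aRUB can be sketched in a few lines. The following is a hedged simplification (not the paper's exact construction; the model and data are placeholders): linearizing the loss around the input bounds the worst case over an $\ell_\infty$ ball of radius eps by the $\ell_1$ norm of the input gradient, since $\ell_1$ is the dual norm of $\ell_\infty$, and the resulting objective backpropagates normally.

```python
# A minimal sketch: loss(x + d) <= loss(x) + eps * ||grad_x loss||_1
# to first order, for all ||d||_inf <= eps.
import torch

def approx_robust_loss(model, loss_fn, x, y, eps):
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)                  # mean over the batch
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    # grad of the batch mean: summing |grad| equals the mean of the
    # per-sample l1 gradient norms.
    return loss + eps * grad.abs().sum()

model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
robust = approx_robust_loss(
    model, torch.nn.functional.cross_entropy, x, y, eps=0.1)
robust.backward()                                # trainable end to end
```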
Certified robustness is a desirable property of deep neural networks in safety-critical applications, and popular training algorithms can certify the robustness of a neural network by computing a global bound on its Lipschitz constant. However, such a bound is often loose: it tends to over-regularize the neural network and degrade its natural accuracy. A tighter Lipschitz bound may provide a better tradeoff between natural and certified accuracy, but is generally hard to compute exactly due to the non-convexity of the network. In this work, we propose an efficient and trainable \emph{local} Lipschitz upper bound by considering the interactions between the activation functions (e.g. ReLU) and the weight matrices. Specifically, when computing the induced norm of a weight matrix, we eliminate the corresponding rows and columns where the activation function is guaranteed to be constant in the neighborhood of each given data point, which provides a provably tighter bound than the global Lipschitz constant of the neural network. Our method can be used as a plug-in module to tighten the Lipschitz bound in many certifiable training algorithms. Furthermore, we propose to clip activation functions (e.g., ReLU and MaxMin) with a learnable upper threshold and a sparsity loss to assist the network in achieving an even tighter local Lipschitz bound. Experimentally, we show that our method consistently outperforms state-of-the-art methods in both clean and certified accuracy on the MNIST, CIFAR-10, and TinyImageNet datasets with various network architectures.
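A hedged sketch of the row-elimination step (a simplification; the paper additionally removes the matching columns of the following layer): given pre-activation upper bounds valid on the $\ell_\infty$ ball around a data point, rows whose ReLU output is provably constant are dropped before computing the induced norm.

```python
# A minimal sketch, assuming one layer z = relu(W @ x) and an l_inf ball
# of radius r around x0; rows with u_i <= 0 are constant (zero) there.
import numpy as np

def local_lipschitz_layer(W, u):
    W_loc = W[u > 0]                       # keep only rows that can vary
    return np.linalg.norm(W_loc, 2) if W_loc.size else 0.0

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 16))
x0, r = rng.standard_normal(16), 0.05
row_l1 = np.abs(W).sum(axis=1)             # Holder: |w_i . d| <= r ||w_i||_1
u = W @ x0 + r * row_l1                    # upper pre-activation bounds
print("local:", local_lipschitz_layer(W, u))
print("global:", np.linalg.norm(W, 2))     # the local bound is never larger
```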
Studying the sensitivity of neural networks to weight perturbations and its impact on model performance, including generalization and robustness, is an active research topic due to its implications for a wide range of machine learning tasks such as model compression, generalization gap assessment, and adversarial attacks. In this paper, we provide the first integral study and analysis of feedforward neural networks in terms of their robustness under weight perturbations and their generalization behavior under such perturbations. We further design a new theory-driven loss function for training generalizable and robust neural networks against weight perturbations. Empirical experiments are conducted to validate our theoretical analysis. Our results offer fundamental insights for characterizing the generalization and robustness of neural networks against weight perturbations.
While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but often followed by new, stronger attacks that defeat these defenses. Can we somehow end this arms race? In this work, we study this problem for neural networks with one hidden layer. We first propose a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value. Second, as this certificate is differentiable, we jointly optimize it with the network parameters, providing an adaptive regularizer that encourages robustness against all attacks. On MNIST, our approach produces a network and a certificate that no attack that perturbs each pixel by at most ε = 0.1 can cause more than 35% test error.
To rigorously certify the robustness of neural networks to adversarial perturbations, most state-of-the-art techniques rely on a triangle-shaped linear programming (LP) relaxation of the ReLU activation. While the LP relaxation is exact for a single neuron, recent results suggest that it faces an inherent "convex relaxation barrier" as additional activations are added, and as the attack budget is increased. In this paper, we propose a nonconvex relaxation of the ReLU, based on a low-rank restriction of a semidefinite programming (SDP) relaxation. We show that the nonconvex relaxation has a similar complexity to the LP relaxation, but enjoys improved tightness that is comparable to the much more expensive SDP relaxation. Despite nonconvexity, we prove that the verification problem satisfies constraint qualification, and therefore a Riemannian staircase approach is guaranteed to compute a near-globally optimal solution in polynomial time. Our experiments provide evidence that our nonconvex relaxation almost completely overcomes the "convex relaxation barrier" faced by the LP relaxation.
Contraction theory is an analytical tool for studying the differential dynamics of non-autonomous (i.e., time-varying) nonlinear systems under a contraction metric defined with a uniformly positive definite matrix, the existence of which results in a necessary and sufficient characterization of incremental exponential stability of multiple solution trajectories with respect to each other. By using a squared differential length as a Lyapunov-like function, its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality, indicating that many parallels can be drawn between well-known linear systems theory and contraction theory for nonlinear systems. Furthermore, contraction theory takes advantage of the superior robustness property of exponential stability used in conjunction with the comparison lemma. This yields much-needed safety and stability guarantees for neural network-based control and estimation schemes, without resorting to the more involved method of using uniform asymptotic stability for input-to-state stability. Such distinctive features permit the systematic construction of a contraction metric via convex optimization, thereby obtaining an explicit exponential bound on the distance between a time-varying target trajectory and solution trajectories perturbed externally due to disturbances and learning errors. The objective of this paper is therefore to present a tutorial overview of contraction theory and its advantages in the nonlinear stability analysis of deterministic and stochastic systems, with an emphasis on deriving formal robustness and stability guarantees for various learning-based and data-driven automatic control methods. In particular, we provide a detailed review of techniques for finding contraction metrics and associated control and estimation laws using deep neural networks.
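For reference, the core contraction condition underlying this overview admits a compact statement: a system $\dot{x} = f(x,t)$ is contracting with rate $\alpha > 0$ if there exists a uniformly positive definite metric $M(x,t)$ such that

\[
\dot{M} + M\,\frac{\partial f}{\partial x} + \Big(\frac{\partial f}{\partial x}\Big)^{\top} M \preceq -2\alpha M,
\]

in which case the distance between any two solution trajectories decays exponentially with rate $\alpha$.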
It is a highly desirable property for deep networks to be robust against small input changes. One popular way to achieve this property is by designing networks with a small Lipschitz constant. In this work, we propose a new technique for constructing such Lipschitz networks that has a number of desirable properties: it can be applied to any linear network layer (fully-connected or convolutional), it provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method. In fact, our technique is the first in the literature to achieve all of these properties simultaneously. Our main contribution is a rescaling-based weight matrix parametrization that guarantees each network layer has a Lipschitz constant of at most 1 and results in learned weight matrices that are close to orthogonal. Hence we call such layers almost-orthogonal Lipschitz (AOL). Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results on par with most existing methods. Yet, they are easier to implement and more broadly applicable, because they do not require computationally expensive matrix orthogonalization or inversion steps as part of the network architecture. We provide code at https://github.com/berndprach/AOL.
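The parametrization is simple enough to state in full. The sketch below (dimensions are arbitrary placeholders) applies the rescaling from the paper's main bound to a random fully connected matrix: scaling column $j$ of $P$ by $d_j = (\sum_i |P^\top P|_{ji})^{-1/2}$ yields a matrix with spectral norm at most 1.

```python
# A minimal sketch of the AOL rescaling; eps guards all-zero columns.
import numpy as np

def aol_rescale(P, eps=1e-12):
    G = np.abs(P.T @ P)
    d = 1.0 / np.sqrt(np.maximum(G.sum(axis=1), eps))
    return P * d                            # scales column j by d_j

rng = np.random.default_rng(0)
W = aol_rescale(rng.standard_normal((20, 30)))
print(np.linalg.norm(W, 2))                 # guaranteed <= 1
```

Because the rescaling is a cheap differentiable function of $P$, it can be recomputed at every training step, which is what makes the layer compatible with arbitrary objectives and optimizers.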
This paper is concerned with the training of neural networks (NNs) under semidefinite constraints. This type of training problem has recently gained popularity since semidefinite constraints can be used to verify interesting properties of NNs, including, e.g., upper bounds on the Lipschitz constant, which relates to the robustness of an NN, or the stability of dynamic systems with NN controllers. The semidefinite constraints used are based on sector constraints satisfied by the underlying activation functions. Unfortunately, one of the biggest bottlenecks of these new results is the computational effort required to incorporate the semidefinite constraints into the training of NNs, which limits their scalability to large NNs. We address this challenge by developing interior point methods for NN training that we implement using barrier functions for the semidefinite constraints. To efficiently compute the gradients of the barrier terms, we exploit the structure of the semidefinite constraints. In experiments, we demonstrate the superior efficiency of our training method over previous approaches, which allows us, for example, to use semidefinite constraints in the training of Wasserstein generative adversarial networks, where the discriminator must satisfy a Lipschitz condition.
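A hedged, toy sketch of the barrier idea (not the paper's implementation): here the semidefinite constraint is the simple Lipschitz-type LMI $\begin{bmatrix} \rho I & W^\top \\ W & I \end{bmatrix} \succ 0$, which holds iff $\|W\|_2^2 < \rho$, and it is enforced during training through a $-\log\det$ barrier. A full interior-point method would additionally schedule the barrier weight and safeguard strict feasibility, e.g. with a line search.

```python
# A minimal sketch: train W on a least-squares task while a log-det
# barrier keeps the LMI [[rho*I, W^T], [W, I]] positive definite.
import torch

def barrier(W, rho=1.0):
    n, m = W.shape
    M = torch.cat([
        torch.cat([rho * torch.eye(m), W.t()], dim=1),
        torch.cat([W, torch.eye(n)], dim=1),
    ], dim=0)
    return -torch.logdet(M)        # finite only while M is positive definite

x, y = torch.randn(64, 8), torch.randn(64, 5)
W = torch.nn.Parameter(0.1 * torch.randn(5, 8))   # strictly feasible start
opt = torch.optim.SGD([W], lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    loss = ((x @ W.t() - y) ** 2).mean() + 1e-2 * barrier(W)
    loss.backward()
    opt.step()
print(torch.linalg.matrix_norm(W, ord=2))         # stays below sqrt(rho) = 1
```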
We develop and study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence (AI) systems including deep learning neural networks. In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself. Such a stealth attack could be conducted by a mischievous, corrupt or disgruntled member of a software development team. It could also be made by those wishing to exploit a ``democratization of AI'' agenda, where network architectures and trained parameter sets are shared publicly. We develop a range of new implementable attack strategies with accompanying analysis, showing that with high probability a stealth attack can be made transparent, in the sense that system performance is unchanged on a fixed validation set which is unknown to the attacker, while evoking any desired output on a trigger input of interest. The attacker only needs to have estimates of the size of the validation set and the spread of the AI's relevant latent space. In the case of deep learning neural networks, we show that a one neuron attack is possible - a modification to the weights and bias associated with a single neuron - revealing a vulnerability arising from over-parameterization. We illustrate these concepts using state of the art architectures on two standard image data sets. Guided by the theory and computational results, we also propose strategies to guard against stealth attacks.
Gradient-based first-order convex optimization algorithms find widespread applicability in a variety of domains, including machine learning tasks. Motivated by recent advances in the fixed-time stability theory of continuous-time dynamical systems, we introduce a generalized framework for designing accelerated optimization algorithms with strong convergence guarantees that further extend to a subclass of non-convex functions. In particular, we introduce the \emph{GenFlow} algorithm and its momentum variant, which provably converge to the optimal solution of objective functions satisfying the Polyak-{\L}ojasiewicz (PL) inequality in fixed time. Moreover, for functions that admit non-degenerate saddle points, we show that for the proposed GenFlow algorithm the time required to evade these saddle points is uniformly bounded for all initial conditions. Finally, for strongly convex-strongly concave minimax problems whose optimal solution is a saddle point, a similar scheme is shown to arrive at the optimal solution, again in fixed time. The superior convergence properties of our algorithm are validated experimentally on a variety of benchmark datasets.
Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Current available methods of computing such a bound are either time-consuming or deliver low quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms (Fast-Lin, Fast-Lip) that are able to certify non-trivial lower bounds of minimum adversarial distortions. Experiments show that (1) our methods deliver bounds close to (the gap is 2-3X) exact minimum distortions found by Reluplex in small networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that there is no polynomial time algorithm that can approximately find the minimum ℓ1 adversarial distortion of a ReLU network with a 0.99 ln n approximation ratio unless NP=P, where n is the number of neurons in the network.
Deep neural networks can be fragile: small input perturbations may cause significant changes in the output. In this paper, we employ contraction theory to improve the robustness of neural ODEs (NODEs). A dynamical system is contractive if all solutions with different initial conditions converge to each other exponentially fast; as a consequence, perturbations of the initial conditions become less and less relevant over time. Since in NODEs the input data corresponds to the initial condition of the dynamical system, we show that contractivity can mitigate the effect of input perturbations. More precisely, inspired by NODEs with Hamiltonian dynamics, we propose a class of contractive Hamiltonian NODEs (CH-NODEs). By properly tuning a scalar parameter, CH-NODEs ensure contractivity by design and can be trained using standard backpropagation. Moreover, CH-NODEs enjoy built-in guarantees of non-exploding gradients, which ensures a well-behaved training process. Finally, we demonstrate the robustness of CH-NODEs on the MNIST image classification problem with noisy test data.
Recurrent neural networks are capable of learning the dynamics of an unknown nonlinear system purely from input-output measurements. However, the resulting models do not provide any stability guarantees on the input-output mapping. In this work, we represent a recurrent neural network as a linear time-invariant system with nonlinear disturbances. By introducing constraints on the parameters, we can guarantee finite gain stability and incremental finite gain stability. We apply this identification method to learn the motion of a four-degrees-of-freedom ship that is moving in open water and compare it against other purely learning-based approaches with unconstrained parameters. Our analysis shows that the constrained recurrent neural network has a lower prediction accuracy on the test set, but it achieves comparable results on an out-of-distribution set and respects stability conditions.
Deep Neural Networks (DNNs) training can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems and include several existing DNN architectures based on ordinary differential equations. Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth. This is obtained by proving that, using a semi-implicit Euler discretization scheme, the backward sensitivity matrices involved in gradient computations are symplectic. We also provide an upper-bound to the magnitude of sensitivity matrices and show that exploding gradients can be controlled through regularization. Finally, we enable distributed implementations of backward and forward propagation algorithms in H-DNNs by characterizing appropriate sparsity constraints on the weight matrices. The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
Lipschitz bound estimation is an effective way to regularize deep neural networks so as to make them robust against adversarial attacks. This is useful in a variety of applications ranging from reinforcement learning to autonomous systems. In this paper, we highlight the significant gap that remains in obtaining non-trivial Lipschitz bound certificates for convolutional neural networks (CNNs), and we support this empirically with extensive graphical analysis. We also show that CNNs can be converted into fully connected networks using unrolled convolutional layers, i.e. Toeplitz matrices. Furthermore, we propose a simple algorithm to exhibit the existing 20x-50x gap between the actual Lipschitz constant and the obtained bound for specific data distributions. We also run a thorough set of experiments on various network architectures and benchmark them on datasets such as MNIST and CIFAR-10. All these proposals are supported by extensive testing, graphs, histograms, and comparative analysis.
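The Toeplitz conversion is easy to reproduce. A hedged sketch for a single-channel 1D convolution (kernel and sizes are placeholders) follows; once the dense matrix is built, any matrix-based Lipschitz estimate, e.g. the spectral norm, applies verbatim.

```python
# A minimal sketch: a banded Toeplitz matrix T with
# T @ u == np.convolve(u, h, mode="valid").
import numpy as np

def conv1d_toeplitz(h, n_in):
    k = len(h)
    T = np.zeros((n_in - k + 1, n_in))
    for i in range(n_in - k + 1):
        T[i, i:i + k] = h[::-1]            # reversed taps on a sliding window
    return T

h = np.array([0.25, 0.5, 0.25])
T = conv1d_toeplitz(h, n_in=16)
u = np.random.default_rng(0).standard_normal(16)
assert np.allclose(T @ u, np.convolve(u, h, mode="valid"))
print("l2 Lipschitz bound of the conv layer:", np.linalg.norm(T, 2))
```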
Normalization techniques have become a basic component of modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improves robustness. For ConvNets, most existing methods are based on penalizing or normalizing weight matrices derived from concatenating or flattening the convolutional kernels. These methods often destroy or ignore the benign convolutional structure of the kernels; hence they are often expensive or impractical for deep ConvNets. In contrast, we introduce a simple and efficient "Convolutional Normalization" (ConvNorm) method that can fully exploit the convolutional structure in the Fourier domain and serve as a plug-and-play module that can be conveniently incorporated into any ConvNets. Our method is inspired by recent work on preconditioning methods for convolutional sparse coding and can effectively promote each layer's channel-wise isometry. Furthermore, we show that ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets. Applied to classification under noise corruptions and to generative adversarial networks (GANs), we show that ConvNorm improves the robustness of common ConvNets such as ResNet and the performance of GANs. We verify our findings via numerical experiments on CIFAR and ImageNet.
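A hedged, scalar-channel caricature of the Fourier-domain idea (the paper's method is a multi-channel preconditioner; this is only the one-channel intuition): a circular convolution is diagonalized by the DFT, so its singular values are the magnitudes of the kernel's DFT, and normalizing that spectrum to unit modulus makes the layer an exact isometry.

```python
# A minimal sketch: normalize a kernel's spectrum so the circular
# convolution has spectral norm exactly 1.
import numpy as np

def convnorm_1d(h, n):
    H = np.fft.fft(h, n)
    H_unit = H / np.maximum(np.abs(H), 1e-12)    # unit-modulus spectrum
    return np.real(np.fft.ifft(H_unit))          # normalized kernel, length n

g = convnorm_1d(np.array([1.0, 0.5, -0.25]), n=32)
u = np.random.default_rng(0).standard_normal(32)
y = np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(u)))   # circular conv
print(np.linalg.norm(y), np.linalg.norm(u))                # norms coincide
```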