无线系统应用中深度学习(DL)的成功出现引起了人们对与安全有关的新挑战的担忧。一个这样的安全挑战是对抗性攻击。尽管已经有很多工作证明了基于DL的分类任务对对抗性攻击的敏感性,但是从攻击的角度来看,尚未对无线系统的基于回归的问题进行基于回归的问题。本文的目的是双重的:(i)我们在无线设置中考虑回归问题,并表明对抗性攻击可以打破基于DL的方法,并且(ii)我们将对抗性训练作为对抗性环境中的防御技术的有效性分析并表明基于DL的无线系统对攻击的鲁棒性有了显着改善。具体而言,本文考虑的无线应用程序是基于DL的功率分配,以多细胞大量多输入 - 销售输出系统的下行链路分配,攻击的目的是通过DL模型产生不可行的解决方案。我们扩展了基于梯度的对抗性攻击:快速梯度标志方法(FGSM),动量迭代FGSM和预计的梯度下降方法,以分析具有和没有对抗性训练的考虑的无线应用的敏感性。我们对这些攻击进行了分析深度神经网络(DNN)模型的性能,在这些攻击中,使用白色框和黑盒攻击制作了对抗性扰动。
translated by 谷歌翻译
Deep neural networks are vulnerable to adversarial examples, which poses security concerns on these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate to evaluate the robustness of deep learning models before they are deployed. However, most of existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating the momentum term into the iterative process for attacks, our methods can stabilize update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates for black-box attacks, we apply momentum iterative algorithms to an ensemble of models, and show that the adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won the first places in NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
translated by 谷歌翻译
本文提出了对基于深度学习的无线信号分类器的信道感知对抗攻击。有一个发射器,发送具有不同调制类型的信号。每个接收器使用深神经网络以将其超空气接收信号分类为调制类型。与此同时,对手将对手扰动(受到电力预算的影响)透射到欺骗接收器,以在作为透射信号叠加和对抗扰动的叠加接收的分类信号中进行错误。首先,当在设计对抗扰动时不考虑通道时,这些逃避攻击被证明会失败。然后,通过考虑来自每个接收器的对手的频道效应来提出现实攻击。在示出频道感知攻击是选择性的(即,它只影响扰动设计中的信道中考虑的接收器),通过制作常见的对抗扰动来呈现广播对抗攻击,以在不同接收器处同时欺骗分类器。通过占通道,发射机输入和分类器模型可用的不同信息,将调制分类器对过空中侵犯攻击的主要脆弱性。最后,引入了基于随机平滑的经过认证的防御,即增加了噪声训练数据,使调制分类器鲁棒到对抗扰动。
translated by 谷歌翻译
许多最先进的ML模型在各种任务中具有优于图像分类的人类。具有如此出色的性能,ML模型今天被广泛使用。然而,存在对抗性攻击和数据中毒攻击的真正符合ML模型的稳健性。例如,Engstrom等人。证明了最先进的图像分类器可以容易地被任意图像上的小旋转欺骗。由于ML系统越来越纳入安全性和安全敏感的应用,对抗攻击和数据中毒攻击构成了相当大的威胁。本章侧重于ML安全的两个广泛和重要的领域:对抗攻击和数据中毒攻击。
translated by 谷歌翻译
对抗攻击使他们的成功取得了“愚弄”DNN等,基于梯度的算法成为一个主流。基于线性假设[12],在$ \ ell_ \ infty $约束下,在梯度上应用于渐变的$符号$操作是生成扰动的良好选择。然而,存在来自这种操作的副作用,因为它导致真实梯度与扰动之间的方向偏差。换句话说,当前方法包含真实梯度和实际噪声之间的间隙,这导致偏置和低效的攻击。因此,在理论上,基于泰勒膨胀,偏差地分析了$ \符号$,即快速梯度非符号法(FGNM)的校正。值得注意的是,FGNM是一般例程,它可以在基于梯度的攻击中无缝地更换传统的$符号$操作,以可忽略的额外计算成本。广泛的实验证明了我们方法的有效性。具体来说,我们的大多数和\ textBF {27.5 \%}平均突出了它们,平均而言。我们的匿名代码是公开可用的:\ url {https://git.io/mm -fgnm}。
translated by 谷歌翻译
Deep neural networks (DNNs) are one of the most prominent technologies of our time, as they achieve state-of-the-art performance in many machine learning tasks, including but not limited to image classification, text mining, and speech processing. However, recent research on DNNs has indicated ever-increasing concern on the robustness to adversarial examples, especially for security-critical tasks such as traffic sign identification for autonomous driving. Studies have unveiled the vulnerability of a well-trained DNN by demonstrating the ability of generating barely noticeable (to both human and machines) adversarial images that lead to misclassification. Furthermore, researchers have shown that these adversarial images are highly transferable by simply training and attacking a substitute model built upon the target model, known as a black-box attack to DNNs.Similar to the setting of training substitute models, in this paper we propose an effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN. However, different from leveraging attack transferability from substitute models, we propose zeroth order optimization (ZOO) based attacks to directly estimate the gradients of the targeted DNN for generating adversarial examples. We use zeroth order stochastic coordinate descent along with dimension reduction, hierarchical attack and importance sampling techniques to * Pin-Yu Chen and Huan Zhang contribute equally to this work.
translated by 谷歌翻译
Adversarial examples are perturbed inputs designed to fool machine learning models. Adversarial training injects such examples into training data to increase robustness. To scale this technique to large datasets, perturbations are crafted using fast single-step methods that maximize a linear approximation of the model's loss. We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss. The model thus learns to generate weak perturbations, rather than defend against strong ones. As a result, we find that adversarial training remains vulnerable to black-box attacks, where we transfer perturbations computed on undefended models, as well as to a powerful novel single-step attack that escapes the non-smooth vicinity of the input data via a small random step. We further introduce Ensemble Adversarial Training, a technique that augments training data with perturbations transferred from other models. On ImageNet, Ensemble Adversarial Training yields models with stronger robustness to blackbox attacks. In particular, our most robust model won the first round of the NIPS 2017 competition on Defenses against Adversarial Attacks (Kurakin et al., 2017c). However, subsequent work found that more elaborate black-box attacks could significantly enhance transferability and reduce the accuracy of our models.
translated by 谷歌翻译
最近对机器学习(ML)模型的攻击,例如逃避攻击,具有对抗性示例,并通过提取攻击窃取了一些模型,构成了几种安全性和隐私威胁。先前的工作建议使用对抗性训练从对抗性示例中保护模型,以逃避模型的分类并恶化其性能。但是,这种保护技术会影响模型的决策边界及其预测概率,因此可能会增加模型隐私风险。实际上,仅使用对模型预测输出的查询访问的恶意用户可以提取它并获得高智能和高保真替代模型。为了更大的提取,这些攻击利用了受害者模型的预测概率。实际上,所有先前关于提取攻击的工作都没有考虑到出于安全目的的培训过程中的变化。在本文中,我们提出了一个框架,以评估具有视觉数据集对对抗训练的模型的提取攻击。据我们所知,我们的工作是第一个进行此类评估的工作。通过一项广泛的实证研究,我们证明了受对抗训练的模型比在自然训练情况下获得的模型更容易受到提取攻击的影响。他们可以达到高达$ \ times1.2 $更高的准确性和同意,而疑问低于$ \ times0.75 $。我们还发现,与从自然训练的(即标准)模型中提取的DNN相比,从鲁棒模型中提取的对抗性鲁棒性能力可通过提取攻击(即从鲁棒模型提取的深神经网络(DNN)提取的深神网络(DNN))传递。
translated by 谷歌翻译
与此同时,黑匣子对抗攻击已经吸引了令人印象深刻的注意,在深度学习安全领域的实际应用,同时,由于无法访问目标模型的网络架构或内部权重,非常具有挑战性。基于假设:如果一个例子对多种型号保持过逆势,那么它更有可能将攻击能力转移到其他模型,基于集合的对抗攻击方法是高效的,用于黑匣子攻击。然而,集合攻击的方式相当不那么调查,并且现有的集合攻击只是均匀地融合所有型号的输出。在这项工作中,我们将迭代集合攻击视为随机梯度下降优化过程,其中不同模型上梯度的变化可能导致众多局部Optima差。为此,我们提出了一种新的攻击方法,称为随机方差减少了整体(SVRE)攻击,这可以降低集合模型的梯度方差,并充分利用集合攻击。标准想象数据集的经验结果表明,所提出的方法可以提高对抗性可转移性,并且优于现有的集合攻击显着。
translated by 谷歌翻译
With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have been recently found vulnerable to well-designed input samples, called adversarial examples. Adversarial perturbations are imperceptible to human but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples becomes one of the major risks for applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples. In addition, three major challenges in adversarial examples and the potential solutions are discussed.
translated by 谷歌翻译
已知深度神经网络(DNN)容易受到用不可察觉的扰动制作的对抗性示例的影响,即,输入图像的微小变化会引起错误的分类,从而威胁着基于深度学习的部署系统的可靠性。经常采用对抗训练(AT)来通过训练损坏和干净的数据的混合物来提高DNN的鲁棒性。但是,大多数基于AT的方法在处理\ textit {转移的对抗示例}方面是无效的,这些方法是生成以欺骗各种防御模型的生成的,因此无法满足现实情况下提出的概括要求。此外,对抗性训练一般的国防模型不能对具有扰动的输入产生可解释的预测,而不同的领域专家则需要一个高度可解释的强大模型才能了解DNN的行为。在这项工作中,我们提出了一种基于Jacobian规范和选择性输入梯度正则化(J-SIGR)的方法,该方法通过Jacobian归一化提出了线性化的鲁棒性,还将基于扰动的显着性图正规化,以模仿模型的可解释预测。因此,我们既可以提高DNN的防御能力和高解释性。最后,我们评估了跨不同体系结构的方法,以针对强大的对抗性攻击。实验表明,提出的J-Sigr赋予了针对转移的对抗攻击的鲁棒性,我们还表明,来自神经网络的预测易于解释。
translated by 谷歌翻译
Though CNNs have achieved the state-of-the-art performance on various vision tasks, they are vulnerable to adversarial examples -crafted by adding human-imperceptible perturbations to clean images. However, most of the existing adversarial attacks only achieve relatively low success rates under the challenging black-box setting, where the attackers have no knowledge of the model structure and parameters. To this end, we propose to improve the transferability of adversarial examples by creating diverse input patterns. Instead of only using the original images to generate adversarial examples, our method applies random transformations to the input images at each iteration. Extensive experiments on ImageNet show that the proposed attack method can generate adversarial examples that transfer much better to different networks than existing baselines. By evaluating our method against top defense solutions and official baselines from NIPS 2017 adversarial competition, the enhanced attack reaches an average success rate of 73.0%, which outperforms the top-1 attack submission in the NIPS competition by a large margin of 6.6%. We hope that our proposed attack strategy can serve as a strong benchmark baseline for evaluating the robustness of networks to adversaries and the effectiveness of different defense methods in the future. Code is available at https: //github.com/cihangxie/DI-2-FGSM .
translated by 谷歌翻译
基于深度神经网络(DNN)的智能信息(IOT)系统已被广泛部署在现实世界中。然而,发现DNNS易受对抗性示例的影响,这提高了人们对智能物联网系统的可靠性和安全性的担忧。测试和评估IOT系统的稳健性成为必要和必要。最近已经提出了各种攻击和策略,但效率问题仍未纠正。现有方法是计算地广泛或耗时,这在实践中不适用。在本文中,我们提出了一种称为攻击启发GaN(AI-GaN)的新框架,在有条件地产生对抗性实例。曾经接受过培训,可以有效地给予对抗扰动的输入图像和目标类。我们在白盒设置的不同数据集中应用AI-GaN,黑匣子设置和由最先进的防御保护的目标模型。通过广泛的实验,AI-GaN实现了高攻击成功率,优于现有方法,并显着降低了生成时间。此外,首次,AI-GaN成功地缩放到复杂的数据集。 Cifar-100和Imagenet,所有课程中的成功率约为90美元。
translated by 谷歌翻译
在过去的几十年中,人工智能的兴起使我们有能力解决日常生活中最具挑战性的问题,例如癌症的预测和自主航行。但是,如果不保护对抗性攻击,这些应用程序可能不会可靠。此外,最近的作品表明,某些对抗性示例可以在不同的模型中转移。因此,至关重要的是避免通过抵抗对抗性操纵的强大模型进行这种可传递性。在本文中,我们提出了一种基于特征随机化的方法,该方法抵抗了八次针对测试阶段深度学习模型的对抗性攻击。我们的新方法包括改变目标网络分类器中的训练策略并选择随机特征样本。我们认为攻击者具有有限的知识和半知识条件,以进行最普遍的对抗性攻击。我们使用包括现实和合成攻击的众所周知的UNSW-NB15数据集评估了方法的鲁棒性。之后,我们证明我们的策略优于现有的最新方法,例如最强大的攻击,包括针对特定的对抗性攻击进行微调网络模型。最后,我们的实验结果表明,我们的方法可以确保目标网络并抵抗对抗性攻击的转移性超过60%。
translated by 谷歌翻译
时间序列数据在许多现实世界中(例如,移动健康)和深神经网络(DNNS)中产生,在解决它们方面已取得了巨大的成功。尽管他们成功了,但对他们对对抗性攻击的稳健性知之甚少。在本文中,我们提出了一个通过统计特征(TSA-STAT)}称为时间序列攻击的新型对抗框架}。为了解决时间序列域的独特挑战,TSA-STAT对时间序列数据的统计特征采取限制来构建对抗性示例。优化的多项式转换用于创建比基于加性扰动的攻击(就成功欺骗DNN而言)更有效的攻击。我们还提供有关构建对抗性示例的统计功能规范的认证界限。我们对各种现实世界基准数据集的实验表明,TSA-STAT在欺骗DNN的时间序列域和改善其稳健性方面的有效性。 TSA-STAT算法的源代码可在https://github.com/tahabelkhouja/time-series-series-attacks-via-statity-features上获得
translated by 谷歌翻译
尽管机器学习系统的效率和可扩展性,但最近的研究表明,许多分类方法,尤其是深神经网络(DNN),易受对抗的例子;即,仔细制作欺骗训练有素的分类模型的例子,同时无法区分从自然数据到人类。这使得在安全关键区域中应用DNN或相关方法可能不安全。由于这个问题是由Biggio等人确定的。 (2013)和Szegedy等人。(2014年),在这一领域已经完成了很多工作,包括开发攻击方法,以产生对抗的例子和防御技术的构建防范这些例子。本文旨在向统计界介绍这一主题及其最新发展,主要关注对抗性示例的产生和保护。在数值实验中使用的计算代码(在Python和R)公开可用于读者探讨调查的方法。本文希望提交人们将鼓励更多统计学人员在这种重要的令人兴奋的领域的产生和捍卫对抗的例子。
translated by 谷歌翻译
有必要提高某些特殊班级的表现,或者特别保护它们免受对抗学习的攻击。本文提出了一个将成本敏感分类和对抗性学习结合在一起的框架,以训练可以区分受保护和未受保护的类的模型,以使受保护的类别不太容易受到对抗性示例的影响。在此框架中,我们发现在训练深神经网络(称为Min-Max属性)期间,一个有趣的现象,即卷积层中大多数参数的绝对值。基于这种最小的最大属性,该属性是在随机分布的角度制定和分析的,我们进一步建立了一个针对对抗性示例的新防御模型,以改善对抗性鲁棒性。构建模型的一个优点是,它的性能比标准模型更好,并且可以与对抗性训练相结合,以提高性能。在实验上证实,对于所有类别的平均准确性,我们的模型在没有发生攻击时几乎与现有模型一样,并且在发生攻击时比现有模型更好。具体而言,关于受保护类的准确性,提议的模型比发生攻击时的现有模型要好得多。
translated by 谷歌翻译
Video compression plays a crucial role in video streaming and classification systems by maximizing the end-user quality of experience (QoE) at a given bandwidth budget. In this paper, we conduct the first systematic study for adversarial attacks on deep learning-based video compression and downstream classification systems. Our attack framework, dubbed RoVISQ, manipulates the Rate-Distortion ($\textit{R}$-$\textit{D}$) relationship of a video compression model to achieve one or both of the following goals: (1) increasing the network bandwidth, (2) degrading the video quality for end-users. We further devise new objectives for targeted and untargeted attacks to a downstream video classification service. Finally, we design an input-invariant perturbation that universally disrupts video compression and classification systems in real time. Unlike previously proposed attacks on video classification, our adversarial perturbations are the first to withstand compression. We empirically show the resilience of RoVISQ attacks against various defenses, i.e., adversarial training, video denoising, and JPEG compression. Our extensive experimental results on various video datasets show RoVISQ attacks deteriorate peak signal-to-noise ratio by up to 5.6dB and the bit-rate by up to $\sim$ 2.4$\times$ while achieving over 90$\%$ attack success rate on a downstream classifier. Our user study further demonstrates the effect of RoVISQ attacks on users' QoE.
translated by 谷歌翻译
Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples-inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. 1
translated by 谷歌翻译
时间序列异常检测在统计,经济学和计算机科学中进行了广泛的研究。多年来,使用基于深度学习的方法为时间序列异常检测提出了许多方法。这些方法中的许多方法都在基准数据集上显示了最先进的性能,给人一种错误的印象,即这些系统在许多实用和工业现实世界中都可以强大且可部署。在本文中,我们证明了最先进的异常检测方法的性能通过仅在传感器数据中添加小的对抗扰动来实质性地降解。我们使用不同的评分指标,例如预测错误,异常和分类评分,包括几个公共和私人数据集,从航空航天应用程序,服务器机器到发电厂的网络物理系统。在众所周知的对抗攻击中,来自快速梯度标志方法(FGSM)和预计梯度下降(PGD)方法,我们证明了最新的深神经网络(DNNS)和图形神经网络(GNNS)方法,这些方法声称这些方法是要对异常进行稳健,并且可能已集成在现实生活中,其性能下降到低至0%。据我们最好的理解,我们首次证明了针对对抗攻击的异常检测系统的脆弱性。这项研究的总体目标是提高对时间序列异常检测器的对抗性脆弱性的认识。
translated by 谷歌翻译