Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visually similar while preserving the privacy of the input images. Several efforts from industry and academia propose to conduct content scanning on client devices such as smartphones due to the impending roll out of end-to-end encryption that will make server-side content scanning difficult. However, these proposals have met with strong criticism because of the potential for the technology to be misused and re-purposed. Our work informs this conversation by experimentally characterizing the potential for one type of misuse -- attackers manipulating the content scanning system to perform physical surveillance on target locations. Our contributions are threefold: (1) we offer a definition of physical surveillance in the context of client-side image scanning systems; (2) we experimentally characterize this risk and create a surveillance algorithm that achieves physical surveillance rates of >40% by poisoning 5% of the perceptual hash database; (3) we experimentally study the trade-off between the robustness of client-side image scanning systems and surveillance, showing that more robust detection of illegal material leads to increased potential for physical surveillance.
translated by 谷歌翻译
Apple最近透露了它的深度感知散列系统的神经枢纽,以检测文件在文件上传到其iCloud服务之前的用户设备上的儿童性滥用材料(CSAM)。关于保护用户隐私和系统可靠性的公众批评很快就会出现。本文基于神经枢纽的深度感知哈希展示了第一综合实证分析。具体而言,我们表明当前深度感知散列可能不具有稳健性。对手可以通过应用图像的略微变化来操纵散列值,或者通过基于梯度的方法或简单地执行标准图像转换,强制或预防哈希冲突来操纵。这种攻击允许恶意演员轻松利用检测系统:从隐藏滥用材料到框架无辜的用户,一切都是可能的。此外,使用散列值,仍然可以对存储在用户设备上的数据进行推断。在我们的观点中,根据我们的结果,其目前形式的深度感知散列通常不适用于强大的客户端扫描,不应从隐私角度使用。
translated by 谷歌翻译
已知深度学习系统容易受到对抗例子的影响。特别是,基于查询的黑框攻击不需要深入学习模型的知识,而可以通过提交查询和检查收益来计算网络上的对抗示例。最近的工作在很大程度上提高了这些攻击的效率,证明了它们在当今的ML-AS-A-Service平台上的实用性。我们提出了Blacklight,这是针对基于查询的黑盒对抗攻击的新防御。推动我们设计的基本见解是,为了计算对抗性示例,这些攻击在网络上进行了迭代优化,从而在输入空间中产生了非常相似的图像查询。 Blacklight使用在概率内容指纹上运行的有效相似性引擎来检测高度相似的查询来检测基于查询的黑盒攻击。我们根据各种模型和图像分类任务对八次最先进的攻击进行评估。 Blacklight通常只有几次查询后,都可以识别所有这些。通过拒绝所有检测到的查询,即使攻击者在帐户禁令或查询拒绝之后持续提交查询,Blacklight也可以防止任何攻击完成。 Blacklight在几个强大的对策中也很强大,包括最佳的黑盒攻击,该攻击近似于效率的白色框攻击。最后,我们说明了黑光如何推广到其他域,例如文本分类。
translated by 谷歌翻译
Although Deep Neural Networks (DNNs) have achieved impressive results in computer vision, their exposed vulnerability to adversarial attacks remains a serious concern. A series of works has shown that by adding elaborate perturbations to images, DNNs could have catastrophic degradation in performance metrics. And this phenomenon does not only exist in the digital space but also in the physical space. Therefore, estimating the security of these DNNs-based systems is critical for safely deploying them in the real world, especially for security-critical applications, e.g., autonomous cars, video surveillance, and medical diagnosis. In this paper, we focus on physical adversarial attacks and provide a comprehensive survey of over 150 existing papers. We first clarify the concept of the physical adversarial attack and analyze its characteristics. Then, we define the adversarial medium, essential to perform attacks in the physical world. Next, we present the physical adversarial attack methods in task order: classification, detection, and re-identification, and introduce their performance in solving the trilemma: effectiveness, stealthiness, and robustness. In the end, we discuss the current challenges and potential future directions.
translated by 谷歌翻译
深度神经网络容易受到来自对抗性投入的攻击,并且最近,特洛伊木马误解或劫持模型的决定。我们通过探索有界抗逆性示例空间和生成的对抗网络内的自然输入空间来揭示有界面的对抗性实例 - 通用自然主义侵害贴片的兴趣类 - 我们呼叫TNT。现在,一个对手可以用一个自然主义的补丁来手臂自己,不太恶意,身体上可实现,高效 - 实现高攻击成功率和普遍性。 TNT是普遍的,因为在场景中的TNT中捕获的任何输入图像都将:i)误导网络(未确定的攻击);或ii)迫使网络进行恶意决定(有针对性的攻击)。现在,有趣的是,一个对抗性补丁攻击者有可能发挥更大的控制水平 - 选择一个独立,自然的贴片的能力,与被限制为嘈杂的扰动的触发器 - 到目前为止只有可能与特洛伊木马攻击方法有可能干扰模型建设过程,以嵌入风险发现的后门;但是,仍然意识到在物理世界中部署的补丁。通过对大型视觉分类任务的广泛实验,想象成在其整个验证集50,000张图像中进行评估,我们展示了TNT的现实威胁和攻击的稳健性。我们展示了攻击的概括,以创建比现有最先进的方法实现更高攻击成功率的补丁。我们的结果表明,攻击对不同的视觉分类任务(CIFAR-10,GTSRB,PUBFIG)和多个最先进的深神经网络,如WieredEnet50,Inception-V3和VGG-16。
translated by 谷歌翻译
Recent studies show that the state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples, resulting from small-magnitude perturbations added to the input. Given that that emerging physical systems are using DNNs in safety-critical situations, adversarial examples could mislead these systems and cause dangerous situations. Therefore, understanding adversarial examples in the physical world is an important step towards developing resilient learning algorithms. We propose a general attack algorithm, Robust Physical Perturbations (RP 2 ), to generate robust visual adversarial perturbations under different physical conditions. Using the real-world case of road sign classification, we show that adversarial examples generated using RP 2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including viewpoints. Due to the current lack of a standardized testing method, we propose a two-stage evaluation methodology for robust physical adversarial examples consisting of lab and field tests. Using this methodology, we evaluate the efficacy of physical adversarial manipulations on real objects. With a perturbation in the form of only black and white stickers, we attack a real stop sign, causing targeted misclassification in 100% of the images obtained in lab settings, and in 84.8% of the captured video frames obtained on a moving vehicle (field test) for the target classifier.
translated by 谷歌翻译
在过去的十年中,深度学习急剧改变了传统的手工艺特征方式,具有强大的功能学习能力,从而极大地改善了传统任务。然而,最近已经证明了深层神经网络容易受到对抗性例子的影响,这种恶意样本由小型设计的噪音制作,误导了DNNs做出错误的决定,同时仍然对人类无法察觉。对抗性示例可以分为数字对抗攻击和物理对抗攻击。数字对抗攻击主要是在实验室环境中进行的,重点是改善对抗性攻击算法的性能。相比之下,物理对抗性攻击集中于攻击物理世界部署的DNN系统,这是由于复杂的物理环境(即亮度,遮挡等),这是一项更具挑战性的任务。尽管数字对抗和物理对抗性示例之间的差异很小,但物理对抗示例具有特定的设计,可以克服复杂的物理环境的效果。在本文中,我们回顾了基于DNN的计算机视觉任务任务中的物理对抗攻击的开发,包括图像识别任务,对象检测任务和语义细分。为了完整的算法演化,我们将简要介绍不涉及身体对抗性攻击的作品。我们首先提出一个分类方案,以总结当前的物理对抗攻击。然后讨论现有的物理对抗攻击的优势和缺点,并专注于用于维持对抗性的技术,当应用于物理环境中时。最后,我们指出要解决的当前身体对抗攻击的问题并提供有前途的研究方向。
translated by 谷歌翻译
Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model's prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives.This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.
translated by 谷歌翻译
近年来政府和商业实体的面部识别(FR)技术的快速采用提出了对公民自由和隐私的担忧。作为回应,已经开发了一套广泛的所谓“反面部识别”(AFR)工具,以帮助用户避免不需要的面部识别。在过去几年中提出的一组AFR工具是广泛的,快速发展,需要退回措施,以考虑AFR系统的更广泛的设计空间和长期挑战。本文旨在填补该差距,并提供对AFR研究景观的第一次综合分析。使用FR系统的运营级作为起点,我们创建了一个系统框架,用于分析不同AFR方法的益处和权衡。然后,我们考虑到AFR工具面临的技术和社会挑战,并提出在该领域的未来研究方向。
translated by 谷歌翻译
与令人印象深刻的进步触动了我们社会的各个方面,基于深度神经网络(DNN)的AI技术正在带来越来越多的安全问题。虽然在考试时间运行的攻击垄断了研究人员的初始关注,但是通过干扰培训过程来利用破坏DNN模型的可能性,代表了破坏训练过程的可能性,这是破坏AI技术的可靠性的进一步严重威胁。在后门攻击中,攻击者损坏了培训数据,以便在测试时间诱导错误的行为。然而,测试时间误差仅在存在与正确制作的输入样本对应的触发事件的情况下被激活。通过这种方式,损坏的网络继续正常输入的预期工作,并且只有当攻击者决定激活网络内隐藏的后门时,才会发生恶意行为。在过去几年中,后门攻击一直是强烈的研究活动的主题,重点是新的攻击阶段的发展,以及可能对策的提议。此概述文件的目标是审查发表的作品,直到现在,分类到目前为止提出的不同类型的攻击和防御。指导分析的分类基于攻击者对培训过程的控制量,以及防御者验证用于培训的数据的完整性,并监控DNN在培训和测试中的操作时间。因此,拟议的分析特别适合于参考他们在运营的应用方案的攻击和防御的强度和弱点。
translated by 谷歌翻译
计算能力和大型培训数据集的可用性增加,机器学习的成功助长了。假设它充分代表了在测试时遇到的数据,则使用培训数据来学习新模型或更新现有模型。这种假设受到中毒威胁的挑战,这种攻击会操纵训练数据,以损害模型在测试时的表现。尽管中毒已被认为是行业应用中的相关威胁,到目前为止,已经提出了各种不同的攻击和防御措施,但对该领域的完整系统化和批判性审查仍然缺失。在这项调查中,我们在机器学习中提供了中毒攻击和防御措施的全面系统化,审查了过去15年中该领域发表的100多篇论文。我们首先对当前的威胁模型和攻击进行分类,然后相应地组织现有防御。虽然我们主要关注计算机视觉应用程序,但我们认为我们的系统化还包括其他数据模式的最新攻击和防御。最后,我们讨论了中毒研究的现有资源,并阐明了当前的局限性和该研究领域的开放研究问题。
translated by 谷歌翻译
在对抗机器学习中,防止对深度学习系统的攻击的新防御能力在释放更强大的攻击后不久就会破坏。在这种情况下,法医工具可以通过追溯成功的根本原因来为现有防御措施提供宝贵的补充,并为缓解措施提供前进的途径,以防止将来采取类似的攻击。在本文中,我们描述了我们为开发用于深度神经网络毒物攻击的法医追溯工具的努力。我们提出了一种新型的迭代聚类和修剪解决方案,该解决方案修剪了“无辜”训练样本,直到所有剩余的是一组造成攻击的中毒数据。我们的方法群群训练样本基于它们对模型参数的影响,然后使用有效的数据解读方法来修剪无辜簇。我们从经验上证明了系统对三种类型的肮脏标签(后门)毒物攻击和三种类型的清洁标签毒药攻击的功效,这些毒物跨越了计算机视觉和恶意软件分类。我们的系统在所有攻击中都达到了98.4%的精度和96.8%的召回。我们还表明,我们的系统与专门攻击它的四种抗纤维法措施相对强大。
translated by 谷歌翻译
对象检测是各种关键计算机视觉任务的基础,例如分割,对象跟踪和事件检测。要以令人满意的精度训练对象探测器,需要大量数据。但是,由于注释大型数据集涉及大量劳动力,这种数据策展任务通常被外包给第三方或依靠志愿者。这项工作揭示了此类数据策展管道的严重脆弱性。我们提出MACAB,即使数据策展人可以手动审核图像,也可以将干净的图像制作清洁的图像将后门浸入对象探测器中。我们观察到,当后门被不明确的天然物理触发器激活时,在野外实现了错误分类和披肩的后门效应。与带有清洁标签的现有图像分类任务相比,带有清洁通道的非分类对象检测具有挑战性,这是由于每个帧内有多个对象的复杂性,包括受害者和非视野性对象。通过建设性地滥用深度学习框架使用的图像尺度函数,II结合了所提出的对抗性清洁图像复制技术,以及在考虑到毒品数据选择标准的情况下,通过建设性地滥用图像尺度尺度,可以确保MACAB的功效。广泛的实验表明,在各种现实世界中,MacAB在90%的攻击成功率中表现出超过90%的攻击成功率。这包括披肩和错误分类后门效应,甚至限制了较小的攻击预算。最先进的检测技术无法有效地识别中毒样品。全面的视频演示位于https://youtu.be/ma7l_lpxkp4上,该演示基于yolov4倒置的毒药率为0.14%,yolov4 clokaking后门和更快的速度R-CNN错误分类后门。
translated by 谷歌翻译
Adaptive attacks have (rightfully) become the de facto standard for evaluating defenses to adversarial examples. We find, however, that typical adaptive evaluations are incomplete. We demonstrate that thirteen defenses recently published at ICLR, ICML and NeurIPS-and which illustrate a diverse set of defense strategies-can be circumvented despite attempting to perform evaluations using adaptive attacks. While prior evaluation papers focused mainly on the end result-showing that a defense was ineffective-this paper focuses on laying out the methodology and the approach necessary to perform an adaptive attack. Some of our attack strategies are generalizable, but no single strategy would have been sufficient for all defenses. This underlines our key message that adaptive attacks cannot be automated and always require careful and appropriate tuning to a given defense. We hope that these analyses will serve as guidance on how to properly perform adaptive attacks against defenses to adversarial examples, and thus will allow the community to make further progress in building more robust models.
translated by 谷歌翻译
Learning-based pattern classifiers, including deep networks, have shown impressive performance in several application domains, ranging from computer vision to cybersecurity. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to such wild patterns (also referred to as adversarial examples), along with the design of suitable countermeasures, have been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this research area over the last ten years and beyond, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed to understand the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks. We report interesting connections between these apparently-different lines of work, highlighting common misconceptions related to the security evaluation of machine-learning algorithms. We review the main threat models and attacks defined to this end, and discuss the main limitations of current work, along with the corresponding future challenges towards the design of more secure learning algorithms.
translated by 谷歌翻译
许多最先进的ML模型在各种任务中具有优于图像分类的人类。具有如此出色的性能,ML模型今天被广泛使用。然而,存在对抗性攻击和数据中毒攻击的真正符合ML模型的稳健性。例如,Engstrom等人。证明了最先进的图像分类器可以容易地被任意图像上的小旋转欺骗。由于ML系统越来越纳入安全性和安全敏感的应用,对抗攻击和数据中毒攻击构成了相当大的威胁。本章侧重于ML安全的两个广泛和重要的领域:对抗攻击和数据中毒攻击。
translated by 谷歌翻译
Recent years have seen a proliferation of research on adversarial machine learning. Numerous papers demonstrate powerful algorithmic attacks against a wide variety of machine learning (ML) models, and numerous other papers propose defenses that can withstand most attacks. However, abundant real-world evidence suggests that actual attackers use simple tactics to subvert ML-driven systems, and as a result security practitioners have not prioritized adversarial ML defenses. Motivated by the apparent gap between researchers and practitioners, this position paper aims to bridge the two domains. We first present three real-world case studies from which we can glean practical insights unknown or neglected in research. Next we analyze all adversarial ML papers recently published in top security conferences, highlighting positive trends and blind spots. Finally, we state positions on precise and cost-driven threat modeling, collaboration between industry and academia, and reproducible research. We believe that our positions, if adopted, will increase the real-world impact of future endeavours in adversarial ML, bringing both researchers and practitioners closer to their shared goal of improving the security of ML systems.
translated by 谷歌翻译
最近的作品表明,深度学习模型容易受到后门中毒攻击的影响,在这些攻击中,这些攻击灌输了与外部触发模式或物体(例如贴纸,太阳镜等)的虚假相关性。我们发现这种外部触发信号是不必要的,因为可以使用基于旋转的图像转换轻松插入高效的后门。我们的方法通过旋转有限数量的对象并将其标记错误来构建中毒数据集;一旦接受过培训,受害者的模型将在运行时间推理期间做出不良的预测。它表现出明显的攻击成功率,同时通过有关图像分类和对象检测任务的全面实证研究来保持清洁绩效。此外,我们评估了标准数据增强技术和针对我们的攻击的四种不同的后门防御措施,发现它们都无法作为一致的缓解方法。正如我们在图像分类和对象检测应用程序中所示,我们的攻击只能在现实世界中轻松部署在现实世界中。总体而言,我们的工作突出了一个新的,简单的,物理上可实现的,高效的矢量,用于后门攻击。我们的视频演示可在https://youtu.be/6jif8wnx34m上找到。
translated by 谷歌翻译
Video compression plays a crucial role in video streaming and classification systems by maximizing the end-user quality of experience (QoE) at a given bandwidth budget. In this paper, we conduct the first systematic study for adversarial attacks on deep learning-based video compression and downstream classification systems. Our attack framework, dubbed RoVISQ, manipulates the Rate-Distortion ($\textit{R}$-$\textit{D}$) relationship of a video compression model to achieve one or both of the following goals: (1) increasing the network bandwidth, (2) degrading the video quality for end-users. We further devise new objectives for targeted and untargeted attacks to a downstream video classification service. Finally, we design an input-invariant perturbation that universally disrupts video compression and classification systems in real time. Unlike previously proposed attacks on video classification, our adversarial perturbations are the first to withstand compression. We empirically show the resilience of RoVISQ attacks against various defenses, i.e., adversarial training, video denoising, and JPEG compression. Our extensive experimental results on various video datasets show RoVISQ attacks deteriorate peak signal-to-noise ratio by up to 5.6dB and the bit-rate by up to $\sim$ 2.4$\times$ while achieving over 90$\%$ attack success rate on a downstream classifier. Our user study further demonstrates the effect of RoVISQ attacks on users' QoE.
translated by 谷歌翻译
本文审查了视觉隐私保护技术中最先进的技术,特别注意适用于主动和辅助生活领域的技术(aal)。介绍了一种新的分类,可以归类最先进的视觉隐私保护方法。突出显示了传教性的感知混淆方法,是分类学中的一个类别。这些是一类视觉隐私保存技术,特别是在考虑基于视频的AAL监控的情况时特别相关。还探讨了对机器学习模型的混淆。设计的不同隐私层面的高级分类方案与视觉隐私保存技术的拟议分类有关。最后,我们注意到现场存在的开放问题,并将读者介绍给一些令人兴奋的途径,以便在视觉隐私区域的未来研究。
translated by 谷歌翻译