Smart Paper Notes

Look Closer to Your Enemy: Learning to Attack via Teacher-student Mimicking

Mingejie Wang , Zhiqing Tang , Sirui Li , Dingwen Xiao

Category: Computer Vision

2022-07-27

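This paper aims to generate realistic attack samples for person re-identification (ReID) by reading the enemy's mind (VM). We propose a novel inconspicuous and controllable ReID attack baseline, LCYE, to generate adversarial query images. Concretely, LCYE first distills the VM's knowledge via teacher-student memory mimicking in a proxy task. This knowledge prior then acts as an explicit cipher that conveys what the VM believes to be essential and realistic, enabling accurate adversarial misleading. In addition, benefiting from LCYE's multiple opposing task framework, we further investigate the interpretability and generalization of ReID models from the perspective of adversarial attack, including cross-domain adaptation, cross-model consensus, and the online learning process. Extensive experiments on four ReID benchmarks show that our method outperforms other state-of-the-art attackers by a large margin in white-box, black-box, and targeted attacks. Our code is available at https://gitfront.io/r/user-3704489/mkxusqdt4ffr/lcye/.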


Towards Efficiently Evaluating the Robustness of Deep Neural Networks in IoT Systems: A GAN-based Method

Tao Bai , Jun Zhao , Jinlin Zhu , Shoudong Han , Jiefeng Chen , Bo Li , Alex Kot

Category: Machine Learning

2021-11-19

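Intelligent Internet of Things (IoT) systems based on deep neural networks (DNNs) have been widely deployed in the real world. However, DNNs have been found vulnerable to adversarial examples, which raises concerns about the reliability and security of intelligent IoT systems. Testing and evaluating the robustness of IoT systems has therefore become essential. Various attacks and strategies have been proposed recently, but the efficiency problem remains unsolved: existing methods are either computationally intensive or time-consuming, which makes them impractical. In this paper, we propose a novel framework called Attack-Inspired GAN (AI-GAN) to generate adversarial examples conditionally. Once trained, it can efficiently generate adversarial perturbations given input images and target classes. We apply AI-GAN to different datasets in white-box settings, black-box settings, and against target models protected by state-of-the-art defenses. Through extensive experiments, AI-GAN achieves high attack success rates, outperforms existing methods, and significantly reduces generation time. Moreover, for the first time, AI-GAN successfully scales to complex datasets such as CIFAR-100 and ImageNet, with success rates of about 90% across all classes.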


PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition

Qidong Huang , Xiaoyi Dong , Dongdong Chen , Hang Zhou , Weiming Zhang , Kui Zhang , Gang Hua , Nenghai Yu

Category: Computer Vision

2022-09-16

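Despite the prominent performance achieved in various applications, point cloud recognition models often suffer from natural corruptions and adversarial perturbations. In this paper, we delve into boosting the general robustness of point cloud recognition models and propose Point-Cloud Contrastive Adversarial Training (PointCAT). The main intuition of PointCAT is to encourage the target recognition model to narrow the decision gap between clean and corrupted point clouds. Specifically, we leverage a supervised contrastive loss to promote the alignment and uniformity of the hypersphere features extracted by the recognition model, and we design a pair of centralizing losses with dynamic prototype guidance to keep these features from deviating from the clusters of the categories they belong to. To provide more challenging corrupted point clouds, we adversarially train a noise generator together with the recognition model from scratch, instead of using a gradient-based attack as the inner loop as previous adversarial training methods do. Comprehensive experiments show that PointCAT outperforms the baseline methods and substantially improves the robustness of different point cloud recognition models under a variety of corruptions, including isotropic point noise, LiDAR-simulated noise, random point dropping, and adversarial perturbations.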
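
To make the supervised contrastive term more concrete, the following is a minimal NumPy sketch of a generic supervised contrastive loss on L2-normalized (hypersphere) features. It is an illustration under stated assumptions, not PointCAT's actual implementation: the function name, the temperature value, and the toy features are all hypothetical, and the centralizing losses and the noise generator are omitted.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's code)."""
    labels = np.asarray(labels)
    # Project features onto the unit hypersphere.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor from its own softmax denominator.
    logits = np.where(self_mask, -np.inf, logits)
    row_max = logits.max(axis=1, keepdims=True)
    log_prob = logits - row_max - np.log(
        np.exp(logits - row_max).sum(axis=1, keepdims=True)
    )
    # Positives: other samples that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0  # skip anchors that have no positive
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_counts[valid]).mean())

# Toy features: two tight clusters in 2-D (hypothetical data).
features = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
aligned = supervised_contrastive_loss(features, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(features, [0, 1, 0, 1])
```

When the labels agree with the clusters, the loss is small; when they cut across the clusters, it is large. That is the sense in which the loss rewards features of the same class for lying close together on the hypersphere.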


A Survey on Physical Adversarial Attack in Computer Vision

Donghua Wang , Wen Yao , Tingsong Jiang , Guijiang Tang , Xiaoqian Chen

Category: Computer Vision

2022-09-28

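Over the past decade, deep learning has dramatically changed the traditional hand-crafted feature approach thanks to its strong feature learning capability, yielding tremendous improvements on conventional tasks. However, deep neural networks (DNNs) have recently been shown to be vulnerable to adversarial examples: malicious samples crafted with small, elaborately designed noise that mislead DNNs into wrong decisions while remaining imperceptible to humans. Adversarial attacks can be divided into digital and physical adversarial attacks. Digital adversarial attacks are mostly performed in lab environments and focus on improving the performance of attack algorithms. In contrast, physical adversarial attacks target DNN systems deployed in the physical world, which is a more challenging task because of the complex physical environment (e.g., brightness, occlusion, and so on). Although the discrepancy between digital and physical adversarial examples is small, physical adversarial examples are specifically designed to overcome the effects of the complex physical environment. In this paper, we review the development of physical adversarial attacks in DNN-based computer vision tasks, including image recognition, object detection, and semantic segmentation. For completeness of the algorithm evolution, we also briefly introduce works that do not involve physical adversarial attacks. We first present a categorization scheme to summarize current physical adversarial attacks. We then discuss the advantages and disadvantages of existing physical adversarial attacks, focusing on the techniques used to maintain adversarial effectiveness when applied in physical environments. Finally, we point out the open issues of current physical adversarial attacks and provide promising research directions.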


TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems

Bao Gia Doan , Minhui Xue , Shiqing Ma , Ehsan Abbasnejad , Damith C. Ranasinghe

Category: Computer Vision

2021-11-19

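Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, from Trojans that misguide or hijack the model's decision. We expose the existence of an intriguing class of spatially bounded, physically realizable adversarial examples, Universal NaTuralistic adversarial paTches, which we call TnTs, by exploring the superset of the spatially bounded adversarial example space and the natural input space within generative adversarial networks. An adversary can now arm themselves with a patch that is naturalistic, less malicious-looking, physically realizable, highly effective (achieving high attack success rates), and universal. A TnT is universal because any input image captured with a TnT in the scene will: i) misguide a network (untargeted attack); or ii) force the network to make a malicious decision (targeted attack). Interestingly, an adversarial patch attacker now has the potential to exert a greater level of control: the ability to choose a location-independent, natural-looking patch as a trigger, in contrast to being constrained to noisy perturbations. So far this ability has been shown to be possible only with Trojan attack methods, which need to interfere with the model building process to embed a backdoor at the risk of discovery; here the attacker can still realize a patch deployable in the physical world. Through extensive experiments on the large-scale visual classification task ImageNet, with evaluations across its entire validation set of 50,000 images, we demonstrate the realistic threat from TnTs and the robustness of the attack. We show a generalization of the attack that creates patches achieving higher attack success rates than existing state-of-the-art methods. Our results show that the attack generalizes to different visual classification tasks (CIFAR-10, GTSRB, PubFig) and to multiple state-of-the-art deep neural networks such as WideResNet50, Inception-V3, and VGG-16.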


Robust Person Re-identification with Multi-Modal Joint Defence

Yunpeng Gong , Lifei Chen

Category: Computer Vision

2021-11-18

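Person re-identification (ReID) systems based on metric learning have been shown to inherit the vulnerability of deep neural networks (DNNs), which are easily fooled by adversarial metric attacks. Existing work mainly relies on adversarial training for metric defense, and other approaches have not been fully studied. By exploring the impact of attacks on the underlying features, we propose targeted methods for both metric attack and metric defense. For metric attack, we use local color deviation to construct intra-class variation of the input in order to attack color features. For metric defense, we propose a joint defense method consisting of two parts: proactive defense and passive defense. Proactive defense enhances the model's robustness to color variations and its learning of structural relations across multiple modalities by constructing different inputs from multimodal images. Passive defense exploits the invariance of structural features in a changing pixel space through circuitous scaling, preserving structural features while eliminating some of the adversarial noise. Extensive experiments demonstrate that, compared with existing adversarial metric defense methods, the proposed joint defense not only resists multiple attacks at the same time but also does not significantly reduce the generalization capacity of the model. The code is available at https://github.com/finger-monkey/multi-modal_joint_defence.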


Adversarial Attack across Datasets

Yunxiao Qin , Yuanhao Xiong , Jinfeng Yi , Lihong Cao , Cho-Jui Hsieh

Category: Computer Vision

2021-10-13

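Existing transfer attack methods commonly assume that the attacker knows the training set (e.g., the label set and the input size) of the black-box victim model, which is usually unrealistic because in some cases the attacker cannot know this information. In this paper, we define a Generalized Transferable Attack (GTA) problem in which the attacker does not know this information and is required to attack any randomly encountered image, which may come from an unknown dataset. To solve the GTA problem, we propose a novel Image Classification Eraser (ICE) that trains a particular attacker to erase the classification information of any image from an arbitrary dataset. Experiments on several datasets demonstrate that ICE greatly outperforms existing transfer attacks on GTA, and show that ICE uses similar texture-like noise to perturb different images from different datasets. Moreover, fast Fourier transform analysis indicates that the main components of each ICE noise are three sine waves, one for each of the R, G, and B image channels. Inspired by this finding, we design a novel Sine Attack (SA) method that optimizes the three sine waves. Experiments show that SA performs comparably to ICE, indicating that the three sine waves are effective and sufficient to break DNNs under the GTA setting.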
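
As a rough illustration of what a perturbation made of "three sine waves", one per colour channel, could look like, here is a minimal NumPy sketch. It only constructs and applies such a perturbation; the optimization of the wave parameters that the abstract attributes to SA is not shown. The function names, the parameterization by frequency and phase, and the budget `eps` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def sine_perturbation(height, width, freqs, phases, eps=8 / 255):
    """Build an (H, W, 3) perturbation with one 2-D sine wave per RGB channel.

    freqs holds one (fy, fx) pair per channel, in cycles across the image.
    All parameter names here are hypothetical.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    channels = []
    for (fy, fx), phase in zip(freqs, phases):
        wave = np.sin(2 * np.pi * (fy * ys / height + fx * xs / width) + phase)
        channels.append(eps * wave)  # amplitude bounded by eps
    return np.stack(channels, axis=-1)

def apply_perturbation(image, delta):
    """Add the perturbation and keep pixel values in [0, 1]."""
    return np.clip(image + delta, 0.0, 1.0)

# Toy usage on a flat grey image (hypothetical values).
image = np.full((32, 32, 3), 0.5)
delta = sine_perturbation(
    32, 32, freqs=[(3, 0), (0, 4), (2, 2)], phases=[0.0, 0.5, 1.0]
)
adv = apply_perturbation(image, delta)
```

Because each channel is described by only a few scalars, such a perturbation does not depend on the victim's label set or input size, which fits the GTA setting described above.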


Generalizable Black-Box Adversarial Attack with Meta Learning

Fei Yin , Yong Zhang , Baoyuan Wu , Yan Feng , Jingyi Zhang , Yanbo Fan , Yujiu Yang

Category: Machine Learning | Computer Vision

2023-01-01

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Due to the limited feedback information, existing query-based black-box attack methods often require many queries for attacking each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework by training a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-train procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance, which is verified by extensive experiments.


A Survey of Face Recognition

Xinyi Wang , Jianteng Peng , Sufang Zhang , Bihui Chen , Yi Wang , Yandong Guo

Category: Computer Vision

2022-12-26

Recent years have witnessed the breakthrough of face recognition (FR) with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them have been applied in industry and play an important role in daily life, for example in device unlocking and mobile payment. This paper provides an introduction to face recognition, including its history, pipeline, algorithms based on conventional manually designed features or deep learning, mainstream training and evaluation datasets, and related applications. We have analyzed and compared as many state-of-the-art works as possible, and have also carefully designed a set of experiments to study the effect of backbone size and data distribution. This survey serves as material for the tutorial named The Practical Face Recognition Technology in the Industrial World at FG2023.


StyleFool: Fooling Video Classification Systems via Style Transfer

Yuxin Cao , Xi Xiao , Ruoxi Sun , Derui Wang , Minhui Xue , Sheng Wen

Category: Computer Vision

2022-03-30

Video classification systems are vulnerable to adversarial attacks, which can create severe security problems in video verification. Current black-box attacks need a large number of queries to succeed, resulting in high computational overhead in the process of attack. On the other hand, attacks with restricted perturbations are ineffective against defenses such as denoising or adversarial training. In this paper, we focus on unrestricted perturbations and propose StyleFool, a black-box video adversarial attack via style transfer to fool the video classification system. StyleFool first utilizes color theme proximity to select the best style image, which helps avoid unnatural details in the stylized videos. Meanwhile, the target class confidence is additionally considered in targeted attacks to influence the output distribution of the classifier by moving the stylized video closer to or even across the decision boundary. A gradient-free method is then employed to further optimize the adversarial perturbations. We carry out extensive experiments to evaluate StyleFool on two standard datasets, UCF-101 and HMDB-51. The experimental results demonstrate that StyleFool outperforms the state-of-the-art adversarial attacks in terms of both the number of queries and the robustness against existing defenses. Moreover, 50% of the stylized videos in untargeted attacks do not need any query since they can already fool the video classification model. Furthermore, we evaluate the indistinguishability through a user study to show that the adversarial samples of StyleFool look imperceptible to human eyes, despite unrestricted perturbations.


Adversarial examples: Attacks and defenses for deep learning

Category:

With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have recently been found vulnerable to well-designed input samples, called adversarial examples. Adversarial perturbations are imperceptible to humans but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples has become one of the major risks of applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples. In addition, three major challenges in adversarial examples and the potential solutions are discussed.


Data-Free Knowledge Transfer: A Survey

Yuang Liu , Wei Zhang , Jun Wang , Jianyong Wang

Category: Machine Learning | Computer Vision

2021-12-31

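In the last decade, many deep learning models have been well trained and have achieved great success in various fields of machine intelligence, especially computer vision and natural language processing. To better leverage the potential of these well-trained models in intra-domain or cross-domain transfer learning situations, knowledge distillation (KD) and domain adaptation (DA) were proposed and have become research highlights. Both aim to transfer useful information from a well-trained model using the original training data. However, the original data is not always available, owing to privacy, copyright, or confidentiality. Recently, the data-free knowledge transfer paradigm has attracted considerable attention, as it distills valuable knowledge from well-trained models without requiring access to the training data. In particular, it mainly consists of data-free knowledge distillation (DFKD) and source data-free domain adaptation (SFDA). On the one hand, DFKD aims to transfer the intra-domain knowledge of the original data from a cumbersome teacher network to a compact student network for model compression and efficient inference. On the other hand, the goal of SFDA is to reuse the cross-domain knowledge stored in a well-trained source model and adapt it to a target domain. In this paper, we provide a comprehensive survey of data-free knowledge transfer from the perspectives of knowledge distillation and unsupervised domain adaptation, to help readers better understand the current research status and ideas. Applications and challenges of the two areas are briefly reviewed. Furthermore, we provide some insights into topics for future research.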


Deep Learning for Person Re-identification: A Survey and Outlook

Mang Ye , Jianbing Shen , Gaojie Lin , Tao Xiang , Ling Shao , Steven C. H. Hoi

Category:

2020-01-13

Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and increasing demand of intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis for closed-world person Re-ID from three different perspectives, including deep feature representation learning, deep metric learning and ranking optimization. With the performance saturation under closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, facing more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize the open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for FOUR different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost for finding all the correct matches, which provides an additional criteria to evaluate the Re-ID system for real applications. Finally, some important yet under-investigated open issues are discussed.
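
The mINP metric mentioned above can be illustrated with a short sketch. It assumes the commonly cited definition in which, for each query, the inverse negative penalty is the number of correct matches divided by the rank position of the hardest (last-retrieved) correct match, and mINP is the mean over queries. That definition is not spelled out in the abstract, so treat it as an assumption here; the function names and toy rankings are hypothetical.

```python
import numpy as np

def inverse_negative_penalty(ranked_matches):
    """INP for one query.

    ranked_matches is a boolean sequence over the ranked gallery,
    where True marks a correct match.
    """
    ranked_matches = np.asarray(ranked_matches, dtype=bool)
    num_correct = int(ranked_matches.sum())
    if num_correct == 0:
        raise ValueError("query has no correct match in the gallery")
    # 1-based rank of the hardest (last-retrieved) correct match.
    hardest_rank = int(np.flatnonzero(ranked_matches)[-1]) + 1
    return num_correct / hardest_rank

def mean_inp(all_ranked_matches):
    """Average INP over all queries."""
    return float(np.mean([inverse_negative_penalty(r) for r in all_ranked_matches]))

# Two toy queries, each with two correct matches in a gallery of four.
queries = [
    [True, True, False, False],  # both matches ranked first: INP = 2/2
    [True, False, False, True],  # hardest match at rank 4:   INP = 2/4
]
score = mean_inp(queries)
```

A query whose correct matches all sit at the top of the ranking scores 1.0, while a single hard match buried deep in the list pulls the score down. This is how the metric reflects the cost of finding all correct matches.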


Inconspicuous Adversarial Patches for Fooling Image Recognition Systems on Mobile Devices

Tao Bai , Jinqi Luo , Jun Zhao

Category: Computer Vision | Artificial Intelligence

2021-06-29

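Deep learning based image recognition systems have been widely deployed on mobile devices. Recent studies, however, show that deep learning models are vulnerable to adversarial examples. One variant of adversarial examples, the adversarial patch, has drawn researchers' attention because of its strong attack ability. Although adversarial patches achieve high attack success rates, they are easily detected because of the visual inconsistency between the patches and the original images. Moreover, adversarial patch generation in the literature usually requires a large amount of data, which is computationally expensive and time-consuming. To tackle these challenges, we propose an approach that generates inconspicuous adversarial patches from a single image. In our approach, we first decide the patch locations based on the perceptual sensitivity of the victim models, and then produce adversarial patches in a coarse-to-fine way using multi-scale generators and discriminators. Through adversarial training, the patches are encouraged to be consistent with the background images while preserving strong attack ability. Extensive experiments on various models with different architectures and training methods show that our approach has strong attack ability in white-box settings and excellent transferability in black-box settings. Compared with other adversarial patches, ours carry the lowest risk of being detected and can evade human observation, as supported by saliency map illustrations and user evaluation results. Lastly, we show that our adversarial patches can be applied in the physical world.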


QueryNet: Attack by Multi-Identity Surrogates

Sizhe Chen , Zhehao Huang , Qinghua Tao , Xiaolin Huang

Category: Machine Learning

2021-05-31

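Deep neural networks (DNNs) are acknowledged to be vulnerable to adversarial attacks, while existing black-box attacks require extensive queries on the victim DNN to achieve high success rates. For query efficiency, surrogate models of the victim are used to generate transferable adversarial examples (AEs) because of their gradient similarity (GS), i.e., the surrogates' attack gradients are similar to the victim's. However, their similarity on outputs, namely the prediction similarity (PS), is generally neglected, even though it could be used to filter out inefficient queries with the surrogates, without querying the victim. To jointly utilize and optimize the surrogates' GS and PS, we develop QueryNet, a unified attack framework that can significantly reduce queries. QueryNet attacks through multi-identity surrogates: it crafts several AEs for one sample using different surrogates, and also uses the surrogates to decide on the most promising AE for the query. Afterwards, the victim's query feedback is accumulated to optimize not only the surrogates' parameters but also their architectures, enhancing both GS and PS. Although QueryNet has no access to pre-trained surrogates as priors, our comprehensive experiments show that it reduces queries compared with alternatives within an acceptable time, on ImageNet among other settings, allowing only 8-bit image queries and no access to the victim's training data. The code is available at https://github.com/allenchen1998/querynet.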


Source-Free Unsupervised Domain Adaptation: A Survey

Yuqi Fang , Pew-Thian Yap , Weili Lin , Hongtu Zhu , Mingxia Liu

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-31

Unsupervised domain adaptation (UDA) via deep learning has attracted appealing attention for tackling domain-shift problems caused by distribution discrepancy across different domains. Existing UDA approaches highly depend on the accessibility of source domain data, which is usually limited in practical scenarios due to privacy protection, data storage and transmission cost, and computation burden. To tackle this issue, many source-free unsupervised domain adaptation (SFUDA) methods have been proposed recently, which perform knowledge transfer from a pre-trained source model to unlabeled target domain with source data inaccessible. A comprehensive review of these works on SFUDA is of great significance. In this paper, we provide a timely and systematic literature review of existing SFUDA approaches from a technical perspective. Specifically, we categorize current SFUDA studies into two groups, i.e., white-box SFUDA and black-box SFUDA, and further divide them into finer subcategories based on different learning strategies they use. We also investigate the challenges of methods in each subcategory, discuss the advantages/disadvantages of white-box and black-box SFUDA methods, conclude the commonly used benchmark datasets, and summarize the popular techniques for improved generalizability of models learned without using source data. We finally discuss several promising future directions in this field.


Imperceptible Transfer Attack and Defense on 3D Point Cloud Classification

Daizong Liu , Wei Hu

Category: Computer Vision

2021-11-22

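Although many efforts have been devoted to attack and defense in the 2D image domain in recent years, few methods explore the vulnerability of 3D models. Existing 3D attackers generally perform point-wise perturbation over point clouds, resulting in deformed structures or outliers that are easily perceived by humans. Moreover, their adversarial examples are generated under the white-box setting and frequently suffer from low success rates when transferred to attack remote black-box models. In this paper, we study 3D point cloud attacks from two new and challenging perspectives by proposing a novel Imperceptible Transfer Attack (ITA): 1) Imperceptibility: we constrain the perturbation direction of each point to lie along the normal vector of its neighborhood surface, producing examples with similar geometric properties and thus enhancing imperceptibility. 2) Transferability: we develop an adversarial transformation model that generates the most harmful distortions and enforce the adversarial examples to resist it, improving their transferability to unknown black-box models. Furthermore, we propose to train more robust black-box 3D models to defend against such ITA attacks by learning more discriminative point cloud representations. Extensive evaluations demonstrate that our ITA attack is more imperceptible and transferable than state-of-the-art methods and validate the superiority of our defense strategy.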
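
The imperceptibility constraint, restricting each point's perturbation to its surface normal direction, can be sketched as a simple projection. This is a minimal illustration rather than the ITA implementation: it assumes per-point normals are already available (in practice they would have to be estimated from each point's neighborhood), and the function name and the toy sphere data are hypothetical.

```python
import numpy as np

def project_onto_normals(perturbation, normals):
    """Keep only the component of each point's perturbation along its normal.

    perturbation and normals are both (N, 3) arrays; normals need not be unit length.
    """
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    # Signed magnitude of each perturbation along its normal.
    magnitudes = np.sum(perturbation * unit, axis=1, keepdims=True)
    return magnitudes * unit

# Toy point cloud on a unit sphere, where the outward normal equals the point itself.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)
normals = points.copy()

raw = 0.05 * rng.normal(size=points.shape)  # unconstrained perturbation
constrained = project_onto_normals(raw, normals)
adv_points = points + constrained
```

After projection, points only move along their normal and never slide tangentially across the surface, which is one plausible reading of why the constraint helps preserve local geometric properties.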


Harnessing Perceptual Adversarial Patches for Crowd Counting

Shunchang Liu , Jiakai Wang , Aishan Liu , Yingwei Li , Yijie Gao , Xianglong Liu , Dacheng Tao

Category: Computer Vision

2021-09-16

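Crowd counting, which has been widely adopted for estimating the number of people in safety-critical scenes, has been shown to be vulnerable to adversarial examples in the physical world (e.g., adversarial patches). Though harmful, adversarial examples are also valuable for evaluating and better understanding model robustness. However, existing adversarial example generation methods for crowd counting lack strong transferability across different black-box models, which limits their practicality for real-world systems. Motivated by the fact that attack transferability is positively correlated with model-invariant characteristics, this paper proposes the Perceptual Adversarial Patch (PAP) generation framework, which tailors adversarial perturbations to crowd counting scenes using model-shared perceptual features. Specifically, we handcraft an adaptive crowd density weighting approach to capture the scale perception features that are invariant across models, and we use density-guided attention to capture the model-shared position perception. Both are shown to improve the attack transferability of our adversarial patches. Extensive experiments show that our PAP achieves state-of-the-art attack performance in both the digital and physical worlds, outperforming previous proposals by large margins (at most +685.7 MAE and +699.5 MSE). In addition, we empirically demonstrate that adversarial training with our PAP can benefit the performance of vanilla models in alleviating several practical challenges in crowd counting scenarios, including generalization across datasets (up to -376.0 MAE and -354.9 MSE) and robustness to complex backgrounds (up to -10.3 MAE and -16.4 MSE).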


Interpretations Cannot Be Trusted: Stealthy and Effective Adversarial Perturbations against Interpretable Deep Learning

Eldor Abdukhamidov , Mohammed Abuhamad , Simon S. Woo , Eric Chan-Tin , Tamer Abuhmed

Category: Computer Vision | Machine Learning

2022-11-29

Deep learning methods have gained increased attention in various applications due to their outstanding performance. For exploring how this high performance relates to the proper use of data artifacts and the accurate problem formulation of a given task, interpretation models have become a crucial component in developing deep learning-based systems. Interpretation models enable the understanding of the inner workings of deep learning models and offer a sense of security in detecting the misuse of artifacts in the input data. Similar to prediction models, interpretation models are also susceptible to adversarial inputs. This work introduces two attacks, AdvEdge and AdvEdge$^{+}$, that deceive both the target deep learning model and the coupled interpretation model. We assess the effectiveness of proposed attacks against two deep learning model architectures coupled with four interpretation models that represent different categories of interpretation models. Our experiments include the attack implementation using various attack frameworks. We also explore the potential countermeasures against such attacks. Our analysis shows the effectiveness of our attacks in terms of deceiving the deep learning models and their interpreters, and highlights insights to improve and circumvent the attacks.


Publishing Efficient On-device Models Increases Adversarial Vulnerability

Sanghyun Hong , Nicholas Carlini , Alexey Kurakin

Category: Machine Learning

2022-12-28

Recent increases in the computational demands of deep neural networks (DNNs) have sparked interest in efficient deep learning mechanisms, e.g., quantization or pruning. These mechanisms enable the construction of a small, efficient version of commercial-scale models with comparable accuracy, accelerating their deployment to resource-constrained devices. In this paper, we study the security considerations of publishing on-device variants of large-scale models. We first show that an adversary can exploit on-device models to make attacking the large models easier. In evaluations across 19 DNNs, by exploiting the published on-device models as a transfer prior, the adversarial vulnerability of the original commercial-scale models increases by up to 100x. We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases. Based on these insights, we propose a defense, similarity-unpairing, that fine-tunes on-device models with the objective of reducing the similarity. We evaluated our defense on all 19 DNNs and found that it reduces the transferability by up to 90% and increases the number of queries required by a factor of 10-100x. Our results suggest that further research is needed on the security (or even privacy) threats caused by publishing those efficient siblings.
