Smart Paper Notes

SPRITZ-1.5C: Employing Deep Ensemble Learning for Improving the Security of Computer Networks against Adversarial Attacks

Ehsan Nowroozi , Mohammadreza Mohammadi , Erkay Savas , Mauro Conti , Yassine Mekdad

Categories: Artificial Intelligence | Machine Learning

2022-09-25

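In the past few years, convolutional neural networks (CNNs) have shown promising performance in various real-world cybersecurity applications, such as network and multimedia security. However, the underlying fragility of CNN structures poses major security problems, making them unsuitable for security-oriented applications, including such computer networks. Protecting these architectures from adversarial attacks requires secure architectures that are challenging to attack. In this study, we present a novel architecture based on an ensemble classifier that combines the enhanced security of 1-class classification (known as 1C) with the high performance of conventional 2-class classification (known as 2C) in the absence of attacks. Our architecture is referred to as the 1.5-class (SPRITZ-1.5C) classifier and is constructed using a final dense classifier, one 2C classifier (i.e., a CNN), and two parallel 1C classifiers (i.e., autoencoders). In our experiments, we evaluated the robustness of the proposed architecture by considering eight possible adversarial attacks in various scenarios. We performed these attacks on the 2C and SPRITZ-1.5C architectures separately. The experimental results show that the attack success rate (ASR) of the I-FGSM attack against a 2C classifier trained on the N-BaIoT dataset is 0.9900. In contrast, the ASR is 0.0000 for the SPRITZ-1.5C classifier.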
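
The abstract reports its results as an attack success rate (ASR) but does not define it. A minimal sketch, assuming the common definition (the fraction of adversarial inputs that the classifier mislabels); `attack_success_rate`, `toy_predict`, and the sample data are hypothetical and are not taken from the paper:

```python
from typing import Callable, Sequence


def attack_success_rate(predict: Callable[[Sequence[float]], int],
                        adversarial_inputs: Sequence[Sequence[float]],
                        true_labels: Sequence[int]) -> float:
    """Fraction of adversarial inputs that the classifier gets wrong."""
    if len(adversarial_inputs) != len(true_labels):
        raise ValueError("inputs and labels must have the same length")
    if not adversarial_inputs:
        raise ValueError("need at least one adversarial input")
    fooled = sum(1 for x, y in zip(adversarial_inputs, true_labels)
                 if predict(x) != y)
    return fooled / len(adversarial_inputs)


# Hypothetical toy classifier: label 1 (malicious) when the feature sum is positive.
def toy_predict(x: Sequence[float]) -> int:
    return 1 if sum(x) > 0 else 0


# Four adversarial samples that are all truly malicious (label 1).
adv = [[-0.2, 0.1], [0.3, 0.4], [-1.0, 0.5], [-0.1, -0.1]]
labels = [1, 1, 1, 1]
asr = attack_success_rate(toy_predict, adv, labels)  # 3 of 4 are mislabeled
```

Under this definition, an ASR of 0.9900 means that 99% of the crafted inputs fooled the classifier, and an ASR of 0.0000 means that none did.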

translated by Google Translate

Resisting Deep Learning Models Against Adversarial Attack Transferability via Feature Randomization

Ehsan Nowroozi , Mohammadreza Mohammadi , Pargol Golmohammadi , Yassine Mekdad , Mauro Conti , Selcuk Uluagac

Categories: Machine Learning

2022-09-11

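In the past decades, the rise of artificial intelligence has given us the capability to solve the most challenging problems in our day-to-day lives, such as cancer prediction and autonomous navigation. However, these applications may not be reliable if they are not secured against adversarial attacks. In addition, recent work has shown that some adversarial examples are transferable across different models. It is therefore crucial to prevent such transferability through robust models that resist adversarial manipulation. In this paper, we propose a feature randomization-based approach that resists eight adversarial attacks targeting deep learning models in the testing phase. Our novel approach consists of changing the training strategy in the target network classifier and selecting random feature samples. We consider attackers under limited-knowledge and semi-knowledge conditions who carry out the most prevalent types of adversarial attacks. We evaluate the robustness of our approach using the well-known UNSW-NB15 dataset, which includes realistic and synthetic attacks. We then demonstrate that our strategy outperforms existing state-of-the-art approaches, such as the Most Powerful Attack, which consists of fine-tuning the network model against specific adversarial attacks. Finally, our experimental results show that our methodology can secure the target network and resist adversarial attack transferability by over 60%.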

Adversarial Machine Learning In Network Intrusion Detection Domain: A Systematic Review

Huda Ali Alatwi , Charles Morisset

Categories: Machine Learning | Neural and Evolutionary Computing

2021-12-06

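Due to their massive success in various domains, deep learning techniques are increasingly used to design network intrusion detection solutions that detect and mitigate unknown and known attacks with high detection accuracy and minimal feature engineering. However, deep learning models have been found to be vulnerable to data instances, so-called adversarial examples, that can mislead the model into making incorrect classification decisions. Such vulnerabilities allow attackers to evade detection and disrupt the system's critical functions by adding small, crafty perturbations to malicious traffic. The problem of deep adversarial learning has been studied extensively in the computer vision domain; however, it is still an open research area in network security applications. This survey therefore explores research that applies different aspects of adversarial machine learning to network intrusion detection, in order to provide directions for potential solutions. First, the surveyed studies are categorized by their contribution to generating adversarial examples, evaluating the robustness of ML-based NIDSs against adversarial examples, and defending such models against these attacks. Second, we highlight the characteristics identified in the surveyed research. Furthermore, we discuss the applicability of existing generic adversarial attacks to the NIDS domain, the feasibility of launching the proposed attacks in real-world scenarios, and the limitations of existing mitigation solutions.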

Adversarial examples: Attacks and defenses for deep learning

Categories:

With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have been recently found vulnerable to well-designed input samples, called adversarial examples. Adversarial perturbations are imperceptible to humans but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples becomes one of the major risks for applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples. In addition, three major challenges in adversarial examples and the potential solutions are discussed.

ML Attack Models: Adversarial Attacks and Data Poisoning Attacks

Jing Lin , Long Dang , Mohamed Rahouti , Kaiqi Xiong

Categories: Machine Learning

2021-12-06

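Many state-of-the-art ML models have outperformed humans in various tasks, such as image classification. With such outstanding performance, ML models are widely used today. However, the existence of adversarial attacks and data poisoning attacks calls the robustness of ML models into question. For instance, Engstrom et al. demonstrated that state-of-the-art image classifiers can easily be fooled by a small rotation of an arbitrary image. As ML systems are increasingly integrated into safety- and security-sensitive applications, adversarial attacks and data poisoning attacks pose a considerable threat. This chapter focuses on two broad and important areas of ML security: adversarial attacks and data poisoning attacks.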

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar , Ajmal Mian

Categories:

2018-01-02

The authors thank Nicholas Carlini (UC Berkeley) and Dimitris Tsipras (MIT) for feedback to improve the survey quality. We also acknowledge X. Huang (Uni. Liverpool), K. R. Reddy (IISC), E. Valle (UNICAMP), Y. Yoo (CLAIR) and others for providing pointers to make the survey more comprehensive.

Adversarial Example Detection for DNN Models: A Review and Experimental Comparison

Ahmed Aldahdooh , Wassim Hamidouche , Sid Ahmed Fezza , Olivier Deforges

Categories: Computer Vision

2021-05-01

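Deep learning (DL) has shown great success in many human-related tasks, which has led to its adoption in many computer vision-based applications, such as security surveillance systems, autonomous vehicles, and healthcare. Such safety-critical applications can only be deployed successfully once they are able to overcome safety-critical challenges. Among these challenges are the defense against and/or the detection of adversarial examples (AEs). An adversary can carefully craft small, often imperceptible noise, called a perturbation, and add it to a clean image to generate an AE. The aim of an AE is to fool the DL model, which makes AEs a potential risk for DL applications. Many test-time evasion attacks and countermeasures, i.e., defense or detection methods, have been proposed in the literature. Moreover, the few reviews and surveys published so far present the taxonomy of threats and countermeasures theoretically, with little focus on AE detection methods. In this paper, we focus on the image classification task and attempt to provide a survey of detection methods for test-time evasion attacks on neural network classifiers. A detailed discussion of such methods is provided, together with experimental results for eight state-of-the-art detectors under different scenarios on four datasets. We also present potential challenges and future perspectives for this research direction.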

Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identification

Pedro Miguel Sánchez Sánchez , Alberto Huertas Celdrán , Gérôme Bovet , Gregorio Martínez Pérez

Categories: Artificial Intelligence

2022-12-30

In the last years, the number of IoT devices deployed has suffered an undoubted explosion, reaching the scale of billions. However, some new cybersecurity issues have appeared together with this development. Some of these issues are the deployment of unauthorized devices, malicious code modification, malware deployment, or vulnerability exploitation. This fact has motivated the requirement for new device identification mechanisms based on behavior monitoring. Besides, these solutions have recently leveraged Machine and Deep Learning techniques due to the advances in this field and the increase in processing capabilities. In contrast, attackers do not stay stalled and have developed adversarial attacks focused on context modification and ML/DL evaluation evasion applied to IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. In this sense, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Then, previous techniques have been compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves previous solutions achieving a +0.96 average F1-Score and 0.8 minimum TPR for all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the previous model to test its robustness. A temperature-based context attack was not able to disrupt the identification. However, some ML/DL state-of-the-art evasion attacks were successful. Finally, adversarial training and model distillation defense techniques are selected to improve the model resilience to evasion attacks, without degrading its performance.

Design of secure and robust cognitive system for malware detection

Sanket Shukla

Categories: Machine Learning

2022-08-03

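Machine learning-based malware detection techniques rely on grayscale images of malware and tend to classify malware based on the distribution of textures in those images. Despite the advances and promising results shown by machine learning techniques, attackers can exploit their vulnerabilities by generating adversarial samples. Adversarial samples are generated by intelligently crafting perturbations and adding them to the input samples. Most existing adversarial attacks and defenses are software-based. To defend against adversaries, existing malware detection based on machine learning and grayscale images needs to preprocess the adversarial data. This can cause additional overhead and prolong real-time malware detection. As an alternative, we therefore explore RRAM (resistive random access memory) based defenses against adversaries. The aim of this thesis is to address the critical system security issues mentioned above. These challenges are addressed by demonstrating the proposed techniques for designing a secure and robust cognitive system. First, a novel technique for detecting stealthy malware is proposed. The technique uses malware binary images, extracts different features from them, and then applies different ML classifiers to the resulting dataset. The results show that this technique successfully differentiates classes of malware based on the extracted features. Second, I demonstrate the effects of adversarial attacks on a reconfigurable RRAM-neuromorphic architecture with different learning algorithms and device characteristics. I also propose an integrated solution for mitigating the effects of adversarial attacks using the reconfigurable RRAM architecture.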

Statistical Detection of Adversarial examples in Blockchain-based Federated Forest In-vehicle Network Intrusion Detection Systems

Ibrahim Aliyu , Selinde van Engelenburg , Muhammed Bashir Muazu , Jinsul Kim , Chang Gyoon Lim

Categories: Artificial Intelligence

2022-07-11

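The Internet of Vehicles (IoV) can facilitate seamless connectivity between connected vehicles (CVs), autonomous vehicles (AVs), and other IoV entities. Intrusion detection systems (IDSs) for IoV networks can rely on machine learning (ML) to protect the in-vehicle network from cyber-attacks. Blockchain-based Federated Forests (BFFs) can be used to train ML models on data from IoV entities while protecting the confidentiality of the data and reducing the risk of data tampering. However, ML models created this way are still vulnerable to evasion, poisoning, and exploratory attacks. This paper investigates the impact of various possible adversarial examples on the BFF-IDS. We propose integrating a statistical detector to detect and extract unknown adversarial samples. By including the detected unknown samples in the detector's dataset, we augment the BFF-IDS with an additional model that detects both the original known attacks and the new adversarial inputs. The statistical adversarial detector confidently detected adversarial examples at sample sizes of 50 and 100 input samples. Furthermore, the augmented BFF-IDS (BFF-IDS(AUG)) successfully mitigates the adversarial examples with more than 96% accuracy. With this approach, the model continues to be augmented in a sandbox whenever an adversarial sample is detected, and the BFF-IDS(AUG) is subsequently adopted as the active security model. Consequently, the proposed integration of the statistical adversarial detector, followed by augmentation of the BFF-IDS with the detected adversarial samples, provides a sustainable security framework against adversarial examples and other unknown attacks.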

Towards an Awareness of Time Series Anomaly Detection Models' Adversarial Vulnerability

Shahroz Tariq , Binh M. Le , Simon S. Woo

Categories: Machine Learning

2022-08-24

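Time series anomaly detection is studied extensively in statistics, economics, and computer science. Over the years, numerous deep learning-based methods have been proposed for time series anomaly detection. Many of these methods show state-of-the-art performance on benchmark datasets, giving the false impression that these systems are robust and deployable in many practical and industrial real-world scenarios. In this paper, we demonstrate that the performance of state-of-the-art anomaly detection methods degrades substantially when only small adversarial perturbations are added to the sensor data. We use different scoring metrics, such as prediction errors, anomaly scores, and classification scores, over several public and private datasets ranging from aerospace applications and server machines to cyber-physical systems in power plants. Under well-known adversarial attacks from the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) methods, we demonstrate that state-of-the-art deep neural network (DNN) and graph neural network (GNN) methods, which claim to be robust against anomalies and may already be integrated into real-life systems, see their performance drop to as low as 0%. To the best of our understanding, we demonstrate for the first time the vulnerability of anomaly detection systems to adversarial attacks. The overall goal of this research is to raise awareness of the adversarial vulnerability of time series anomaly detectors.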
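
To make the FGSM attack named in the abstract concrete, here is a minimal sketch under stated assumptions: the victim is a plain logistic-regression model rather than the DNN or GNN detectors studied in the paper, and the weights `w`, bias `b`, sample `x`, and budget `eps` are hypothetical. FGSM moves every input feature by `eps` in the direction of the sign of the loss gradient.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """One FGSM step: shift each feature by eps in the direction that raises the loss.

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict_proba(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model weights and one reading whose true label is 1.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.5, 0.2, 0.1], 1
x_adv = fgsm(x, y, w, b, eps=0.3)

clean_label = int(predict_proba(x, w, b) > 0.5)    # classified as 1
adv_label = int(predict_proba(x_adv, w, b) > 0.5)  # flipped to 0
```

PGD is commonly described as the iterated variant: it takes several smaller steps of this kind and, after each one, projects the input back into the allowed `eps`-ball around the original.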

FENCE: Feasible Evasion Attacks on Neural Networks in Constrained Environments

Alesia Chernikova , Alina Oprea

Categories: Machine Learning

2019-09-23

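As advances in deep neural networks (DNNs) demonstrate unprecedented levels of performance in many critical applications, their vulnerability to attacks remains an open question. We consider test-time evasion attacks against deep learning in constrained environments, in which dependencies between features must be satisfied. These situations may arise naturally in tabular data or may result from feature engineering in specific application domains, such as threat detection in cybersecurity. We propose a general iterative gradient-based framework, called FENCE, for crafting evasion attacks that take into account the specifics of constrained domains and application requirements. We apply it against feed-forward neural networks trained for two cybersecurity applications, network traffic botnet classification and malicious domain classification, to generate feasible adversarial examples. We extensively evaluate the success rate and performance of our attacks, compare their improvement over several baselines, and analyze the factors that affect the attack success rate, including the optimization objective and data imbalance. We show that with minimal effort (e.g., generating 12 additional network connections), an attacker can change the model's prediction from the malicious class to benign and evade the classifier. We show that models trained on datasets with higher imbalance are more vulnerable to our FENCE attacks. Finally, we demonstrate the potential of adversarial training in constrained domains to increase model resilience against these evasion attacks.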

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

Weilin Xu , David Evans , Yanjun Qi

Categories:

2017-04-04

Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model's prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.
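
The detection idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the two squeezers follow the abstract (bit-depth reduction and spatial smoothing, here a median filter), while the L1 distance between prediction vectors, the `THRESHOLD` value, and the `toy_model` classifier are hypothetical choices made so that the example is self-contained.

```python
import math
import statistics


def reduce_bit_depth(image, bits):
    """Quantize pixel values in [0, 1] to 2**bits levels."""
    levels = 2 ** bits - 1
    return [[math.floor(v * levels + 0.5) / levels for v in row] for row in image]


def median_smooth(image, radius=1):
    """Replace each pixel by the median of its neighborhood (clipped at the borders)."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [image[i][j]
                      for i in range(max(0, r - radius), min(h, r + radius + 1))
                      for j in range(max(0, c - radius), min(w, c + radius + 1))]
            row.append(statistics.median(window))
        out.append(row)
    return out


def squeeze_score(image, predict_proba, squeezers):
    """Largest L1 gap between the prediction on the input and on its squeezed versions."""
    base = predict_proba(image)
    return max(sum(abs(p - q) for p, q in zip(base, predict_proba(sq(image))))
               for sq in squeezers)


# Hypothetical stand-in for a DNN: its output depends only on the brightest pixel,
# so a single perturbed pixel is enough to change the prediction.
def toy_model(image):
    peak = max(max(row) for row in image)
    return [1.0 - peak, peak]


SQUEEZERS = [lambda img: reduce_bit_depth(img, 4), median_smooth]
THRESHOLD = 1.0  # hypothetical; in practice chosen on held-out data

clean = [[0.2] * 3 for _ in range(3)]
adversarial = [row[:] for row in clean]
adversarial[1][1] = 0.9  # one-pixel perturbation

clean_score = squeeze_score(clean, toy_model, SQUEEZERS)
adv_score = squeeze_score(adversarial, toy_model, SQUEEZERS)
flagged = adv_score > THRESHOLD
```

Squeezing barely changes the prediction on the clean input, while the median filter removes the one-pixel perturbation and shifts the prediction on the perturbed input, so only the latter exceeds the threshold.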

Towards Efficiently Evaluating the Robustness of Deep Neural Networks in IoT Systems: A GAN-based Method

Tao Bai , Jun Zhao , Jinlin Zhu , Shoudong Han , Jiefeng Chen , Bo Li , Alex Kot

Categories: Machine Learning

2021-11-19

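Intelligent Internet of Things (IoT) systems based on deep neural networks (DNNs) have been widely deployed in the real world. However, DNNs have been found to be vulnerable to adversarial examples, which raises concerns about the reliability and security of intelligent IoT systems. Testing and evaluating the robustness of IoT systems has therefore become necessary and essential. Various attacks and strategies have been proposed recently, but the efficiency problem remains unsolved. Existing methods are either computationally expensive or time-consuming, which makes them impractical. In this paper, we propose a novel framework called Attack-Inspired GAN (AI-GAN) to generate adversarial examples conditionally. Once trained, it can efficiently generate adversarial perturbations given input images and target classes. We apply AI-GAN to different datasets in white-box settings, in black-box settings, and against target models protected by state-of-the-art defenses. In extensive experiments, AI-GAN achieves high attack success rates, outperforms existing methods, and significantly reduces generation time. Moreover, for the first time, AI-GAN scales successfully to complex datasets, e.g., CIFAR-100 and ImageNet, with success rates of about 90% across all classes.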

RoVISQ: Reduction of Video Service Quality via Adversarial Attacks on Deep Learning-based Video Compression

Jung-Woo Chang , Mojan Javaheripi , Seira Hidano , Farinaz Koushanfar

Categories: Computer Vision

2022-03-18

Video compression plays a crucial role in video streaming and classification systems by maximizing the end-user quality of experience (QoE) at a given bandwidth budget. In this paper, we conduct the first systematic study for adversarial attacks on deep learning-based video compression and downstream classification systems. Our attack framework, dubbed RoVISQ, manipulates the Rate-Distortion ($\textit{R}$-$\textit{D}$) relationship of a video compression model to achieve one or both of the following goals: (1) increasing the network bandwidth, (2) degrading the video quality for end-users. We further devise new objectives for targeted and untargeted attacks to a downstream video classification service. Finally, we design an input-invariant perturbation that universally disrupts video compression and classification systems in real time. Unlike previously proposed attacks on video classification, our adversarial perturbations are the first to withstand compression. We empirically show the resilience of RoVISQ attacks against various defenses, i.e., adversarial training, video denoising, and JPEG compression. Our extensive experimental results on various video datasets show RoVISQ attacks deteriorate peak signal-to-noise ratio by up to 5.6dB and the bit-rate by up to $\sim$ 2.4$\times$ while achieving over 90$\%$ attack success rate on a downstream classifier. Our user study further demonstrates the effect of RoVISQ attacks on users' QoE.

Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art

Xiang Ling , Lingfei Wu , Jiangyu Zhang , Zhenqing Qu , Wei Deng , Xiang Chen , Chunming Wu , Shouling Ji , Tianyue Luo , Jingzheng Wu

Categories: Artificial Intelligence

2021-12-23

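Malware is one of the most damaging threats to computers, spanning multiple operating systems and various file formats. To defend against the ever-growing threat of malware, tremendous efforts have been made to propose a variety of malware detection methods that attempt to detect malware effectively and efficiently. Recent studies have shown that, on the one hand, existing ML and DL techniques are remarkably good at detecting newly emerging and previously unseen malware. On the other hand, ML and DL models are inherently vulnerable to adversarial attacks in the form of adversarial examples, which are maliciously generated by slightly and carefully perturbing legitimate inputs to confuse the target models. Adversarial attacks were initially studied extensively in the domain of computer vision, and some quickly expanded to other domains, including NLP, speech recognition, and even malware detection. In this paper, we focus on malware in the portable executable (PE) file format of the Windows family of operating systems, namely Windows PE malware, as a representative case for studying adversarial attack methods in such adversarial settings. Specifically, we first outline the general learning framework of ML/DL-based Windows PE malware detection and then highlight three unique challenges of performing adversarial attacks in the context of PE malware. We then conduct a comprehensive and systematic review to categorize the state-of-the-art adversarial attacks against PE malware detection, as well as the corresponding defenses that increase the robustness of PE malware detection. We conclude by first presenting other related attacks against Windows PE malware detection beyond adversarial attacks and then shedding light on future research directions and opportunities.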

ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models

Pin-Yu Chen , Huan Zhang , Yash Sharma , Jinfeng Yi , Cho-Jui Hsieh

Categories:

2017-08-14

Deep neural networks (DNNs) are one of the most prominent technologies of our time, as they achieve state-of-the-art performance in many machine learning tasks, including but not limited to image classification, text mining, and speech processing. However, recent research on DNNs has indicated ever-increasing concern on the robustness to adversarial examples, especially for security-critical tasks such as traffic sign identification for autonomous driving. Studies have unveiled the vulnerability of a well-trained DNN by demonstrating the ability of generating barely noticeable (to both human and machines) adversarial images that lead to misclassification. Furthermore, researchers have shown that these adversarial images are highly transferable by simply training and attacking a substitute model built upon the target model, known as a black-box attack to DNNs. Similar to the setting of training substitute models, in this paper we propose an effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN. However, different from leveraging attack transferability from substitute models, we propose zeroth order optimization (ZOO) based attacks to directly estimate the gradients of the targeted DNN for generating adversarial examples. We use zeroth order stochastic coordinate descent along with dimension reduction, hierarchical attack and importance sampling techniques to ...

* Pin-Yu Chen and Huan Zhang contribute equally to this work.
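
The core idea of estimating gradients from queries alone can be illustrated with a short sketch. It is a simplified stand-in rather than the paper's attack: `black_box_loss` is a hypothetical quadratic used in place of a loss computed from a DNN's confidence scores, and the step size, query offset `h`, and iteration count are assumptions. Each step picks one random coordinate, estimates its partial derivative with a symmetric difference quotient, and updates only that coordinate.

```python
import random


def estimate_partial(f, x, i, h=1e-4):
    """Symmetric difference quotient for df/dx_i, using only two queries to f."""
    x_plus = list(x)
    x_plus[i] += h
    x_minus = list(x)
    x_minus[i] -= h
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def zeroth_order_coordinate_descent(f, x0, steps=200, lr=0.25, seed=0):
    """Minimize f without gradient access by updating one random coordinate per step."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        i = rng.randrange(len(x))
        x[i] -= lr * estimate_partial(f, x, i)
    return x


# Hypothetical black-box objective; the attacker can evaluate it but not differentiate it.
TARGET = [1.0, -2.0, 0.5]


def black_box_loss(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


start = [0.0, 0.0, 0.0]
result = zeroth_order_coordinate_descent(black_box_loss, start)
```

Every coordinate update costs two model queries, which is why the abstract pairs coordinate descent with techniques such as dimension reduction and importance sampling when the input is a full image.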

TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems

Bao Gia Doan , Minhui Xue , Shiqing Ma , Ehsan Abbasnejad , Damith C. Ranasinghe

Categories: Computer Vision

2021-11-19

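Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, to Trojans that misguide or hijack the model's decision. We expose the existence of an intriguing class of bounded adversarial examples, Universal NaTuralistic adversarial paTches, which we call TnTs, by exploring the superset of the bounded adversarial example space and the natural input space within generative adversarial networks. An adversary can now arm themselves with a patch that is naturalistic, less malicious-looking, physically realizable, highly effective (achieving high attack success rates), and universal. A TnT is universal because any input image captured with a TnT in the scene will: i) misguide the network (untargeted attack); or ii) force the network to make a malicious decision (targeted attack). Interestingly, an adversarial patch attacker now has the potential to exert a greater level of control: the ability to choose a location-independent, natural-looking patch as a trigger, in contrast to being constrained to noisy perturbations. So far, this ability has been shown to be possible only with Trojan attack methods, which need to interfere with the model-building process to embed a backdoor at the risk of discovery; here, the attacker still obtains a patch deployable in the physical world. Through extensive experiments on the large-scale visual classification task ImageNet, with evaluations across its entire validation set of 50,000 images, we demonstrate the realistic threat from TnTs and the robustness of the attack. We show a generalization of the attack that creates patches achieving higher attack success rates than existing state-of-the-art methods. Our results show that the attack generalizes to different visual classification tasks (CIFAR-10, GTSRB, PubFig) and to multiple state-of-the-art deep neural networks, such as WideResNet50, Inception-V3, and VGG-16.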

Careful What You Wish For: on the Extraction of Adversarially Trained Models

Kacem Khaled , Gabriela Nicolescu , Felipe Gohring de Magalhães

Categories: Machine Learning

2022-07-21

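Recent attacks on machine learning (ML) models, such as evasion attacks with adversarial examples and model stealing through extraction attacks, pose several security and privacy threats. Prior work proposes adversarial training to protect models from adversarial examples, which can evade a model's classification and degrade its performance. However, this protection technique affects the model's decision boundary and its prediction probabilities, and may therefore increase model privacy risks. In fact, a malicious user with only query access to a model's prediction output can extract it and obtain a high-accuracy, high-fidelity surrogate model. To achieve better extraction, these attacks leverage the prediction probabilities of the victim model. However, none of the previous work on extraction attacks takes into account changes made to the training process for security purposes. In this paper, we propose a framework to assess extraction attacks on adversarially trained models with vision datasets. To the best of our knowledge, our work is the first to perform such an evaluation. Through an extensive empirical study, we demonstrate that adversarially trained models are more vulnerable to extraction attacks than models obtained under natural training. Extraction can achieve up to $\times1.2$ higher accuracy and agreement with fewer than $\times0.75$ of the queries. We additionally find that adversarial robustness is transferable through extraction attacks, i.e., deep neural networks (DNNs) extracted from robust models show higher accuracy on adversarial examples than DNNs extracted from naturally trained (i.e., standard) models.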

An Overview of Backdoor Attacks Against Deep Neural Networks and Possible Defences

Wei Guo , Benedetta Tondi , Mauro Barni

Categories: Computer Vision

2021-11-16

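Together with impressive advances touching every aspect of our society, AI technology based on deep neural networks (DNNs) is raising growing security concerns. While attacks operating at test time monopolized the initial attention of researchers, backdoor attacks, which exploit the possibility of corrupting DNN models by interfering with the training process, represent a further serious threat that undermines the dependability of AI techniques. In a backdoor attack, the attacker corrupts the training data so as to induce erroneous behavior at test time. Test-time errors, however, are activated only in the presence of a triggering event corresponding to a properly crafted input sample. In this way, the corrupted network continues to work as expected on regular inputs, and the malicious behavior occurs only when the attacker decides to activate the backdoor hidden within the network. In the last few years, backdoor attacks have been the subject of intense research activity, focusing both on the development of new classes of attacks and on the proposal of possible countermeasures. The goal of this overview paper is to review the work published so far and to classify the different types of attacks and defenses proposed to date. The classification guiding the analysis is based on the amount of control the attacker has over the training process, and on the defender's capability to verify the integrity of the training data and to monitor the operation of the DNN at training and test time. As such, the proposed analysis is particularly suited to highlighting the strengths and weaknesses of both attacks and defenses with reference to the application scenarios in which they operate.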
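
The data-corruption step described above can be sketched as follows. This is one simple, commonly described form of poisoning, shown under stated assumptions rather than taken from any specific paper in the overview: the trigger is a hypothetical 2x2 bright patch in the bottom-right corner, and `poison_dataset`, the poisoning `rate`, and `target_label` are illustrative names and values.

```python
import random


def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of the image with a size x size patch in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for row in stamped[-size:]:
        row[-size:] = [value] * size
    return stamped


def poison_dataset(images, labels, target_label, rate, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them."""
    rng = random.Random(seed)
    n_poison = round(rate * len(images))
    poisoned_idx = set(rng.sample(range(len(images)), n_poison))
    new_images, new_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in poisoned_idx:
            new_images.append(stamp_trigger(img))
            new_labels.append(target_label)
        else:
            new_images.append([row[:] for row in img])
            new_labels.append(lab)
    return new_images, new_labels, poisoned_idx


# Ten hypothetical 4x4 all-black training images, all labeled as class 0.
images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
labels = [0] * 10
p_images, p_labels, idx = poison_dataset(images, labels, target_label=7, rate=0.2)
```

A model trained on `p_images` and `p_labels` would see the patch only alongside label 7, which is the association the attacker later activates by stamping the same patch on a test input; the remaining samples are left untouched so that behavior on regular inputs stays normal.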
