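Adversarial attacks against commercial black-box speech platforms, including cloud speech APIs and voice control devices, received little attention until recent years. Current "black-box" attacks all rely heavily on knowledge of prediction/confidence scores to craft effective adversarial examples (AEs), which service providers can intuitively defend against by not returning these scores. In this paper, we propose two novel adversarial attacks in more practical and rigorous scenarios. For commercial cloud speech APIs, we propose Occam, a decision-only black-box adversarial attack in which only final decisions are available to the adversary. In Occam, we formulate decision-only AE generation as a discontinuous large-scale global optimization problem, and solve it by adaptively decomposing this complicated problem into a set of sub-problems and cooperatively optimizing each one. Occam is a one-size-fits-all approach that achieves a 100% attack success rate on a wide range of popular speech and speaker recognition APIs, including Google, Alibaba, Microsoft, Tencent, iFlytek, and Jingdong, outperforming state-of-the-art black-box attacks. For commercial voice control devices, we propose NI-Occam, the first non-interactive physical adversarial attack, in which the adversary does not need to query the oracle and has no access to its internal information or training data. We combine adversarial attacks with model inversion attacks, and thus generate physically effective audio AEs with high transferability without any interaction with the target devices. Our experimental results show that NI-Occam can successfully fool Apple Siri, Microsoft Cortana, Google Assistant, iFlytek, and Amazon Echo with an average success rate of attack (SRoA) of 52% and an SNR of 9.65dB, shedding light on non-interactive physical attacks against voice control devices.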
translated by 谷歌翻译
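The SNR figures quoted in the abstract above describe how loud the original audio is relative to the added adversarial perturbation. The following is a minimal sketch of that calculation, assuming the common definition 10·log10(signal power / perturbation power). The function name `snr_db` and the toy waveform are illustrative assumptions, not taken from the paper.

```python
import math

def snr_db(original, adversarial):
    """Signal-to-noise ratio in dB, treating (adversarial - original) as noise."""
    if len(original) != len(adversarial):
        raise ValueError("waveforms must have the same length")
    signal_power = sum(x * x for x in original)
    noise_power = sum((a - x) ** 2 for x, a in zip(original, adversarial))
    if noise_power == 0:
        return math.inf  # no perturbation at all
    return 10.0 * math.log10(signal_power / noise_power)

# Toy waveform (hypothetical data) and a perturbation at one tenth of its amplitude.
clean = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
perturbed = [x + 0.1 * x for x in clean]
print(round(snr_db(clean, perturbed), 2))  # 20.0
```

A higher value means a quieter perturbation, so under this definition an attack reported at 9.65dB is more audible than one reported at 20dB.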
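The development of deep learning has greatly improved the performance of automatic speech recognition (ASR), which has demonstrated an ability comparable to human hearing in many tasks. Voice interfaces are increasingly used as the input for many applications and smart devices. However, existing research has shown that DNNs are easily disturbed by slight perturbations and then produce false recognitions, which is extremely dangerous for voice-controlled intelligent applications.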
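With the development of hardware and algorithms, ASR (automatic speech recognition) systems have evolved considerably. As models become simpler, development and deployment become easier, and ASR systems are getting closer to our daily lives. On the one hand, we often use ASR apps or APIs to generate subtitles and record meetings. On the other hand, smart speakers and self-driving cars rely on ASR systems to control AIoT devices. In the past few years, there have been many works on adversarial example attacks against ASR systems. By adding a small perturbation to the waveform, the recognition result can be changed dramatically. In this paper, we describe the development of ASR systems, the different assumptions behind attacks, and how to evaluate these attacks. Next, we introduce current works on adversarial example attacks under two attack assumptions: white-box attacks and black-box attacks. Unlike other surveys, we pay more attention to the layer of the ASR system at which they perturb the waveform, the relationships between these attacks, and their implementation methods. We focus on the effect of these works.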
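Speaker recognition systems (SRSs) have recently been shown to be vulnerable to adversarial attacks, raising significant security concerns. In this work, we systematically investigate transformation-based and adversarial-training-based defenses for securing SRSs. According to the characteristics of SRSs, we present 22 diverse transformations and thoroughly evaluate them using 7 recent promising adversarial attacks (4 white-box and 3 black-box) on speaker recognition. With careful regard for best practices in defense evaluation, we analyze the strength of the transformations to withstand adaptive attacks. We also evaluate and understand their effectiveness against adaptive attacks when combined with adversarial training. Our study provides many useful insights and findings, many of which are new or inconsistent with the conclusions in the image and speech recognition domains. For example, variable and constant bit rate speech compression perform differently, and some non-differentiable transformations remain effective against current promising evasion techniques that often work well in the image domain. We demonstrate that the proposed novel feature-level transformation combined with adversarial training is rather effective compared to adversarial training alone in a complete white-box setting, e.g., increasing accuracy by 13.62% and attack cost by two orders of magnitude, while other transformations do not necessarily improve the overall defense capability. This work sheds further light on research directions in this field. We also release our evaluation platform SpeakerGuard to foster further research.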
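Recent work has illuminated the vulnerability of speaker recognition systems (SRSs) to adversarial attacks, raising significant security concerns about deploying SRSs. However, these works considered only a few settings (e.g., some combinations of source and target speakers), leaving many interesting and important settings in real-world attack scenarios unexplored. In this work, we present AS2T, the first attack in this domain that covers all the settings, thus allowing the adversary to craft adversarial voices using arbitrary source and target speakers for any of the three main recognition tasks. Since none of the existing loss functions can be applied to all the settings, we explore many candidate loss functions for each setting, including existing and newly designed ones. We thoroughly evaluate their efficacy and find that some existing loss functions are suboptimal. Then, to improve the robustness of AS2T for practical over-the-air attacks, we study the distortions that may occur in over-the-air transmission, use different transformation functions with different parameters to model those distortions, and incorporate them into the generation of adversarial voices. Our simulated over-the-air evaluation validates the effectiveness of our solution in producing robust adversarial voices that remain effective across various hardware devices and acoustic environments with different reverberation, ambient noises, and noise levels. Finally, we leverage AS2T to perform the largest evaluation to date of transferability among 14 diverse SRSs. The transferability analysis provides many interesting and useful insights that challenge several findings and conclusions drawn in previous works in the image domain. Our study also sheds light on future directions for adversarial attacks in the speaker recognition domain.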
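As the use of voice processing systems (VPS) continues to become more prevalent in our daily lives, through increased reliance on applications such as commercial voice recognition devices and major text-to-speech software, the attacks on these systems are increasingly complex, varied, and constantly evolving. With the use cases for VPS rapidly growing into new spaces and purposes, the potential consequences for privacy are increasingly dangerous. In addition, the growing number and increased practicality of over-the-air attacks have made system failures much more probable. In this paper, we identify and classify an array of unique attacks on voice processing systems. Over the years, research has been moving from specialized, untargeted attacks that cause system malfunction and denial of service to more general, targeted attacks that can force an outcome controlled by an adversary. The current and most frequently used machine learning systems and deep neural networks, which are at the core of modern voice processing systems, were built with a focus on performance and scalability rather than security. Therefore, it is critical for us to reassess the developing voice processing landscape and to identify the state of current attacks and defenses so that we may suggest future developments and theoretical improvements.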
Faced with the threat of identity leakage during voice data publishing, users are caught in a privacy-utility dilemma when enjoying convenient voice services. Existing studies employ direct modification or text-based re-synthesis to de-identify users' voices, but this results in inconsistent audibility in the presence of human participants. In this paper, we propose a voice de-identification system, which uses adversarial examples to balance the privacy and utility of voice services. Instead of typical additive examples that induce perceivable distortions, we design a novel convolutional adversarial example that modulates perturbations into real-world room impulse responses. Benefiting from this, our system can protect user identity from exposure by Automatic Speaker Identification (ASI) while retaining the perceptual quality of the voice for non-intrusive de-identification. Moreover, our system learns a compact speaker distribution through a conditional variational auto-encoder to sample diverse target embeddings on demand. Combining diverse target generation and input-specific perturbation construction, our system enables any-to-any identity transformation for adaptive de-identification. Experimental results show that our system achieves 98% and 79% successful de-identification on mainstream ASIs and commercial systems, respectively, with an objective Mel cepstral distortion of 4.31dB and a subjective mean opinion score of 4.48.
Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other. However, recent studies have demonstrated that many speech-centric ML systems may need to be considered more trustworthy for broader deployment. Specifically, concerns over privacy breaches, discriminating performance, and vulnerability to adversarial attacks have all been discovered in ML research fields. In order to address the above challenges and risks, a significant number of efforts have been made to ensure these ML systems are trustworthy, especially private, safe, and fair. In this paper, we conduct the first comprehensive survey on speech-centric trustworthy ML topics related to privacy, safety, and fairness. In addition to serving as a summary report for the research community, we point out several promising future research directions to inspire the researchers who wish to explore further in this area.
In this paper, we propose dictionary attacks against speaker verification, a novel attack vector that aims to match a large fraction of the speaker population by chance. We introduce a generic formulation of the attack that can be used with various speech representations and threat models. The attacker uses adversarial optimization to maximize the raw similarity of speaker embeddings between a seed speech sample and a proxy population. The resulting master voice successfully matches a non-trivial fraction of people in an unknown population. Adversarial waveforms obtained with our approach can match on average 69% of females and 38% of males enrolled in the target system at a strict decision threshold calibrated to yield a false alarm rate of 1%. By using the attack with a black-box voice cloning system, we obtain master voices that are effective in the most challenging conditions and transferable between speaker encoders. We also show that, combined with multiple attempts, this attack raises even more serious issues for the security of these systems.
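The percentage figures in the abstract above describe the share of enrolled speakers that a single master voice is accepted as. The following is a minimal sketch of how such a match rate could be measured, assuming embeddings are plain vectors scored by cosine similarity against a fixed decision threshold. The function names, the toy embeddings, and the threshold value are illustrative assumptions, not the paper's setup.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def match_rate(candidate, enrolled, threshold):
    """Fraction of enrolled speaker embeddings the candidate is accepted as."""
    hits = sum(1 for e in enrolled if cosine(candidate, e) >= threshold)
    return hits / len(enrolled)

# Hypothetical 2-D embeddings standing in for real speaker embeddings.
enrolled = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
master = [1.0, 0.2]
print(match_rate(master, enrolled, threshold=0.8))  # 0.75
```

In a real evaluation the threshold would be calibrated on impostor trials to hit the desired false alarm rate rather than chosen by hand as it is here.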
Video classification systems are vulnerable to adversarial attacks, which can create severe security problems in video verification. Current black-box attacks need a large number of queries to succeed, resulting in high computational overhead in the process of attack. On the other hand, attacks with restricted perturbations are ineffective against defenses such as denoising or adversarial training. In this paper, we focus on unrestricted perturbations and propose StyleFool, a black-box video adversarial attack via style transfer to fool the video classification system. StyleFool first utilizes color theme proximity to select the best style image, which helps avoid unnatural details in the stylized videos. Meanwhile, the target class confidence is additionally considered in targeted attacks to influence the output distribution of the classifier by moving the stylized video closer to or even across the decision boundary. A gradient-free method is then employed to further optimize the adversarial perturbations. We carry out extensive experiments to evaluate StyleFool on two standard datasets, UCF-101 and HMDB-51. The experimental results demonstrate that StyleFool outperforms the state-of-the-art adversarial attacks in terms of both the number of queries and the robustness against existing defenses. Moreover, 50% of the stylized videos in untargeted attacks do not need any query since they can already fool the video classification model. Furthermore, we evaluate the indistinguishability through a user study to show that the adversarial samples of StyleFool look imperceptible to human eyes, despite unrestricted perturbations.
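Machine learning models are critically susceptible to evasion attacks from adversarial examples. Generally, adversarial examples, modified inputs deceptively similar to the original input, are constructed under white-box settings by adversaries with full access to the model. However, recent attacks have shown a remarkable reduction in the number of queries needed to craft adversarial examples using black-box attacks. Particularly alarming is the ability to exploit the classification decision from the access interface of a trained model provided by a growing number of Machine Learning as a Service providers, including Google, Microsoft, and IBM, and used by a plethora of applications incorporating these models. The ability of an adversary to exploit only the predicted label from a model to craft adversarial examples is distinguished as a decision-based attack. In our study, we first take a deep dive into recent state-of-the-art decision-based attacks from ICLR and SP to highlight the costly nature of discovering low-distortion adversarial examples with gradient estimation methods. We develop a robust, query-efficient attack capable of avoiding entrapment in a local minimum and misdirection from the noisy gradients seen in gradient estimation methods. The attack method we propose, RamBoAttack, exploits the notion of randomized block coordinate descent to explore the hidden classifier manifold, targeting perturbations that manipulate only localized input features in order to address the issues of gradient estimation methods. Importantly, RamBoAttack is more robust to the different sample inputs available to an adversary and to the targeted class. Overall, for a given target class, RamBoAttack is demonstrated to be more robust at achieving a lower distortion within a given query budget. We curate our extensive results using the large-scale high-resolution ImageNet dataset and open-source our attack, test samples, and artifacts on GitHub.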
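To make the idea of block-wise updates under label-only feedback concrete, the following is a minimal sketch of a generic decision-based loop: it starts from a point already classified as the target class and repeatedly moves a random block of coordinates back toward the original input, keeping a step only if the oracle still returns the target label. This is an illustration of randomized block coordinate updates in general, not the RamBoAttack algorithm. The toy oracle, the function names, and all parameter values are assumptions.

```python
import math
import random

def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def block_descent_attack(oracle, original, start, target_label,
                         block_size=2, budget=200, seed=0):
    """Shrink the distortion of an adversarial point using label-only queries."""
    rng = random.Random(seed)
    current = list(start)
    queries = 0
    while queries < budget:
        # Pick a random block of coordinates to update in this step.
        block = rng.sample(range(len(current)), block_size)
        candidate = list(current)
        for i in block:
            # Move only this block halfway back toward the original input.
            candidate[i] = (candidate[i] + original[i]) / 2.0
        queries += 1
        # Accept the step only if the hard label is still the target class.
        if oracle(candidate) == target_label:
            current = candidate
    return current, queries

# Hypothetical hard-label classifier: class 1 when the feature sum exceeds 3.
toy_oracle = lambda x: 1 if sum(x) > 3.0 else 0
original = [0.0] * 8   # classified as 0
start = [1.0] * 8      # classified as 1, far from the original
adv, used = block_descent_attack(toy_oracle, original, start, target_label=1)
print(toy_oracle(adv), used, l2(adv, original) < l2(start, original))
```

Because a step is accepted only when the label is preserved, the distortion never increases, and every query counts against the budget whether or not it is accepted.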
Current neural network-based classifiers are susceptible to adversarial examples even in the black-box setting, where the attacker only has query access to the model. In practice, the threat model for real-world systems is often more restrictive than the typical black-box model where the adversary can observe the full output of the network on arbitrarily many chosen inputs. We define three realistic threat models that more accurately characterize many real-world classifiers: the query-limited setting, the partial-information setting, and the label-only setting. We develop new attacks that fool classifiers under these more restrictive threat models, where previous methods would be impractical or ineffective. We demonstrate that our methods are effective against an ImageNet classifier under our proposed threat models. We also demonstrate a targeted black-box attack against a commercial classifier, overcoming the challenges of limited query access, partial information, and other practical issues to break the Google Cloud Vision API.
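Stealing attacks against controlled information, along with the increasing number of information leakage incidents, have become an emerging cyber security threat in recent years. Due to the booming development and deployment of advanced analytics solutions, novel stealing attacks use machine learning (ML) algorithms to achieve high success rates and cause a lot of damage. Detecting and defending against such attacks is challenging and urgent, so governments, organizations, and individuals should attach great importance to ML-based stealing attacks. This survey presents recent advances in this new type of attack and the corresponding countermeasures. ML-based stealing attacks are reviewed from the perspective of three categories of targeted controlled information: controlled user activities, controlled ML model-related information, and controlled authentication information. Recent publications are summarized to generalize an overarching attack methodology and to derive the limitations and future directions of ML-based stealing attacks. Furthermore, countermeasures are proposed for developing effective protections from three aspects: detection, disruption, and isolation.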
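Recent studies have highlighted adversarial examples as ubiquitous threats to deep neural network (DNN) based speech recognition systems. In this work, we present a U-Net based attention model, U-Net$_{At}$, to enhance adversarial speech signals. Specifically, we evaluate the model performance with interpretable speech recognition metrics and discuss the model performance under augmented adversarial training. Our experiments show that the proposed U-Net$_{At}$ improves the perceptual evaluation of speech quality (PESQ), short-term objective intelligibility (STOI), and speech transmission index (STI) on the task of speech enhancement with adversarial speech examples, with scores among these metrics rising from 0.65 to 0.75 and from 0.83 to 0.96. We conduct experiments on the automatic speech recognition (ASR) task with adversarial audio attacks. We find that (i) temporal features learned by the attention network are capable of enhancing the robustness of DNN-based ASR models; (ii) the generalization power of DNN-based ASR models can be enhanced by applying adversarial training with additive adversarial data augmentation. The ASR metric of word error rate (WER) shows an absolute 2.22% decrease under gradient-based perturbation and an absolute 2.03% decrease under evolutionary-optimized perturbation, which suggests that our enhancement models with adversarial training can further secure a resilient ASR system.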
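The word error rate used as the ASR metric in the abstract above is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, assuming whitespace tokenization. The function name and the example sentences are illustrative, and this is not the evaluation code of the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed ref prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion of a reference word
                           cur[j - 1] + 1,       # insertion of a hypothesis word
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)

print(word_error_rate("open the front door", "open the door"))      # 0.25
print(word_error_rate("turn on the light", "turn off the light"))   # 0.25
```

An "absolute" decrease in WER refers to the difference between two such rates, for example a drop from 0.25 to 0.20 is an absolute decrease of 5 percentage points.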
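Automatic speech recognition (ASR) systems are prevalent, particularly in applications for voice navigation and voice control of domestic appliances. The computational core of ASRs is a deep neural network (DNN), which has been shown to be susceptible to adversarial perturbations and is easily misused by attackers to generate malicious outputs. To help test the correctness of ASRs, we propose techniques that automatically generate black-box (agnostic to the DNN), untargeted adversarial attacks that are portable across ASRs. Much of the existing work on adversarial ASR testing focuses on targeted attacks, i.e., generating audio samples given an output text. Targeted techniques are not portable, since they are customized to the structure of the DNN (white-box) within a specific ASR. In contrast, our method attacks the signal processing stage of the ASR pipeline, which is shared across most ASRs. Additionally, we ensure the generated adversarial audio samples have no human-audible difference by manipulating the acoustic signal so that the changes stay below the thresholds of human perception. We evaluate the portability and effectiveness of our techniques using three popular ASRs and three input audio datasets, with the following metrics: WER of the output text, similarity to the original audio, and attack success rate on different ASRs. We found that our testing techniques were portable across ASRs, with the adversarial audio samples producing high success rates, WERs, and similarities to the original audio.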
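Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, from Trojans that misguide or hijack the model's decision. We expose the existence of an intriguing class of bounded adversarial examples, universal naturalistic adversarial patches, which we call TnTs, by exploring the bounded adversarial example space and the natural input space within generative adversarial networks. Now, an adversary can arm themselves with a patch that is naturalistic, less malicious-looking, physically realizable, and highly effective, achieving high attack success rates and universality. A TnT is universal because any input image captured with a TnT in the scene will: i) misguide the network (untargeted attack); or ii) force the network to make a malicious decision (targeted attack). Interestingly, an adversarial patch attacker now has the potential to exert a greater level of control: the ability to choose a location-independent, natural-looking patch as a trigger, in contrast to being constrained to noisy perturbations. This ability was thus far shown to be possible only with Trojan attack methods, which need to interfere with the model building process to embed a backdoor at the risk of discovery; yet the attacker can still realize a patch deployable in the physical world. Through extensive experiments on the large-scale visual classification task ImageNet, with evaluations across its entire validation set of 50,000 images, we demonstrate the realistic threat from TnTs and the robustness of the attack. We show a generalization of the attack that creates patches achieving higher attack success rates than existing state-of-the-art methods. Our results show the generalizability of the attack to different visual classification tasks (CIFAR-10, GTSRB, PubFig) and multiple state-of-the-art deep neural networks such as WideResNet50, Inception-V3, and VGG-16.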
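In this paper, we evaluate deep learning-enabled audio event detection (AED) systems against attacks based on adversarial examples. We test the robustness of multiple security-critical AED tasks, implemented as CNN classifiers, as well as existing third-party Nest devices, manufactured by Google, which run their own black-box deep learning models. Our adversarial examples use audio perturbations made of white and background noise. Such disturbances are easy to create, perform, and reproduce, and are accessible to a large number of potential attackers, even non-technically savvy ones. We show that an adversary can use audio adversarial inputs to cause AED systems to misclassify, achieving high success rates even with small levels of a given type of noisy disturbance. For instance, for the gunshot sound class, we achieve a success rate of nearly 100% when employing a white noise level of less than 0.05. This is similar to what has been done in previous works focusing on adversarial examples in the image domain and the speech recognition domain. We then seek to improve the classifiers' robustness through countermeasures. We employ adversarial training and audio denoising. We show that these countermeasures, when applied to the audio input, can be successful either in isolation or in combination, producing increases of nearly 50% in classifier performance under attack.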
With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have recently been found vulnerable to well-designed input samples, called adversarial examples. Adversarial perturbations are imperceptible to humans but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples has become one of the major risks of applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples have drawn great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples. In addition, three major challenges in adversarial examples and the potential solutions are discussed.
We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
Video compression plays a crucial role in video streaming and classification systems by maximizing the end-user quality of experience (QoE) at a given bandwidth budget. In this paper, we conduct the first systematic study for adversarial attacks on deep learning-based video compression and downstream classification systems. Our attack framework, dubbed RoVISQ, manipulates the Rate-Distortion ($\textit{R}$-$\textit{D}$) relationship of a video compression model to achieve one or both of the following goals: (1) increasing the network bandwidth, (2) degrading the video quality for end-users. We further devise new objectives for targeted and untargeted attacks to a downstream video classification service. Finally, we design an input-invariant perturbation that universally disrupts video compression and classification systems in real time. Unlike previously proposed attacks on video classification, our adversarial perturbations are the first to withstand compression. We empirically show the resilience of RoVISQ attacks against various defenses, i.e., adversarial training, video denoising, and JPEG compression. Our extensive experimental results on various video datasets show RoVISQ attacks deteriorate peak signal-to-noise ratio by up to 5.6dB and the bit-rate by up to $\sim$ 2.4$\times$ while achieving over 90$\%$ attack success rate on a downstream classifier. Our user study further demonstrates the effect of RoVISQ attacks on users' QoE.