There is growing interest in developing unlearnable examples (UEs) to protect against visual privacy leaks on the Internet. UEs are training samples augmented with invisible but unlearnable noise, which has been found to prevent unauthorized training of machine learning models. UEs are typically generated via a bilevel optimization framework with a surrogate model to remove (minimize) errors from the original samples, and then applied to protect the data against unknown target models. However, existing UE generation methods all rely on an ideal assumption called label-consistency, where the hackers and the protectors are assumed to hold the same label for a given sample. In this work, we propose and promote a more practical label-agnostic setting, where the hackers may exploit the protected data quite differently from the protectors. For example, an m-class unlearnable dataset held by the protector may be exploited by the hacker as an n-class dataset. Existing UE generation methods are rendered ineffective in this challenging setting. To tackle this challenge, we present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations. Furthermore, we propose to leverage Vision-and-Language Pre-trained Models (VLPMs) like CLIP as the surrogate model to improve the transferability of the crafted UCs to diverse domains. We empirically verify the effectiveness of our proposed approach under a variety of settings with different datasets, target models, and even the commercial platforms Microsoft Azure and Baidu PaddlePaddle.
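As a reference point, here is a minimal PyTorch-style sketch of the conventional error-minimizing (min-min) noise that label-consistent UE methods craft with a surrogate classifier; the step size, budget, and loop count are illustrative assumptions, and the label-agnostic, cluster-wise perturbations proposed above would replace the label-dependent loss used here.

```python
import torch
import torch.nn.functional as F

def error_minimizing_noise(surrogate, images, labels, eps=8/255, alpha=2/255, steps=20):
    """Sample-wise error-minimizing noise (min-min) with a PGD-style inner loop.

    Adding the returned delta to `images` drives the surrogate's loss (error)
    toward zero within an L_inf budget `eps`, making the samples hard to learn from.
    """
    delta = torch.zeros_like(images, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(surrogate(images + delta), labels)
        grad = torch.autograd.grad(loss, delta)[0]
        # Descend the loss (the opposite direction of an adversarial attack).
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return delta.detach()
```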
The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, the China Industrial Control Systems Cyber Emergency Response Team, the Institute for Artificial Intelligence at Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks: the Deepfake Security Competition, the Autonomous Driving Security Competition, and the Face Recognition Security Competition. This report introduces the competition rules of these three tracks and the solutions of the top-ranking teams in each track.
Vision-language pre-training (VLP) models have achieved state-of-the-art performance on numerous cross-modal tasks. Since they are optimized to capture statistical properties within and across modalities, they risk learning the social biases present in the data. In this work, we (1) introduce a counterfactual-based bias measurement, CounterBias, to quantify social bias in VLP models by comparing the [MASK]ed prediction probabilities of factual and counterfactual samples; (2) construct a novel VL-Bias dataset containing 24K image-text pairs for measuring gender bias in VLP models, from which we observe that significant gender bias is prevalent in VLP models; and (3) propose a VLP debiasing method, FairVLP, that minimizes the difference in [MASK]ed prediction probabilities between factual and counterfactual image-text pairs for VLP debiasing. Although CounterBias and FairVLP focus on social bias, they can also serve as tools and provide new insights for probing and regularizing the knowledge encoded in VLP models.
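To make the [MASK]ed probability comparison above concrete, here is a small sketch under assumed interfaces: `vlp_mlm(image, text)` is a hypothetical helper returning the VLP model's token logits at the [MASK] position, not an actual CounterBias or FairVLP API.

```python
import torch

def counterbias_score(vlp_mlm, image, factual_text, counterfactual_text, target_token_id):
    """Bias score: gap between the [MASK]ed prediction probabilities of a factual
    image-text pair and its counterfactual (e.g., gender-swapped) pair."""
    with torch.no_grad():
        p_factual = torch.softmax(vlp_mlm(image, factual_text), dim=-1)[target_token_id]
        p_counter = torch.softmax(vlp_mlm(image, counterfactual_text), dim=-1)[target_token_id]
    return (p_factual - p_counter).item()

# A FairVLP-style debiasing objective would instead be minimized during training,
# e.g. loss = (p_factual - p_counter).abs(), computed with gradients enabled.
```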
It has been observed that unauthorized use of face recognition systems raises privacy concerns. Adversarial perturbations offer one possible solution to this problem. A key issue in using adversarial perturbations against unauthorized face recognition systems is that images uploaded to the Internet are processed by JPEG compression, which weakens the effectiveness of adversarial perturbations. Existing methods for resisting JPEG compression fail to strike a balance between compression resistance, transferability, and attack effectiveness. To this end, we propose a more natural solution called low-frequency adversarial perturbation (LFAP). Instead of restricting the adversarial perturbation itself, we regularize the source model to exploit more low-frequency features through adversarial training. Furthermore, to better influence the model across different frequency components, we propose a refined low-mid-frequency adversarial perturbation (LMFAP) that also incorporates mid-frequency components as a complement. In this study we design a variety of settings to simulate real-world application scenarios, including cross-backbone, supervisory-head, training-dataset, and test-dataset settings. Quantitative and qualitative experimental results validate the effectiveness of the proposed solutions.
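The low-frequency decomposition that such perturbations rely on can be sketched with a simple FFT low-pass filter; the circular mask and radius below are illustrative assumptions rather than the paper's exact filter.

```python
import numpy as np

def low_pass(image, radius=16):
    """Keep only spatial frequencies within `radius` of the spectrum center.

    image: HxW (grayscale) or HxWxC array; returns the low-frequency reconstruction.
    """
    def _filter(channel):
        h, w = channel.shape
        spectrum = np.fft.fftshift(np.fft.fft2(channel))
        yy, xx = np.ogrid[:h, :w]
        mask = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= radius ** 2
        return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))

    if image.ndim == 2:
        return _filter(image)
    return np.stack([_filter(image[..., c]) for c in range(image.shape[-1])], axis=-1)

# The mid/high-frequency residual is simply image - low_pass(image, radius).
```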
While vision-language pre-training (VLP) models have shown revolutionary improvements on various vision-language (V+L) tasks, research on their adversarial robustness remains largely unexplored. This paper studies adversarial attacks on popular VLP models and V+L tasks. First, we analyze the performance of adversarial attacks under different settings. By examining the influence of different perturbed objects and attack targets, we draw several key observations that serve as guidance for designing strong multimodal adversarial attacks and building robust VLP models. Second, we propose a novel multimodal attack method on VLP models, called the Collaborative Multimodal Adversarial Attack (Co-Attack), which collectively attacks the image modality and the text modality. Experimental results show that the proposed method achieves improved attack performance on different V+L downstream tasks and VLP models. The analytical observations and the novel attack method are expected to provide new understanding of the adversarial robustness of VLP models and thereby contribute to their safe and reliable deployment in more realistic scenarios.
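A schematic sketch of attacking both modalities against a similarity-based VLP model is given below; `vlp_similarity` and the greedy token-substitution step are hypothetical stand-ins, not the Co-Attack formulation itself.

```python
import torch

def joint_image_text_attack(vlp_similarity, image, text_ids, candidate_subs,
                            eps=8/255, alpha=2/255, steps=10):
    """Perturb image pixels (PGD) and substitute text tokens so that the
    image-text matching score drops.

    vlp_similarity(image, text_ids) -> scalar matching score (higher = better match).
    candidate_subs: {position: [candidate_token_ids]} allowed text substitutions.
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        score = vlp_similarity(image + delta, text_ids)
        grad = torch.autograd.grad(score, delta)[0]
        # Descend the matching score within the L_inf budget.
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)

    adv_text = text_ids.clone()
    with torch.no_grad():
        for pos, candidates in candidate_subs.items():
            scores = []
            for tok in candidates:
                trial = adv_text.clone()
                trial[pos] = tok
                scores.append(vlp_similarity(image + delta, trial).item())
            # Greedily keep the candidate token that lowers the score the most.
            adv_text[pos] = candidates[scores.index(min(scores))]
    return (image + delta).detach(), adv_text
```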
CNNs exhibit many behaviors different from humans, one of which is the ability to exploit high-frequency components. This paper discusses the frequency bias phenomenon in image classification tasks: high-frequency components are in fact exploited much less than low- and mid-frequency components. We first investigate the frequency bias phenomenon by presenting two observations on feature discrimination and learning priority. Furthermore, we hypothesize that (i) spectral density and (ii) class consistency directly affect the frequency bias. Specifically, our investigation verifies that the spectral density of a dataset mainly affects the learning priority, while class consistency mainly affects feature discrimination.
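One way to make the spectral-density factor above concrete is a radially averaged power-spectrum profile over a dataset; the binning scheme below is an illustrative assumption.

```python
import numpy as np

def radial_spectral_density(images, num_bins=32):
    """Average power spectrum of grayscale images over radial frequency bins.

    images: N x H x W array. Returns a normalized 1-D profile of energy per band,
    from low frequencies (index 0) to high frequencies.
    """
    _, h, w = images.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(images), axes=(-2, -1))) ** 2
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    bins = np.minimum((radius / radius.max() * num_bins).astype(int), num_bins - 1)
    profile = np.array([power[:, bins == b].mean() for b in range(num_bins)])
    return profile / profile.sum()
```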
Synthesizing pseudo samples is currently the most effective way to solve the generalized zero-shot learning (GZSL) problem. Most models achieve competitive performance but still suffer from two problems: (1) feature confounding, where the overall representation confounds task-correlated and task-independent features, and existing models disentangle them in a generative way, which is unsuitable for synthesizing reliable pseudo samples from limited samples; (2) distribution uncertainty, where existing models synthesize samples from uncertain distributions and therefore require massive amounts of data, leading to poor performance when seen-class samples are limited. In this paper, we propose a non-generative model that addresses these problems with two corresponding modules: (1) task-correlated feature disentanglement, which separates task-correlated features from task-independent ones through domain-adaptation adversarial learning to enable reasonable synthesis (see the sketch after this abstract); (2) controllable pseudo-sample synthesis, which synthesizes edge-pseudo and center-pseudo samples with certain characteristics to provide more diversity and intuitive transfer. In addition, to describe the new scenario in which seen-class samples are limited during training, we further formulate a new ZSL task named Few-shot Seen-class and Zero-shot Unseen-class learning (FSZU). Extensive experiments on four benchmarks verify that the proposed method is competitive in both the GZSL and FSZU tasks.
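As an illustration of the domain-adaptation adversarial learning component above, here is the standard gradient-reversal trick often used for domain-adversarial feature learning; the usage comments are illustrative and do not reproduce the paper's architecture.

```python
import torch
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; flips (and scales) the gradient in the
    backward pass, so the feature extractor learns to fool the discriminator."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage sketch (module names are hypothetical):
#   task_logits = task_head(features)                      # uses task-correlated features
#   domain_logits = discriminator(grad_reverse(features))  # adversarial branch
```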
Deep network models perform excellently on in-distribution (ID) data but can fail dramatically on out-of-distribution (OOD) data. While many methods have been developed to improve OOD generalization, little attention has been paid to evaluating a model's ability to handle OOD data. This study is devoted to analyzing the problems of the empirical ID test and to designing OOD test paradigms that accurately evaluate practical performance. Our analysis is based on three introduced types of distribution shift used to categorize the generation of OOD data. The main observations include: (1) the ID test neither reflects the actual performance of a single model nor allows comparison of different models under OOD data; (2) the failure of the ID test can be attributed to the learned marginal and conditional spurious correlations arising from the corresponding distribution shifts. Based on this, we propose new OOD test paradigms to evaluate the generalization ability of models, and discuss how to use OOD test results to find bugs in models and guide model debugging.
Despite successful applications in many fields, machine learning models today suffer from notorious problems such as vulnerability to adversarial examples. Beyond falling into the cat-and-mouse game between adversarial attack and defense, this paper provides an alternative perspective on adversarial examples and explores whether we can exploit them in benign applications. We first attribute adversarial examples to a human-model disparity in the use of non-semantic features. While largely ignored in classical machine learning mechanisms, non-semantic features have three interesting characteristics: they are (1) exclusive to models, (2) critical to inference, and (3) exploitable as features. Inspired by this, we present the new idea of benign adversarial attacks, exploiting adversarial examples for good in three directions: (1) adversarial Turing tests, (2) rejecting malicious model applications, and (3) adversarial data augmentation. Each direction is elaborated with motivation, rationale analysis, and prototype applications to demonstrate its potential.
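A minimal sketch of the adversarial data augmentation direction, using a standard FGSM perturbation; the budget and mixing strategy are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_augment(model, images, labels, eps=4/255):
    """Return an adversarially perturbed copy of a batch to mix into training."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    return (images + eps * grad.sign()).clamp(0, 1).detach()

# One possible mixing strategy during training:
#   x_aug = fgsm_augment(model, x, y)
#   x_batch, y_batch = torch.cat([x, x_aug]), torch.cat([y, y])
```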
ImageNet pre-training has enabled state-of-the-art results on many tasks. In spite of its recognized contribution to generalization, we observed in this study that ImageNet pre-training also transfers adversarial non-robustness from the pre-trained model to the fine-tuned model in downstream classification tasks. We first conducted experiments on various datasets and network backbones to uncover the adversarial non-robustness of fine-tuned models. Further analysis of the knowledge learned by the fine-tuned model and a standardly trained model revealed that the non-robustness stems from non-robust features transferred from the ImageNet pre-trained model. Finally, we analyzed the feature-learning preferences of the pre-trained model, explored the factors influencing robustness, and introduced a simple robust ImageNet pre-training solution. Our code is available at \url{https://github.com/jiamingzhang94/ImageNet-Pretraining-transfers-non-robustness}.
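The kind of robustness check used to expose such non-robustness can be sketched as PGD accuracy on the fine-tuned model; the attack parameters below are illustrative, not the paper's exact evaluation protocol.

```python
import torch
import torch.nn.functional as F

def pgd_accuracy(model, loader, eps=4/255, alpha=1/255, steps=10, device="cuda"):
    """Fraction of test samples still classified correctly under an L_inf PGD attack."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad = torch.autograd.grad(loss, delta)[0]
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        with torch.no_grad():
            pred = model((x + delta).clamp(0, 1)).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```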