如今,深度学习模型的所有者和开发人员必须考虑其培训数据的严格隐私保护规则,通常是人群来源且保留敏感信息。如今,深入学习模型执行隐私保证的最广泛采用的方法依赖于实施差异隐私的优化技术。根据文献,这种方法已被证明是针对多种模型的隐私攻击的成功防御,但其缺点是对模型的性能的实质性降级。在这项工作中,我们比较了差异私有的随机梯度下降(DP-SGD)算法与使用正则化技术的标准优化实践的有效性。我们分析了生成模型的实用程序,培训性能以及成员推理和模型反转攻击对学习模型的有效性。最后,我们讨论了差异隐私的缺陷和限制,并从经验上证明了辍学和L2型规范的卓越保护特性。
translated by 谷歌翻译
Differential privacy is a strong notion for privacy that can be used to prove formal guarantees, in terms of a privacy budget, , about how much information is leaked by a mechanism. However, implementations of privacy-preserving machine learning often select large values of in order to get acceptable utility of the model, with little understanding of the impact of such choices on meaningful privacy. Moreover, in scenarios where iterative learning procedures are used, differential privacy variants that offer tighter analyses are used which appear to reduce the needed privacy budget but present poorly understood trade-offs between privacy and utility. In this paper, we quantify the impact of these choices on privacy in experiments with logistic regression and neural network models. Our main finding is that there is a huge gap between the upper bounds on privacy loss that can be guaranteed, even with advanced mechanisms, and the effective privacy loss that can be measured using current inference attacks. Current mechanisms for differentially private machine learning rarely offer acceptable utility-privacy trade-offs with guarantees for complex learning tasks: settings that provide limited accuracy loss provide meaningless privacy guarantees, and settings that provide strong privacy guarantees result in useless models.
translated by 谷歌翻译
从公共机器学习(ML)模型中泄漏数据是一个越来越重要的领域,因为ML的商业和政府应用可以利用多个数据源,可能包括用户和客户的敏感数据。我们对几个方面的当代进步进行了全面的调查,涵盖了非自愿数据泄漏,这对ML模型很自然,潜在的恶毒泄漏是由隐私攻击引起的,以及目前可用的防御机制。我们专注于推理时间泄漏,这是公开可用模型的最可能场景。我们首先在不同的数据,任务和模型体系结构的背景下讨论什么是泄漏。然后,我们提出了跨非自愿和恶意泄漏的分类法,可用的防御措施,然后进行当前可用的评估指标和应用。我们以杰出的挑战和开放性的问题结束,概述了一些有希望的未来研究方向。
translated by 谷歌翻译
鉴于对机器学习模型的访问,可以进行对手重建模型的培训数据?这项工作从一个强大的知情对手的镜头研究了这个问题,他们知道除了一个之外的所有培训数据点。通过实例化混凝土攻击,我们表明重建此严格威胁模型中的剩余数据点是可行的。对于凸模型(例如Logistic回归),重建攻击很简单,可以以封闭形式导出。对于更常规的模型(例如神经网络),我们提出了一种基于训练的攻击策略,该攻击策略接收作为输入攻击的模型的权重,并产生目标数据点。我们展示了我们对MNIST和CIFAR-10训练的图像分类器的攻击的有效性,并系统地研究了标准机器学习管道的哪些因素影响重建成功。最后,我们从理论上调查了有多差异的隐私足以通过知情对手减轻重建攻击。我们的工作提供了有效的重建攻击,模型开发人员可以用于评估超出以前作品中考虑的一般设置中的个别点的记忆(例如,生成语言模型或访问培训梯度);它表明,标准模型具有存储足够信息的能力,以实现培训数据点的高保真重建;它表明,差异隐私可以成功减轻该参数制度中的攻击,其中公用事业劣化最小。
translated by 谷歌翻译
机器学习(ML)模型已广泛应用于各种应用,包括图像分类,文本生成,音频识别和图形数据分析。然而,最近的研究表明,ML模型容易受到隶属推导攻击(MIS),其目的是推断数据记录是否用于训练目标模型。 ML模型上的MIA可以直接导致隐私违规行为。例如,通过确定已经用于训练与某种疾病相关的模型的临床记录,攻击者可以推断临床记录的所有者具有很大的机会。近年来,MIS已被证明对各种ML模型有效,例如,分类模型和生成模型。同时,已经提出了许多防御方法来减轻米西亚。虽然ML模型上的MIAS形成了一个新的新兴和快速增长的研究区,但还没有对这一主题进行系统的调查。在本文中,我们对会员推论和防御进行了第一个全面调查。我们根据其特征提供攻击和防御的分类管理,并讨论其优点和缺点。根据本次调查中确定的限制和差距,我们指出了几个未来的未来研究方向,以激发希望遵循该地区的研究人员。这项调查不仅是研究社区的参考,而且还为该研究领域之外的研究人员带来了清晰的照片。为了进一步促进研究人员,我们创建了一个在线资源存储库,并与未来的相关作品继续更新。感兴趣的读者可以在https://github.com/hongshenghu/membership-inference-machine-learning-literature找到存储库。
translated by 谷歌翻译
Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge.We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing stateof-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.
translated by 谷歌翻译
Deep ensemble learning has been shown to improve accuracy by training multiple neural networks and averaging their outputs. Ensemble learning has also been suggested to defend against membership inference attacks that undermine privacy. In this paper, we empirically demonstrate a trade-off between these two goals, namely accuracy and privacy (in terms of membership inference attacks), in deep ensembles. Using a wide range of datasets and model architectures, we show that the effectiveness of membership inference attacks increases when ensembling improves accuracy. We analyze the impact of various factors in deep ensembles and demonstrate the root cause of the trade-off. Then, we evaluate common defenses against membership inference attacks based on regularization and differential privacy. We show that while these defenses can mitigate the effectiveness of membership inference attacks, they simultaneously degrade ensemble accuracy. We illustrate similar trade-off in more advanced and state-of-the-art ensembling techniques, such as snapshot ensembles and diversified ensemble networks. Finally, we propose a simple yet effective defense for deep ensembles to break the trade-off and, consequently, improve the accuracy and privacy, simultaneously.
translated by 谷歌翻译
作为对培训数据隐私的长期威胁,会员推理攻击(MIA)在机器学习模型中无处不在。现有作品证明了培训的区分性与测试损失分布与模型对MIA的脆弱性之间的密切联系。在现有结果的激励下,我们提出了一个基于轻松损失的新型培训框架,并具有更可实现的学习目标,从而导致概括差距狭窄和隐私泄漏减少。 RelaseLoss适用于任何分类模型,具有易于实施和可忽略不计的开销的额外好处。通过对具有不同方式(图像,医疗数据,交易记录)的五个数据集进行广泛的评估,我们的方法始终优于针对MIA和模型效用的韧性,以最先进的防御机制优于最先进的防御机制。我们的防御是第一个可以承受广泛攻击的同时,同时保存(甚至改善)目标模型的效用。源代码可从https://github.com/dingfanchen/relaxloss获得
translated by 谷歌翻译
我们研究在机器学习模型中部署经常性神经网络(RNNS)的隐私含义。我们专注于一类隐私威胁,称为会员推理攻击(MIAS),其旨在推断是否已用于培训模型的特定数据记录。考虑到三台机器学习应用,即机器翻译,深度加固学习和图像分类,我们提供了经验证据,即RNN比偏置更容易受到偏置的偏移架构。然后,我们研究差异隐私方法,以保护RNN的培训数据集的隐私。众所周知,这些方法提供了严谨的隐私,而不管对抗的模型如何保证。我们为所谓的DP-FedAVG算法开发替代差异隐私机制,而不是在训练期间混淆渐变,使模型的输出组合。与现有的工作不同,该机制允许培训隐私参数的后期调整,而无需重新培训模型。我们提供数值结果,表明该机制为米西亚提供了强烈的盾牌,同时交易边际效用。
translated by 谷歌翻译
会员推理攻击是机器学习模型中最简单的隐私泄漏形式之一:给定数据点和模型,确定该点是否用于培训模型。当查询其培训数据时,现有会员推理攻击利用模型的异常置信度。如果对手访问模型的预测标签,则不会申请这些攻击,而不会置信度。在本文中,我们介绍了仅限标签的会员资格推理攻击。我们的攻击而不是依赖置信分数,而是评估模型预测标签在扰动下的稳健性,以获得细粒度的隶属信号。这些扰动包括常见的数据增强或对抗例。我们经验表明,我们的标签占会员推理攻击与先前攻击相符,以便需要访问模型信心。我们进一步证明,仅限标签攻击违反了(隐含或明确)依赖于我们呼叫信心屏蔽的现象的员工推论攻击的多种防御。这些防御修改了模型的置信度分数以挫败攻击,但留下模型的预测标签不变。我们的标签攻击展示了置信性掩蔽不是抵御会员推理的可行的防御策略。最后,我们调查唯一的案例标签攻击,该攻击推断为少量异常值数据点。我们显示仅标签攻击也匹配此设置中基于置信的攻击。我们发现具有差异隐私和(强)L2正则化的培训模型是唯一已知的防御策略,成功地防止所有攻击。即使差异隐私预算太高而无法提供有意义的可证明担保,这仍然存在。
translated by 谷歌翻译
In a membership inference attack, an attacker aims to infer whether a data sample is in a target classifier's training dataset or not. Specifically, given a black-box access to the target classifier, the attacker trains a binary classifier, which takes a data sample's confidence score vector predicted by the target classifier as an input and predicts the data sample to be a member or non-member of the target classifier's training dataset. Membership inference attacks pose severe privacy and security threats to the training dataset. Most existing defenses leverage differential privacy when training the target classifier or regularize the training process of the target classifier. These defenses suffer from two key limitations: 1) they do not have formal utility-loss guarantees of the confidence score vectors, and 2) they achieve suboptimal privacy-utility tradeoffs.In this work, we propose MemGuard, the first defense with formal utility-loss guarantees against black-box membership inference attacks. Instead of tampering the training process of the target classifier, MemGuard adds noise to each confidence score vector predicted by the target classifier. Our key observation is that attacker uses a classifier to predict member or non-member and classifier is vulnerable to adversarial examples. Based on the observation, we propose to add a carefully crafted noise vector to a confidence score vector to turn it into an adversarial example that misleads the attacker's classifier. Specifically, MemGuard works in two phases. In Phase I, MemGuard finds a carefully crafted noise vector that can turn a confidence score vector into an adversarial example, which is likely to mislead the attacker's classifier to make a random guessing at member or non-member. We find such carefully crafted noise vector via a new method that we design to incorporate the unique utility-loss constraints on the noise vector. In Phase II, Mem-Guard adds the noise vector to the confidence score vector with a certain probability, which is selected to satisfy a given utility-loss budget on the confidence score vector. Our experimental results on
translated by 谷歌翻译
梯度泄漏攻击被认为是深度学习中的邪恶隐私威胁之一,因为攻击者在迭代培训期间隐蔽了梯度更新,而不会影响模型培训质量,但又使用泄漏的梯度逐步重建敏感培训数据,具有高攻击成功率。虽然具有差异隐私的深度学习是发布具有差异隐私保障的深度学习模型的违法标准,但我们展示了具有固定隐私参数的差异私有算法易受梯度泄漏攻击的影响。本文调查了差异隐私(DP)的梯度泄漏弹性深度学习的替代方法。首先,我们分析了差异隐私的深度学习的现有实现,它使用固定噪声方差使用固定隐私参数将恒定噪声对所有层中的梯度注入恒定噪声。尽管提供了DP保证,但该方法遭受了低精度,并且很容易受到梯度泄漏攻击。其次,通过使用动态隐私参数,我们提出了一种梯度泄漏弹性深度学习方法,差异隐私保证。与导致恒定噪声方差导致的固定参数策略不同,不同的动态参数策略存在替代技术,以引入自适应噪声方差和自适应噪声注入,其与差别私有模型训练期间的梯度更新的趋势紧密对齐。最后,我们描述了四个互补指标来评估和比较替代方法。
translated by 谷歌翻译
在培训机器学习模型期间,它们可能会存储或“了解”有关培训数据的更多信息,而不是预测或分类任务所需的信息。属性推理攻击旨在从给定模型的培训数据中提取统计属性,而无需访问培训数据本身,从而利用了这一点。这些属性可能包括图片的质量,以识别相机模型,以揭示产品的目标受众的年龄分布或在计算机网络中使用恶意软件攻击的随附的主机类型。当攻击者可以访问所有模型参数时,即在白色盒子方案中,此攻击尤其准确。通过捍卫此类攻击,模型所有者可以确保其培训数据,相关的属性以及其知识产权保持私密,即使他们故意共享自己的模型,例如协作培训或模型泄漏。在本文中,我们介绍了属性,这是针对白盒属性推理攻击的有效防御机制,独立于培训数据类型,模型任务或属性数量。属性通过系统地更改目标模型的训练的权重和偏见来减轻属性推理攻击,从而使对手无法提取所选属性。我们在三个不同的数据集(包括表格数据和图像数据)以及两种类型的人工神经网络(包括人造神经网络)上进行了经验评估属性。我们的研究结果表明,以良好的隐私性权衡取舍,可以保护机器学习模型免受财产推理攻击的侵害,既有效又可靠。此外,我们的方法表明该机制也有效地取消了多个特性。
translated by 谷歌翻译
我们审查在机器学习(ML)中使用差异隐私(DP)对隐私保护的使用。我们表明,在维护学习模型的准确性的驱动下,基于DP的ML实现非常宽松,以至于它们不提供DP的事前隐私保证。取而代之的是,他们提供的基本上是与传统(经常受到批评的)统计披露控制方法相似的噪声。由于缺乏正式的隐私保证,因此所提供的实际隐私水平必须经过实验评估,这很少进行。在这方面,我们提出的经验结果表明,ML中的标准反拟合技术可以比DP实现更好的实用性/隐私/效率权衡。
translated by 谷歌翻译
机器学习模型容易记住敏感数据,使它们容易受到会员推理攻击的攻击,其中对手的目的是推断是否使用输入样本来训练模型。在过去的几年中,研究人员产生了许多会员推理攻击和防御。但是,这些攻击和防御采用各种策略,并在不同的模型和数据集中进行。但是,缺乏全面的基准意味着我们不了解现有攻击和防御的优势和劣势。我们通过对不同的会员推理攻击和防御措施进行大规模测量来填补这一空白。我们通过研究九项攻击和六项防御措施来系统化成员的推断,并在整体评估中衡量不同攻击和防御的性能。然后,我们量化威胁模型对这些攻击结果的影响。我们发现,威胁模型的某些假设,例如相同架构和阴影和目标模型之间的相同分布是不必要的。我们也是第一个对从Internet收集的现实世界数据而不是实验室数据集进行攻击的人。我们进一步研究是什么决定了会员推理攻击的表现,并揭示了通常认为过度拟合水平不足以成功攻击。取而代之的是,成员和非成员样本之间的熵/横向熵的詹森 - 香农距离与攻击性能的相关性更好。这为我们提供了一种新的方法,可以在不进行攻击的情况下准确预测会员推理风险。最后,我们发现数据增强在更大程度上降低了现有攻击的性能,我们提出了使用增强作用的自适应攻击来训练阴影和攻击模型,以改善攻击性能。
translated by 谷歌翻译
Deep Learning has recently become hugely popular in machine learning for its ability to solve end-to-end learning systems, in which the features and the classifiers are learned simultaneously, providing significant improvements in classification accuracy in the presence of highly-structured and large databases.Its success is due to a combination of recent algorithmic breakthroughs, increasingly powerful computers, and access to significant amounts of data.Researchers have also considered privacy implications of deep learning. Models are typically trained in a centralized manner with all the data being processed by the same training algorithm. If the data is a collection of users' private data, including habits, personal pictures, geographical positions, interests, and more, the centralized server will have access to sensitive information that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed where parties locally train their deep learning structures and only share a subset of the parameters in the attempt to keep their respective training sets private. Parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15.Unfortunately, we show that any privacy-preserving collaborative deep learning is susceptible to a powerful attack that we devise in this paper. In particular, we show that a distributed, federated, or decentralized deep learning approach is fundamentally broken and does not protect the training sets of honest participants. The attack we developed exploits the real-time nature of the learning process that allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level differential privacy applied to the shared parameters of the model, as suggested in previous work, is ineffective (i.e., record-level DP is not designed to address our attack).
translated by 谷歌翻译
联合学习允许一组用户在私人训练数据集中培训深度神经网络。在协议期间,数据集永远不会留下各个用户的设备。这是通过要求每个用户向中央服务器发送“仅”模型更新来实现,从而汇总它们以更新深神经网络的参数。然而,已经表明,每个模型更新都具有关于用户数据集的敏感信息(例如,梯度反转攻击)。联合学习的最先进的实现通过利用安全聚合来保护这些模型更新:安全监控协议,用于安全地计算用户的模型更新的聚合。安全聚合是关键,以保护用户的隐私,因为它会阻碍服务器学习用户提供的个人模型更新的源,防止推断和数据归因攻击。在这项工作中,我们表明恶意服务器可以轻松地阐明安全聚合,就像后者未到位一样。我们设计了两种不同的攻击,能够在参与安全聚合的用户数量上,独立于参与安全聚合的用户数。这使得它们在大规模现实世界联邦学习应用中的具体威胁。攻击是通用的,不瞄准任何特定的安全聚合协议。即使安全聚合协议被其理想功能替换为提供完美的安全性的理想功能,它们也同样有效。我们的工作表明,安全聚合与联合学习相结合,当前实施只提供了“虚假的安全感”。
translated by 谷歌翻译
近年来,机器学习模型对成员推论攻击的脆弱性受到了很大的关注。然而,由于具有高假阳性率,现有攻击大多是不切实际的,其中非成员样本通常被错误地预测为成员。这种类型的错误使得预测的隶属信号不可靠,特别是因为大多数样本都是现实世界应用中的非成员。在这项工作中,我们认为会员推理攻击可以从\ emph {难度校准}剧烈地利用,其中调整攻击的预测会员评分以正确分类目标样本的难度。我们表明,在没有准确性的情况下,难度校准可以显着降低各种现有攻击的假阳性率。
translated by 谷歌翻译
Machine learning models leak information about the datasets on which they are trained. An adversary can build an algorithm to trace the individual members of a model's training dataset. As a fundamental inference attack, he aims to distinguish between data points that were part of the model's training set and any other data points from the same distribution. This is known as the tracing (and also membership inference) attack. In this paper, we focus on such attacks against black-box models, where the adversary can only observe the output of the model, but not its parameters. This is the current setting of machine learning as a service in the Internet.We introduce a privacy mechanism to train machine learning models that provably achieve membership privacy: the model's predictions on its training data are indistinguishable from its predictions on other data points from the same distribution. We design a strategic mechanism where the privacy mechanism anticipates the membership inference attacks. The objective is to train a model such that not only does it have the minimum prediction error (high utility), but also it is the most robust model against its corresponding strongest inference attack (high privacy). We formalize this as a min-max game optimization problem, and design an adversarial training algorithm that minimizes the classification loss of the model as well as the maximum gain of the membership inference attack against it. This strategy, which guarantees membership privacy (as prediction indistinguishability), acts also as a strong regularizer and significantly generalizes the model.We evaluate our privacy mechanism on deep neural networks using different benchmark datasets. We show that our min-max strategy can mitigate the risk of membership inference attacks (close to the random guess) with a negligible cost in terms of the classification error.
translated by 谷歌翻译
Machine learning (ML) has become a core component of many real-world applications and training data is a key factor that drives current progress. This huge success has led Internet companies to deploy machine learning as a service (MLaaS). Recently, the first membership inference attack has shown that extraction of information on the training set is possible in such MLaaS settings, which has severe security and privacy implications.However, the early demonstrations of the feasibility of such attacks have many assumptions on the adversary, such as using multiple so-called shadow models, knowledge of the target model structure, and having a dataset from the same distribution as the target model's training data. We relax all these key assumptions, thereby showing that such attacks are very broadly applicable at low cost and thereby pose a more severe risk than previously thought. We present the most comprehensive study so far on this emerging and developing threat using eight diverse datasets which show the viability of the proposed attacks across domains.In addition, we propose the first effective defense mechanisms against such broader class of membership inference attacks that maintain a high level of utility of the ML model.
translated by 谷歌翻译