现代机器学习系统在大型数据集中培训时取得了巨大的成功。但是,这些数据集通常包含敏感信息(例如医疗记录,面部图像),导致严重的隐私问题。差异化私有生成模型(DPGM)通过生成私有化的敏感数据来避免此类隐私问题的解决方案。与其他差异私人(DP)学习者类似,DPGM的主要挑战也是如何在效用和隐私之间取得微妙的平衡。我们提出了DP $^2 $ -VAE,这是一种具有可证明的DP保证的变性自动编码器(VAE)的新型培训机制,并通过\ emph {pre-emph {pre-emph {prec-emph {pret-emph {pret-training}。在相同的DP约束下,DP $^2 $ -VAE最大程度地减少了训练过程中的扰动噪声,从而改善了实用性。 DP $^2 $ -VAE非常灵活,并且对许多其他VAE变体都很容易适应。从理论上讲,我们研究了预训练对私人数据的影响。从经验上讲,我们在图像数据集上进行了广泛的实验,以说明我们在各种隐私预算和评估指标下对基准的优越性。
translated by 谷歌翻译
机器学习的最新进展主要受益于大规模的可访问培训数据。但是,大规模的数据共享提出了极大的隐私问题。在这项工作中,我们提出了一种基于PAINE框架(G-PATE)的新型隐私保留数据生成模型,旨在训练可缩放的差异私有数据生成器,其保留高生成的数据实用程序。我们的方法利用生成的对抗性网来产生数据,与不同鉴别者之间的私人聚集相结合,以确保强烈的隐私保障。与现有方法相比,G-PATE显着提高了隐私预算的使用。特别是,我们用教师鉴别者的集合训练学生数据发生器,并提出一种新颖的私人梯度聚合机制,以确保对从教师鉴别者流到学生发电机的所有信息的差异隐私。另外,通过随机投影和梯度离散化,所提出的梯度聚合机制能够有效地处理高维梯度向量。从理论上讲,我们证明了G-PATE确保了数据发生器的差异隐私。经验上,我们通过广泛的实验证明了G-PAIN的优越性。我们展示了G-PATE是第一个能够在限量隐私预算下产生高数据实用程序的高维图像数据($ \ epsilon \ LE 1 $)。我们的代码可在https://github.com/ai-secure/gate上获得。
translated by 谷歌翻译
深度神经网络(DNNS)铰接对大型数据集的可用性的最新成功;但是,对此类数据集的培训经常为敏感培训信息构成隐私风险。在本文中,我们的目标是探讨生成模型和梯度稀疏性的力量,并提出了一种可扩展的隐私保留生成模型数据标准。与标准展示隐私保留框架相比,允许教师对一维预测进行投票,在高维梯度向量上投票在隐私保存方面具有挑战性。随着需要尺寸减少技术,我们需要在(1)之间的改进之间导航精致的权衡空间,并进行SGD收敛的放缓。为了解决这一点,我们利用通信高效学习,并通过将顶-K压缩与相应的噪声注入机构相结合,提出一种新的噪声压缩和聚集方法TopAGG。理论上,我们证明了DataLens框架保证了其生成数据的差异隐私,并提供了其收敛性的分析。为了展示DataLens的实际使用情况,我们对不同数据集进行广泛的实验,包括Mnist,Fashion-Mnist和高维Celeba,并且我们表明,DataLens显着优于其他基线DP生成模型。此外,我们改进了所提出的Topagg方法,该方法是DP SGD培训的主要构建块之一,并表明它能够在大多数情况下实现比最先进的DP SGD方法更高的效用案件。我们的代码在HTTPS://github.com/ai-secure/datalens公开提供。
translated by 谷歌翻译
虽然在巨大数据上培训的机器学习模型导致了几个领域的断路器,但由于限制数据的访问,他们在隐私敏感域中的部署仍然有限。在私有数据上具有隐私约束的生成模型可以避免此挑战,而是提供对私有数据的间接访问。我们提出DP-Sinkhorn,一种新的最优传输的生成方法,用于从具有差异隐私的私有数据学习数据分布。 DP-Sinkhorn以差别私人方式在模型和数据之间的模型和数据之间最小化陷阱的分歧,将计算上有效的近似值,并在模型和数据之间使用新技术来控制梯度估计的偏差差异的偏差折衷。与现有的培训方法不同,差异私人生成模型主要基于生成的对抗网络,我们不依赖于对抗性目标,这令人惊叹的难以优化,特别是在隐私约束所施加的噪声存在下。因此,DP-Sinkhorn易于训练和部署。通过实验,我们改进了多种图像建模基准的最先进,并显示了差异私有的信息RGB图像综合。项目页面:https://nv-tlabs.github.io/dp-sinkhorn。
translated by 谷歌翻译
Differentially private data generation techniques have become a promising solution to the data privacy challenge -- it enables sharing of data while complying with rigorous privacy guarantees, which is essential for scientific progress in sensitive domains. Unfortunately, restricted by the inherent complexity of modeling high-dimensional distributions, existing private generative models are struggling with the utility of synthetic samples. In contrast to existing works that aim at fitting the complete data distribution, we directly optimize for a small set of samples that are representative of the distribution under the supervision of discriminative information from downstream tasks, which is generally an easier task and more suitable for private training. Our work provides an alternative view for differentially private generation of high-dimensional data and introduces a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
translated by 谷歌翻译
为了保护培训生成的对抗网络(GaN)中的敏感数据,标准方法是使用差异的私有(DP)随机梯度下降方法,其中将受控噪声添加到梯度。输出合成样品的质量可能会受到不利影响,并且网络的训练甚至可能不会在这些噪声的存在下收敛。我们提出了差异私有的模型反演(DPMI)方法,其中私有数据首先通过公共发生器映射到潜在空间,然后是具有更好的收敛属性的低维DP-GaN。标准数据集CIFAR10和SVHN的实验结果以及自闭症筛选的面部地标数据集表明,我们的方法在同一隐私保证下,基于Incepion得分,FR \'Echet Inception距离和分类准确性的标准DP-GaN方法优于标准DP-GaN方法。
translated by 谷歌翻译
差异隐私(DP)提供了正式的隐私保证,以防止对手可以访问机器学习模型,从而从提取有关单个培训点的信息。最受欢迎的DP训练方法是差异私有随机梯度下降(DP-SGD),它通过在训练过程中注入噪声来实现这种保护。然而,以前的工作发现,DP-SGD通常会导致标准图像分类基准的性能显着降解。此外,一些作者假设DP-SGD在大型模型上固有地表现不佳,因为保留隐私所需的噪声规范与模型维度成正比。相反,我们证明了过度参数化模型上的DP-SGD可以比以前想象的要好得多。将仔细的超参数调整与简单技术结合起来,以确保信号传播并提高收敛速率,我们获得了新的SOTA,而没有额外数据的CIFAR-10,在81.4%的81.4%下(8,10^{ - 5}) - 使用40 -layer wide-Resnet,比以前的SOTA提高了71.7%。当对预训练的NFNET-F3进行微调时,我们在ImageNet(0.5,8*10^{ - 7})下达到了83.8%的TOP-1精度。此外,我们还在(8,8 \ cdot 10^{ - 7})下达到了86.7%的TOP-1精度,DP仅比当前的非私人SOTA仅4.3%。我们认为,我们的结果是缩小私人图像分类和非私有图像分类之间准确性差距的重要一步。
translated by 谷歌翻译
Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. This applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable. In this paper we apply \textit{directional privacy}, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of \textit{angular distance} so that gradient direction is broadly preserved. We show that this provides $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism; and that experimentally, on key datasets, the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off.
translated by 谷歌翻译
梯度泄漏攻击被认为是深度学习中的邪恶隐私威胁之一,因为攻击者在迭代培训期间隐蔽了梯度更新,而不会影响模型培训质量,但又使用泄漏的梯度逐步重建敏感培训数据,具有高攻击成功率。虽然具有差异隐私的深度学习是发布具有差异隐私保障的深度学习模型的违法标准,但我们展示了具有固定隐私参数的差异私有算法易受梯度泄漏攻击的影响。本文调查了差异隐私(DP)的梯度泄漏弹性深度学习的替代方法。首先,我们分析了差异隐私的深度学习的现有实现,它使用固定噪声方差使用固定隐私参数将恒定噪声对所有层中的梯度注入恒定噪声。尽管提供了DP保证,但该方法遭受了低精度,并且很容易受到梯度泄漏攻击。其次,通过使用动态隐私参数,我们提出了一种梯度泄漏弹性深度学习方法,差异隐私保证。与导致恒定噪声方差导致的固定参数策略不同,不同的动态参数策略存在替代技术,以引入自适应噪声方差和自适应噪声注入,其与差别私有模型训练期间的梯度更新的趋势紧密对齐。最后,我们描述了四个互补指标来评估和比较替代方法。
translated by 谷歌翻译
在本文中,我们研究了具有差异隐私(DP)的学习图神经网络(GNN)的问题。我们提出了一种基于聚合扰动(GAP)的新型差异私有GNN,该GNN为GNN的聚合函数添加了随机噪声,以使单个边缘(边缘级隐私)或单个节点的存在统计上的存在及其所有邻接边缘( - 级别的隐私)。 GAP的新体系结构是根据私人学习的细节量身定制的,由三个单独的模块组成:(i)编码器模块,我们在不依赖边缘信息的情况下学习私人节点嵌入; (ii)聚合模块,其中我们根据图结构计算嘈杂的聚合节点嵌入; (iii)分类模块,我们在私有聚合上训练神经网络进行节点分类,而无需进一步查询图表。 GAP比以前的方法的主要优势在于,它可以从多跳社区的聚合中受益,并保证边缘级别和节点级别的DP不仅用于培训,而且可以推断出培训的隐私预算以外的额外费用。我们使用R \'Enyi DP来分析GAP的正式隐私保证,并在三个真实世界图数据集上进行经验实验。我们证明,与最先进的DP-GNN方法和天真的MLP基线相比,GAP提供了明显更好的准确性私人权衡权衡。
translated by 谷歌翻译
通过仅使用训练有素的分类器,模型内(MI)攻击可以恢复用于训练分类器的数据,从而导致培训数据的隐私泄漏。为了防止MI攻击,先前的工作利用单方面依赖优化策略,即,在培训分类器期间,最大程度地减少了输入(即功能)和输出(即标签)之间的依赖关系。但是,这样的最小化过程与最小化监督损失相冲突,该损失旨在最大程度地提高输入和输出之间的依赖关系,从而在模型鲁棒性针对MI攻击和模型实用程序上对分类任务进行明确的权衡。在本文中,我们旨在最大程度地减少潜在表示和输入之间的依赖性,同时最大化潜在表示和输出之间的依赖关系,称为双边依赖性优化(BIDO)策略。特别是,除了对深神经网络的常用损失(例如,跨渗透性)外,我们还将依赖性约束用作普遍适用的正常化程序,可以根据不同的任务将其实例化使用适当的依赖标准。为了验证我们策略的功效,我们通过使用两种不同的依赖性度量提出了两种BIDO的实施:具有约束协方差的Bido(Bido-Coco)(Bido-Coco)和Bido具有Hilbert-Schmidt独立标准(Bido-HSIC)。实验表明,比多(Bido防御MI攻击的道路。
translated by 谷歌翻译
联合学习(FL)提供了一个有效的范式,可以共同培训分布式用户的数据的全球模型。由于本地培训数据来自可能不值得信赖的不同用户,因此一些研究表明,FL容易受到中毒攻击的影响。同时,为了保护本地用户的隐私,FL始终以差异性私人方式(DPFL)进行培训。因此,在本文中,我们问:我们是否可以利用DPFL的先天隐私权来提供对中毒攻击的认证鲁棒性?我们可以进一步改善FL的隐私以改善这种认证吗?我们首先研究了FL的用户级和实例级别的隐私,并提出了新的机制以获得改进的实例级隐私。然后,我们提供两个鲁棒性认证标准:两级DPFL的认证预测和认证攻击成本。从理论上讲,我们证明了DPFL在有限数量的对抗用户或实例下的认证鲁棒性。从经验上讲,我们进行了广泛的实验,以在对不同数据集的一系列攻击下验证我们的理论。我们表明,具有更严格的隐私保证的DPFL总是在认证攻击成本方面提供更强的鲁棒性认证,但是在隐私保护和公用事业损失之间的适当平衡下,获得了最佳认证预测。
translated by 谷歌翻译
Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training. Prior works typically address this issue by using auxiliary information (e.g., public data) to boost the effectiveness of adaptive optimization. In this work, we explore techniques to estimate and efficiently adapt to gradient geometry in private adaptive optimization without auxiliary data. Motivated by the observation that adaptive methods can tolerate stale preconditioners, we propose differentially private adaptive training with delayed preconditioners (DP^2), a simple method that constructs delayed but less noisy preconditioners to better realize the benefits of adaptivity. Theoretically, we provide convergence guarantees for our method for both convex and non-convex problems, and analyze trade-offs between delay and privacy noise reduction. Empirically, we explore DP^2 across several real-world datasets, demonstrating that it can improve convergence speed by as much as 4x relative to non-adaptive baselines and match the performance of state-of-the-art optimization methods that require auxiliary data.
translated by 谷歌翻译
Deep neural networks have strong capabilities of memorizing the underlying training data, which can be a serious privacy concern. An effective solution to this problem is to train models with differential privacy, which provides rigorous privacy guarantees by injecting random noise to the gradients. This paper focuses on the scenario where sensitive data are distributed among multiple participants, who jointly train a model through federated learning (FL), using both secure multiparty computation (MPC) to ensure the confidentiality of each gradient update, and differential privacy to avoid data leakage in the resulting model. A major challenge in this setting is that common mechanisms for enforcing DP in deep learning, which inject real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, most existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we propose Skellam mixture mechanism (SMM), an approach to enforce DP on models built via FL. Compared to existing methods, SMM eliminates the assumption that the input gradients must be integer-valued, and, thus, reduces the amount of noise injected to preserve DP. Further, SMM allows tight privacy accounting due to the nice composition and sub-sampling properties of the Skellam distribution, which are key to accurate deep learning with DP. The theoretical analysis of SMM is highly non-trivial, especially considering (i) the complicated math of differentially private deep learning in general and (ii) the fact that the mixture of two Skellam distributions is rather complex, and to our knowledge, has not been studied in the DP literature. Extensive experiments on various practical settings demonstrate that SMM consistently and significantly outperforms existing solutions in terms of the utility of the resulting model.
translated by 谷歌翻译
Pre-training large transformer models with in-domain data improves domain adaptation and helps gain performance on the domain-specific downstream tasks. However, sharing models pre-trained on potentially sensitive data is prone to adversarial privacy attacks. In this paper, we asked to which extent we can guarantee privacy of pre-training data and, at the same time, achieve better downstream performance on legal tasks without the need of additional labeled data. We extensively experiment with scalable self-supervised learning of transformer models under the formal paradigm of differential privacy and show that under specific training configurations we can improve downstream performance without sacrifying privacy protection for the in-domain data. Our main contribution is utilizing differential privacy for large-scale pre-training of transformer language models in the legal NLP domain, which, to the best of our knowledge, has not been addressed before.
translated by 谷歌翻译
差异隐私(DP)已被出现为严格的形式主义,以推理可量化的隐私泄漏。在机器学习(ML)中,已采用DP限制推理/披露训练示例。在现有的工作中杠杆横跨ML管道,尽管隔离,通常专注于梯度扰动等机制。在本文中,我们展示了DP-util,DP整体实用分析框架,跨越ML管道,重点是输入扰动,客观扰动,梯度扰动,输出扰动和预测扰动。在隐私敏感数据上给出ML任务,DP-Util使ML隐私从业者能够对DP在这五个扰动点中的影响,以模型公用事业丢失,隐私泄漏和真正透露的数量来测量DP的影响。训练样本。我们在视觉,医疗和金融数据集上使用两个代表性学习算法(Logistic回归和深神经网络)来评估DP-Uts,以防止会员资格推论攻击作为案例研究攻击。我们结果的一个亮点是,预测扰动一致地在所有数据集中始终如一地实现所有模型的最低实用损耗。在Logistic回归模型中,与其他扰动技术相比,客观扰动导致最低的隐私泄漏。对于深度神经网络,梯度扰动导致最低的隐私泄漏。此外,我们的结果揭示了记录的结果表明,由于隐私泄漏增加,差异私有模型揭示了更多数量的成员样本。总体而言,我们的研究结果表明,为了使使用的扰动机制有明智的决定,ML隐私从业者需要检查优化技术(凸与非凸),扰动机制,课程数量和隐私预算之间的动态。
translated by 谷歌翻译
鉴于对机器学习模型的访问,可以进行对手重建模型的培训数据?这项工作从一个强大的知情对手的镜头研究了这个问题,他们知道除了一个之外的所有培训数据点。通过实例化混凝土攻击,我们表明重建此严格威胁模型中的剩余数据点是可行的。对于凸模型(例如Logistic回归),重建攻击很简单,可以以封闭形式导出。对于更常规的模型(例如神经网络),我们提出了一种基于训练的攻击策略,该攻击策略接收作为输入攻击的模型的权重,并产生目标数据点。我们展示了我们对MNIST和CIFAR-10训练的图像分类器的攻击的有效性,并系统地研究了标准机器学习管道的哪些因素影响重建成功。最后,我们从理论上调查了有多差异的隐私足以通过知情对手减轻重建攻击。我们的工作提供了有效的重建攻击,模型开发人员可以用于评估超出以前作品中考虑的一般设置中的个别点的记忆(例如,生成语言模型或访问培训梯度);它表明,标准模型具有存储足够信息的能力,以实现培训数据点的高保真重建;它表明,差异隐私可以成功减轻该参数制度中的攻击,其中公用事业劣化最小。
translated by 谷歌翻译
Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality. * Google.† OpenAI. Work done while at Google.
translated by 谷歌翻译
提出测试释放(PTR)是一个差异隐私框架,可符合局部功能的敏感性,而不是其全球敏感性。该框架通常用于以差异性私有方式释放强大的统计数据,例如中位数或修剪平均值。尽管PTR是十年前引入的常见框架,但在诸如Robust SGD之类的应用程序中使用它,我们需要许多自适应鲁棒的查询是具有挑战性的。这主要是由于缺乏Renyi差异隐私(RDP)分析,这是一种瞬间的私人深度学习方法的基础。在这项工作中,我们概括了标准PTR,并在目标函数界定全局灵敏度时得出了第一个RDP。我们证明,与直接分析的$(\ eps,\ delta)$ -DP相比,我们的RDP绑定的PTR可以得出更严格的DP保证。我们还得出了亚采样下PTR的算法特异性隐私扩增。我们表明,我们的界限比一般的上限和接近下限的界限要紧密得多。我们的RDP界限可以为PTR的许多自适应运行的组成而更严格的隐私损失计算。作为我们的分析的应用,我们表明PTR和我们的理论结果可用于设计私人变体,用于拜占庭强大的训练算法,这些变体使用可靠的统计数据用于梯度聚集。我们对不同数据集和体系结构的标签,功能和梯度损坏的设置进行实验。我们表明,与基线相比,基于PTR的私人和强大的培训算法可显着改善该实用性。
translated by 谷歌翻译
Distributing machine learning predictors enables the collection of large-scale datasets while leaving sensitive raw data at trustworthy sites. We show that locally training support vector machines (SVMs) and computing their averages leads to a learning technique that is scalable to a large number of users, satisfies differential privacy, and is applicable to non-trivial tasks, such as CIFAR-10. For a large number of participants, communication cost is one of the main challenges. We achieve a low communication cost by requiring only a single invocation of an efficient secure multiparty summation protocol. By relying on state-of-the-art feature extractors (SimCLR), we are able to utilize differentially private convex learners for non-trivial tasks such as CIFAR-10. Our experimental results illustrate that for $1{,}000$ users with $50$ data points each, our scheme outperforms state-of-the-art scalable distributed learning methods (differentially private federated learning, short DP-FL) while requiring around $500$ times fewer communication costs: For CIFAR-10, we achieve a classification accuracy of $79.7\,\%$ for an $\varepsilon = 0.59$ while DP-FL achieves $57.6\,\%$. More generally, we prove learnability properties for the average of such locally trained models: convergence and uniform stability. By only requiring strongly convex, smooth, and Lipschitz-continuous objective functions, locally trained via stochastic gradient descent (SGD), we achieve a strong utility-privacy tradeoff.
translated by 谷歌翻译