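Federated learning (FL) is a distributed learning paradigm that enables mutually untrusting clients to collaboratively train a common machine learning model. Client data privacy is paramount in FL. At the same time, the model must be protected from poisoning attacks by adversarial clients. Existing solutions address these two problems in isolation. We propose FedPerm, a new FL algorithm that addresses both problems by combining a novel intra-model parameter shuffling technique, which amplifies data privacy, with private information retrieval (PIR) based techniques that permit cryptographic aggregation of clients' model updates. The combination of these techniques further helps the federation server constrain parameter updates from clients, thereby reducing the impact of model poisoning attacks by adversarial clients. We further present FedPerm's unique hyperparameters, which can be used to trade off computation overhead against model utility. Our empirical evaluation on the MNIST dataset demonstrates FedPerm's effectiveness over existing differential privacy (DP) enforcement solutions in FL.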
translated by 谷歌翻译
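Federated machine learning leverages edge computing to develop models from network user data, but privacy in federated learning remains a major challenge. Techniques using differential privacy have been proposed to address this, but they bring their own challenges: many require a trusted third party or add too much noise to produce useful models. Recent advances in secure aggregation using multiparty computation eliminate the need for a third party, but are computationally expensive, especially at scale. We present a new federated learning protocol that leverages a novel differentially private, malicious-secure aggregation protocol based on techniques from Learning With Errors. Our protocol outperforms current state-of-the-art techniques, and empirical results show that it scales to a large number of parties, with the best accuracy of any differentially private federated learning scheme.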
Federated learning seeks to address the issue of isolated data islands by having clients disclose only their locally trained models. However, it has been demonstrated that private information can still be inferred by analyzing local model parameters, such as deep neural network weights. Recently, differential privacy has been applied to federated learning to protect data privacy, but the added noise may significantly degrade learning performance. In previous work, training parameters were typically clipped equally and noise was added uniformly; the heterogeneity and convergence of training parameters were not considered. In this paper, we propose a differentially private scheme for federated learning with adaptive noise (Adap DP-FL). Specifically, because gradients are heterogeneous, we apply adaptive gradient clipping for different clients and different rounds; because gradients converge, we add correspondingly decreasing noise. Extensive experiments on real-world datasets demonstrate that Adap DP-FL significantly outperforms previous methods.
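As a rough illustration of the two ideas named in this abstract (per-client, per-round clipping and noise that decreases over rounds), the following minimal sketch may help. The geometric decay schedule, the parameter names (`sigma0`, `decay`, `clip_norm`), and the way the clipping bound is supplied are all assumptions made for illustration; the abstract does not specify the actual adaptation rules, and no privacy accounting is performed here.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noise_multiplier(round_idx, sigma0=1.0, decay=0.9):
    """Hypothetical schedule: the noise multiplier shrinks geometrically per round."""
    return sigma0 * decay ** round_idx


def private_update(grad, clip_norm, round_idx, rng):
    """Clip with a client- and round-specific bound, then add Gaussian noise."""
    clipped = clip(grad, clip_norm)
    std = noise_multiplier(round_idx) * clip_norm
    return [g + rng.gauss(0.0, std) for g in clipped]


rng = random.Random(0)
update = private_update([3.0, 4.0], clip_norm=1.0, round_idx=5, rng=rng)
```

In this sketch the caller chooses `clip_norm` per client and per round, which is where an adaptive rule would plug in.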
Federated learning is a collaborative method that aims to preserve data privacy while creating AI models. Current approaches to federated learning tend to rely heavily on secure aggregation protocols to preserve data privacy. However, to some degree, such protocols assume that the entity orchestrating the federated learning process (i.e., the server) is not fully malicious or dishonest. We investigate vulnerabilities to secure aggregation that could arise if the server is fully malicious and attempts to obtain access to private, potentially sensitive data. Furthermore, we provide a method to further defend against such a malicious server, and demonstrate effectiveness against known attacks that reconstruct data in a federated learning setting.
Differentially private federated learning (DP-FL) has received increasing attention as a way to mitigate the privacy risk in federated learning. Although different schemes for DP-FL have been proposed, there is still a utility gap. Employing central differential privacy in FL (CDP-FL) can provide a good balance between privacy and model utility, but requires a trusted server. Using local differential privacy for FL (LDP-FL) does not require a trusted server, but suffers from a poor privacy-utility trade-off. The recently proposed shuffle-DP-based FL has the potential to bridge the gap between CDP-FL and LDP-FL without a trusted server; however, there is still a utility gap when the number of model parameters is large. In this work, we propose OLIVE, a system that combines the merits of CDP-FL and LDP-FL by leveraging a Trusted Execution Environment (TEE). Our main technical contributions are the analysis of, and countermeasures against, the vulnerability of the TEE in OLIVE. First, we theoretically analyze the memory access pattern leakage of OLIVE and find that there is a risk for sparsified gradients, which are common in FL. Second, we design an inference attack to understand how the memory access pattern could be linked to the training data. Third, we propose oblivious yet efficient algorithms to prevent memory access pattern leakage in OLIVE. Our experiments on real-world data demonstrate that OLIVE is efficient even when training a model with hundreds of thousands of parameters and is effective against side-channel attacks on the TEE.
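Federated learning (FL), in which data remains at the federated clients and only gradient updates are shared with a central aggregator, is commonly regarded as private. Recent work has shown that adversaries with gradient-level access can mount successful inference and reconstruction attacks. In such settings, differentially private (DP) learning is known to provide resilience. However, the approaches used in the status quo (i.e., central and local DP) introduce disparate utility vs. privacy trade-offs. In this work, we take the first step towards mitigating such trade-offs through hierarchical FL (HFL). We demonstrate that by introducing a new intermediary level at which calibrated DP noise can be added, better privacy vs. utility trade-offs can be obtained; we term this hierarchical DP (HDP). Our experiments with 3 different datasets (commonly used as benchmarks for FL) suggest that HDP produces models as accurate as those obtained using central DP, where noise is added at the central aggregator. The approach also provides a benefit against inference adversaries comparable to that of local DP, where noise is added at the federated clients.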
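We consider a federated representation learning framework in which, with the assistance of a central server, a group of $n$ distributed clients collaboratively train representations (or embeddings) of a set of entities (e.g., users in a social network) over their private data. Under this framework, for the key step of privately aggregating the local embeddings trained at the clients, we develop a secure embedding aggregation protocol named SecEA, which provides information-theoretic privacy guarantees for the set of entities and the corresponding embeddings at each client simultaneously, against a curious server and up to $t < n/2$ colluding clients. As the first step of SecEA, the federated learning system performs a private entity union, so that each client learns all the entities in the system without knowing which entities belong to which clients. In each aggregation round, the local embeddings are secretly shared among the clients using Lagrange interpolation, and each client then constructs coded queries to retrieve the aggregated embeddings for its intended entities. We perform comprehensive experiments on various representation learning tasks to evaluate the utility and efficiency of SecEA, and empirically demonstrate that, compared with embedding aggregation protocols without (or with weaker) privacy guarantees, SecEA incurs negligible performance loss (within 5%); the additional computation latency of SecEA diminishes when training deeper models on larger datasets.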
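The secret-sharing step mentioned in this abstract can be illustrated with textbook Shamir sharing, where shares are evaluations of a random polynomial and reconstruction is Lagrange interpolation at zero. The sketch below is only that textbook construction applied to scalar values; it is not the SecEA protocol itself (there is no private entity union and no coded queries), and the modulus `P` and the function names are assumptions chosen for illustration.

```python
import random

P = 2**61 - 1  # a Mersenne prime used as the field modulus (assumed for this sketch)


def share(secret, n, t, rng):
    """Split secret into n shares of a degree-t polynomial; any t + 1 shares reconstruct it."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(t)]
    return [
        (x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
        for x in range(1, n + 1)
    ]


def reconstruct(points):
    """Lagrange interpolation of the polynomial at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def secure_sum(secrets, t, rng):
    """Each client shares its value; holders add the shares they receive; the sum is rebuilt."""
    n = len(secrets)
    all_shares = [share(s, n, t, rng) for s in secrets]
    summed = [(x + 1, sum(all_shares[c][x][1] for c in range(n)) % P) for x in range(n)]
    return reconstruct(summed[: t + 1])


total = secure_sum([10, 20, 30, 40, 50], t=2, rng=random.Random(0))
```

Because the sharing is linear, adding shares pointwise yields valid shares of the sum, so no single share holder sees an individual client's value.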
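Federated learning (FL) allows mutually untrusting clients to collaboratively train a common machine learning model without sharing their private or proprietary training data. Unfortunately, FL is susceptible to poisoning by malicious clients, who aim to hamper the accuracy of the commonly trained model by sending malicious model updates during the FL training process. We argue that the key factor in the success of poisoning attacks against existing FL systems is the space of model updates available to the clients, which allows malicious clients to search for the most poisonous model update by solving an optimization problem. To address this, we propose Federated Rank Learning (FRL). FRL reduces the space of client updates from model parameter updates (a continuous space of floating-point numbers) in standard FL to the space of parameter rankings (a discrete space of integer values). To be able to train the global model using parameter ranks (instead of parameter weights), FRL leverages ideas from recent supermask training mechanisms. Specifically, FRL clients rank the parameters of a randomly initialized neural network (provided by the server) based on their local training data. The FRL server uses a voting mechanism to aggregate the parameter rankings submitted by clients in each training epoch to generate the global ranking for the next training epoch. Intuitively, our voting-based aggregation mechanism prevents poisoning clients from making significant adversarial modifications to the global model, as each client has only a single vote. We demonstrate the robustness of FRL to poisoning through analytical proofs and experiments. We also show FRL's high communication efficiency. Our experiments demonstrate the advantages of FRL in real-world settings.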
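To make the voting idea concrete, here is a minimal sketch in which each client submits a ranking of parameter indices and the server sums the positions, in the style of a Borda count. The summation rule, the tie-breaking by index, and the name `aggregate_rankings` are assumptions for illustration; the abstract only states that a voting mechanism aggregates the rankings.

```python
def aggregate_rankings(rankings):
    """Combine client rankings; rankings[c] lists parameter indices from least to most important."""
    n = len(rankings[0])
    score = [0] * n
    for ranking in rankings:
        for position, param in enumerate(ranking):
            score[param] += position  # each client contributes one bounded vote per parameter
    # Global ranking: ascending total score, ties broken by parameter index.
    return sorted(range(n), key=lambda p: (score[p], p))


honest_a = [0, 1, 2, 3]
honest_b = [1, 0, 2, 3]
malicious = [3, 2, 1, 0]  # tries to invert the ranking
global_ranking = aggregate_rankings([honest_a, honest_b, malicious])
```

In this toy run the single malicious ranking cannot dislodge parameter 3 from the top position, because its influence on any one parameter is bounded by one vote.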
Deep neural networks have strong capabilities of memorizing the underlying training data, which can be a serious privacy concern. An effective solution to this problem is to train models with differential privacy, which provides rigorous privacy guarantees by injecting random noise to the gradients. This paper focuses on the scenario where sensitive data are distributed among multiple participants, who jointly train a model through federated learning (FL), using both secure multiparty computation (MPC) to ensure the confidentiality of each gradient update, and differential privacy to avoid data leakage in the resulting model. A major challenge in this setting is that common mechanisms for enforcing DP in deep learning, which inject real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, most existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we propose Skellam mixture mechanism (SMM), an approach to enforce DP on models built via FL. Compared to existing methods, SMM eliminates the assumption that the input gradients must be integer-valued, and, thus, reduces the amount of noise injected to preserve DP. Further, SMM allows tight privacy accounting due to the nice composition and sub-sampling properties of the Skellam distribution, which are key to accurate deep learning with DP. The theoretical analysis of SMM is highly non-trivial, especially considering (i) the complicated math of differentially private deep learning in general and (ii) the fact that the mixture of two Skellam distributions is rather complex, and to our knowledge, has not been studied in the DP literature. Extensive experiments on various practical settings demonstrate that SMM consistently and significantly outperforms existing solutions in terms of the utility of the resulting model.
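The mismatch described in this abstract (real-valued noise versus finite-field integers) can be illustrated with a small sketch that stochastically rounds a scaled gradient value to an integer, adds Skellam noise (the difference of two independent Poisson samples), and reduces the result modulo a field size. This is not the Skellam mixture mechanism itself and carries no privacy analysis; the function names, the scaling factor, and the use of Knuth's Poisson sampler (suitable only for small rates) are assumptions for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for small lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def skellam(lam, rng):
    """Symmetric Skellam sample: difference of two independent Poisson(lam) draws."""
    return poisson(lam, rng) - poisson(lam, rng)


def stochastic_round(x, rng):
    """Round x up with probability equal to its fractional part, so the result is unbiased."""
    low = math.floor(x)
    return low + (1 if rng.random() < x - low else 0)


def encode(value, scale, lam, modulus, rng):
    """Quantize a real value, add integer noise, and map it into the finite field."""
    return (stochastic_round(value * scale, rng) + skellam(lam, rng)) % modulus


rng = random.Random(0)
field_element = encode(0.37, scale=1000, lam=5.0, modulus=2**16, rng=rng)
```

Since every quantity after rounding is an integer, the encoded value can be fed to an MPC summation protocol that works over finite-field elements.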
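Recently, Niu et al. introduced a new variant of federated learning (FL) called federated submodel learning (FSL). Unlike in traditional FL, each client locally trains a submodel (e.g., retrieved from the server) based on its private data and uploads the submodel of its choice to the server. All clients then aggregate their submodels and finish the iteration. Inevitably, FSL introduces two privacy-preserving computation tasks: private submodel retrieval (PSR) and secure submodel aggregation (SSA). Existing work either fails to provide a lossless scheme or has impractical efficiency. In this work, we leverage distributed point functions (DPF) and cuckoo hashing to construct a practical and lightweight secure FSL scheme in the two-server setting. More specifically, we propose two basic protocols with several optimization techniques, which ensure the practicality of our protocol on specific real-world FSL tasks. Our experiments show that the proposed protocols can finish in less than 1 minute when the weight size is $\leq 2^{15}$; we also demonstrate protocol efficiency by comparing with existing work and by handling a real-world FSL task.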
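Federated learning (FL) enables distributed devices to jointly train a shared model while keeping the training data local. Unlike the horizontal FL (HFL) setting, in which each client holds a subset of the data samples, vertical FL (VFL), in which each client collects a subset of the features, has recently attracted intensive research effort. In this paper, we identify two challenges that state-of-the-art VFL frameworks face: (1) some works directly average the learned feature embeddings and may therefore lose the unique properties of each local feature set; (2) the server needs to communicate gradients with the clients at every training step, incurring a high communication cost that leads to rapid consumption of the privacy budget. We aim to address these challenges and propose an efficient VFL with multiple linear heads (VIM) framework, in which each head corresponds to a local client and takes the separate contribution of that client into account. In addition, we propose a method based on the Alternating Direction Method of Multipliers (ADMM) to solve our optimization problem, which reduces the communication cost by allowing multiple local updates at each step and thus leads to better performance under differential privacy. We consider various settings, including VFL with and without model splitting. For both settings, we carefully analyze the differential privacy mechanism of our framework. Moreover, we show that, as a byproduct of our framework, the weights of the learned linear heads reflect the importance of the local clients. We conduct extensive evaluations and show that, on four real-world datasets, VIM achieves significantly higher performance and faster convergence than the state of the art. We also explicitly evaluate the importance of local clients and show that VIM enables functionalities such as client-level explanation and client denoising.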
Distributing machine learning predictors enables the collection of large-scale datasets while leaving sensitive raw data at trustworthy sites. We show that locally training support vector machines (SVMs) and computing their averages leads to a learning technique that is scalable to a large number of users, satisfies differential privacy, and is applicable to non-trivial tasks, such as CIFAR-10. For a large number of participants, communication cost is one of the main challenges. We achieve a low communication cost by requiring only a single invocation of an efficient secure multiparty summation protocol. By relying on state-of-the-art feature extractors (SimCLR), we are able to utilize differentially private convex learners for non-trivial tasks such as CIFAR-10. Our experimental results illustrate that for $1{,}000$ users with $50$ data points each, our scheme outperforms state-of-the-art scalable distributed learning methods (differentially private federated learning, DP-FL for short) while incurring around $500$ times lower communication cost: for CIFAR-10, we achieve a classification accuracy of $79.7\,\%$ for $\varepsilon = 0.59$, while DP-FL achieves $57.6\,\%$. More generally, we prove learnability properties for the average of such locally trained models: convergence and uniform stability. By only requiring strongly convex, smooth, and Lipschitz-continuous objective functions, locally trained via stochastic gradient descent (SGD), we achieve a strong utility-privacy tradeoff.
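The abstract does not say which secure summation protocol is used. As one common illustration of how a server can learn only a sum, the sketch below uses pairwise additive masking: every pair of users agrees on a random mask that one adds and the other subtracts, so the masks cancel in the total. The modulus, the function name `mask_inputs`, and the centralized simulation of the pairwise agreement are assumptions; dropout handling and key agreement are omitted.

```python
import random

MOD = 2**32  # assumed modulus for the masked arithmetic


def mask_inputs(values, rng):
    """Return one masked value per user; the pairwise masks cancel when all are summed."""
    n = len(values)
    masked = [v % MOD for v in values]
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.randrange(MOD)  # mask shared by users i and j
            masked[i] = (masked[i] + m) % MOD
            masked[j] = (masked[j] - m) % MOD
    return masked


values = [12, 7, 30, 1]
masked = mask_inputs(values, random.Random(0))
total = sum(masked) % MOD
```

Each masked value on its own looks uniformly random to the server, while `total` equals the sum of the original inputs modulo `MOD`.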
Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training processes does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model while ensuring the resulting model also has acceptable predictive accuracy. Existing federated learning approaches either use secure multiparty computation (SMC), which is vulnerable to inference, or differential privacy, which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on 3 different machine learning algorithms. Our experiments demonstrate that our approach outperforms state-of-the-art solutions. CCS Concepts: Security and privacy → Privacy-preserving protocols; Trust frameworks. Computing methodologies → Learning settings.
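The claim that combining SMC with differential privacy limits noise growth can be illustrated with simple variance arithmetic. In the sketch below, if only the aggregate is revealed and at least `honest_parties` participants are assumed honest, each party can add Gaussian noise with a reduced standard deviation so that the honest contributions alone reach the target level. The formula, the function names, and the honest-party assumption are illustrative; they are not taken from the abstract, which does not give the exact noise calibration.

```python
import math


def per_party_std(target_std, honest_parties):
    """Noise std each party adds so that the honest parties alone reach target_std."""
    return target_std / math.sqrt(honest_parties)


def aggregate_std(per_party, contributing_parties):
    """Std of the sum of independent Gaussian noise terms with equal std."""
    return per_party * math.sqrt(contributing_parties)


# Local-DP style: 16 parties each add the full noise, so the aggregate noise grows.
local_total = aggregate_std(2.0, 16)
# With secure aggregation: each party adds a reduced share of the noise.
reduced = per_party_std(2.0, 16)
combined_total = aggregate_std(reduced, 16)
```

Under these assumptions the aggregate noise stays at the target level instead of growing with the square root of the number of parties.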