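Federated learning is a decentralized, privacy-preserving technique for carrying out machine or deep learning in a secure way. In this paper, we present the theoretical aspects of federated learning together with use cases in which the number of clients varies. Specifically, a medical image analysis use case is proposed, using chest X-ray images obtained from an open data repository. In addition to the privacy-related advantages, improvements in prediction (in terms of accuracy and area under the curve) and reductions in execution time are studied with respect to the centralized approach. Different clients are simulated from the training data and selected in an unbalanced manner, i.e., they do not all hold the same amount of data. The results obtained with three and ten clients are compared with each other and with the centralized case. Two approaches are analyzed for handling intermittent clients, since in a real scenario some clients may leave the training and new ones may join. The evolution of the results in terms of accuracy, area under the curve, and execution time is shown as a function of the number of clients into which the original data is divided. Finally, improvements and future work in the field are proposed.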
Translated by Google Translate
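The unbalanced-client setup in the first abstract above can be illustrated with sample-weighted averaging, a common aggregation rule in federated learning. The abstract does not say which rule the authors used, so this is only a minimal sketch; the function name, the client parameter vectors, and the sample counts are all made up for illustration.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighting each by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Three hypothetical clients holding different amounts of data.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sizes = [100, 300, 600]
global_weights = federated_average(weights, sizes)
print(global_weights)
```

Weighting by sample count means a client with little data pulls the global model less than a client with a lot, which is one simple way to account for clients that do not all hold the same amount of data.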
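Federated learning has been a hot research topic, enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a need to develop systems and infrastructure that ease the development of various federated learning algorithms. Similar to deep learning systems such as PyTorch and TensorFlow, which boost the development of deep learning, federated learning systems (FLSs) are equally important and face challenges in various aspects such as effectiveness, efficiency, and privacy. In this survey, we conduct a comprehensive review of federated learning systems. To provide a smooth flow and guide future research, we introduce the definition of federated learning systems and analyze the system components. Moreover, we provide a thorough categorization of federated learning systems according to six aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. The categorization can help in the design of federated learning systems, as shown in our case studies. By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.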
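Federated learning (FL) and split learning (SL) are two emerging collaborative learning methods that may greatly facilitate ubiquitous intelligence in the Internet of Things (IoT). Federated learning enables machine learning (ML) models trained locally on private data to be aggregated into a global model. Split learning allows different portions of an ML model to be trained collaboratively on different workers in a learning framework. Federated learning and split learning, each with unique advantages and respective limitations, may complement each other toward ubiquitous intelligence in IoT. Therefore, the combination of federated learning and split learning has recently become an active research area attracting extensive interest. In this article, we review the latest developments in federated learning and split learning and present a survey of state-of-the-art techniques for combining these two learning methods in an edge-computing-based IoT environment. We also identify some open problems and discuss possible directions for future research in this area, with the hope of further arousing the research community's interest in this emerging field.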
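This work investigates the possibilities of federated learning for IoT malware detection and studies the security issues inherent to this new learning paradigm. In this context, a framework that uses federated learning to detect malware affecting IoT devices is presented. N-BaIoT, a dataset modeling the network traffic of several real IoT devices while affected by malware, has been used to evaluate the proposed framework. Both supervised and unsupervised federated models (multi-layer perceptron and autoencoder) able to detect malware affecting seen and unseen IoT devices of N-BaIoT have been trained and evaluated. Furthermore, their performance has been compared with two traditional approaches. The first lets each participant train a model locally using only its own data, while the second has the participants share their data with a central entity in charge of training a global model. This comparison shows that the use of more diverse and larger data, as done in the federated and centralized methods, has a considerable positive impact on model performance. In addition, the federated models, while preserving the participants' privacy, achieve results similar to those of the centralized ones. As an additional contribution, and to measure the robustness of the federated approach, an adversarial setup with several malicious participants poisoning the federated model has been considered. The baseline averaging step used for model aggregation in most federated learning algorithms appears highly vulnerable to different attacks, even with a single adversary. The performance of other model aggregation functions acting as countermeasures is therefore evaluated under the same attack scenarios. These functions provide a significant improvement against malicious participants, but more effort is still needed to make federated approaches robust.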
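To make the vulnerability of plain averaging concrete, the sketch below compares a coordinate-wise mean with a coordinate-wise median when one participant submits a poisoned update. The abstract does not name the alternative aggregation functions it evaluates, so the median is used here only as one well-known robust option; the `aggregate` helper and all numbers are hypothetical.

```python
from statistics import mean, median
from typing import List, Sequence


def aggregate(updates: Sequence[Sequence[float]], rule: str = "mean") -> List[float]:
    """Combine client updates coordinate by coordinate with the chosen rule."""
    reducer = {"mean": mean, "median": median}[rule]
    return [reducer(coords) for coords in zip(*updates)]


# Four honest participants send similar updates (made-up values).
honest = [[0.9, 1.1], [1.0, 1.0], [1.1, 0.9], [1.0, 1.0]]
# A single hypothetical adversary sends an extreme update.
poisoned = honest + [[100.0, -100.0]]

avg = aggregate(poisoned, "mean")    # dragged far away from the honest updates
med = aggregate(poisoned, "median")  # stays at the honest consensus
print(avg, med)
```

With one outlier among five updates, the mean moves by roughly a fifth of the outlier's magnitude, while the median is unchanged as long as honest participants form a majority in each coordinate.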
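Recent advances in communication technologies and the Internet, together with artificial intelligence (AI), have enabled smart healthcare. Traditionally, AI techniques require centralized data collection and processing, which may be infeasible in realistic healthcare scenarios because of the scale of modern healthcare networks and growing data privacy concerns. Federated learning (FL), an emerging distributed collaborative AI paradigm, is particularly attractive for smart healthcare because it coordinates multiple clients (e.g., hospitals) to perform AI training without sharing raw data. Accordingly, we provide a comprehensive survey on the use of FL in smart healthcare. First, we present recent advances in FL and the motivations and requirements for using FL in smart healthcare. Recent FL designs for smart healthcare are then discussed, ranging from resource-aware FL and secure and privacy-aware FL to incentive FL and personalized FL. Subsequently, we provide a state-of-the-art review of the emerging applications of FL in key healthcare domains, including health data management, remote health monitoring, medical imaging, and COVID-19 detection. Several recent FL-based smart healthcare projects are analyzed, and the key lessons learned from the survey are highlighted. Finally, we discuss interesting research challenges and possible directions for future FL research in smart healthcare.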
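The emergence of federated learning has facilitated large-scale data exchange among machine learning models while maintaining privacy. Despite its brief history, federated learning is evolving rapidly to make wider use more practical. One of the most significant advances in this domain is the incorporation of transfer learning into federated learning, which overcomes fundamental constraints of primary federated learning, particularly in terms of security. This chapter presents a comprehensive survey of the intersection of federated and transfer learning from a security point of view. The main goal of this study is to uncover potential vulnerabilities that might compromise the privacy and performance of systems that use federated and transfer learning, along with the corresponding defense mechanisms.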
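Federated learning (FL) is a system in which a central aggregator coordinates the efforts of multiple clients to solve machine learning problems. This setting allows training data to remain dispersed in order to protect privacy. The purpose of this paper is to provide an overview of FL systems with a focus on healthcare. FL is evaluated here based on its frameworks, architectures, and applications. It is shown that FL solves the preceding issues with a shared global deep learning (DL) model via a central aggregator server. Inspired by the rapid growth of FL research, this paper examines recent developments and provides a list of unresolved issues. In the context of FL, several privacy methods are described, including secure multiparty computation, homomorphic encryption, differential privacy, and stochastic gradient descent. Furthermore, a review of various FL classes, such as horizontal and vertical FL and federated transfer learning, is provided. FL has applications in wireless communication, service recommendation, intelligent medical diagnosis systems, and healthcare, all of which are discussed in this paper. We also present a thorough review of existing FL challenges, such as privacy protection, communication cost, system heterogeneity, and unreliable model upload, followed by future research directions.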
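With the advent of the IoT, AI, and ML/DL algorithms, data-driven medical applications have emerged as a promising tool for designing reliable and scalable diagnostic and prognostic models from medical data. This has attracted a great deal of attention from academia and industry in recent years and has undoubtedly improved the quality of healthcare delivery. However, these AI-based medical applications still see poor adoption because of the difficulty of satisfying strict security, privacy, and quality-of-service standards (such as low latency). Moreover, medical data are usually fragmented and private, which makes it challenging to produce robust results across populations. Recent developments in federated learning (FL) have made it possible to train complex machine learning models in a distributed manner. FL has therefore become an active research area, particularly for processing medical data at the edge of the network in a decentralized way to preserve privacy and address security concerns. To this end, this survey paper highlights the current state and future of FL technology in medical applications where data sharing is a significant burden. It also reviews and discusses current research trends and their outcomes for designing reliable and scalable FL models. We outline the general statistical problems of FL, device challenges, security and privacy concerns, and its potential in the medical domain. Moreover, our study also focuses on medical applications, where we highlight the global burden of cancer and the efficient use of FL to develop computer-aided diagnosis tools that address it. We hope that this review serves as a checkpoint that sets out the existing state-of-the-art work in a thorough manner and offers open problems and future research directions for this field.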
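Federated learning (FL) was proposed to facilitate the training of models in a distributed environment. It supports the protection of (local) data privacy and uses local resources for model training. Until now, the majority of research has been devoted to "core issues", such as the adaptation of machine learning algorithms to FL, data privacy protection, or dealing with the effects of uneven data distribution between clients. This contribution is anchored in a practical use case in which FL is to be actually deployed within an Internet of Things ecosystem. Hence, somewhat different issues need to be considered beyond the popular considerations found in the literature. Moreover, an architecture that enables the building of flexible and adaptable FL solutions is introduced.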
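Federated learning (FL) and split learning (SL) are two popular distributed machine learning approaches. Both follow a model-to-data scenario: clients train and test machine learning models without sharing raw data. SL provides better model privacy than FL because the machine learning model architecture is split between the clients and the server. Moreover, the split model makes SL a better option for resource-constrained environments. However, SL performs slower than FL because of the relay-based training across multiple clients. In this regard, this paper presents a novel approach, named splitfed learning (SFL), that amalgamates the two approaches and eliminates their inherent drawbacks, along with a refined architectural configuration that incorporates differential privacy and PixelDP to enhance data privacy and model robustness. Our analysis and empirical results demonstrate that (pure) SFL provides test accuracy and communication efficiency similar to SL while significantly decreasing the computation time per global epoch compared with SL for multiple clients. Furthermore, as in SL, its communication efficiency over FL improves with the number of clients. In addition, the performance of SFL with privacy and robustness measures is further evaluated under extended experimental settings.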
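The idea of splitting a model between client and server can be sketched with a toy two-layer linear model. This shows plain split learning only, not the SFL method of the abstract, and it makes several assumptions for brevity: each side holds a single scalar weight, the server holds the labels, and the `Client`, `Server`, and `train` names are hypothetical.

```python
from typing import List, Sequence, Tuple


class Client:
    """Holds the first layer; raw inputs never leave this object."""

    def __init__(self, w: float) -> None:
        self.w = w
        self.x = 0.0

    def forward(self, x: float) -> float:
        self.x = x
        return self.w * x  # intermediate activation sent to the server

    def backward(self, grad_h: float, lr: float) -> None:
        # Chain rule: dL/dw = dL/dh * dh/dw, with dh/dw = x.
        self.w -= lr * grad_h * self.x


class Server:
    """Holds the second layer and, in this sketch, the labels."""

    def __init__(self, w: float) -> None:
        self.w = w

    def step(self, h: float, y: float, lr: float) -> Tuple[float, float]:
        err = self.w * h - y
        grad_h = err * self.w        # gradient returned to the client
        self.w -= lr * err * h       # local update of the server layer
        return 0.5 * err * err, grad_h


def train(client: Client, server: Server,
          data: Sequence[Tuple[float, float]],
          lr: float = 0.05, epochs: int = 200) -> List[float]:
    """Run split training and return the summed loss of each epoch."""
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            h = client.forward(x)
            loss, grad_h = server.step(h, y, lr)
            client.backward(grad_h, lr)
            total += loss
        losses.append(total)
    return losses


client, server = Client(0.5), Server(0.5)
data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)]  # made-up samples of y = 2x
losses = train(client, server, data)
print(client.w * server.w, losses[-1])
```

Only the activation `h` and its gradient cross the client-server boundary, which is the property that lets the client keep both its raw data and its part of the model private from the server.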
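Privacy regulation laws, such as GDPR, impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving remarkable results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data from leaking to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerability of standard federated learning to poisoning and inference, raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core of SGDE is to share data generators with strong differential privacy guarantees, trained on private data, instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retains the distinctive features of the private samples but differs substantially from them. We show how including SGDE in a cross-silo federated network improves resilience to the most influential attacks on federated learning. We test our approach on image and tabular datasets, using beta-variational autoencoders as data generators, and highlight fairness and performance improvements over local and federated learning on non-generated data.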
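Nowadays, information technology is developing rapidly. In the big data era, the privacy of personal information has become a more prominent concern. The major challenge is to find a way to guarantee that sensitive personal information is not disclosed while data is published and analyzed. Centralized differential privacy is built on the assumption of a trusted third-party data curator. However, this assumption does not always hold in reality. As a new privacy-preserving model, local differential privacy offers relatively strong privacy guarantees. Although federated learning is a relatively privacy-preserving approach to distributed learning, it still introduces various privacy concerns. To avoid privacy threats and reduce communication costs, we propose integrating federated learning and local differential privacy with momentum gradient descent to improve the performance of machine learning models.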
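One way such a combination could look is sketched below: each client clips its gradient and adds Laplace noise before sending it, and the server applies a momentum update to the averaged noisy gradients. This is an illustrative assumption, not the authors' algorithm; the abstract gives no details of their mechanism, and every function name, bound, and privacy budget here is hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def clip(grad: Sequence[float], bound: float) -> List[float]:
    """Clamp every coordinate to [-bound, bound] so sensitivity is bounded."""
    return [max(-bound, min(bound, g)) for g in grad]


def laplace_noise(rng: random.Random, scale: float) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(grad: Sequence[float], bound: float, epsilon: float,
              rng: random.Random) -> List[float]:
    """Perturb a clipped gradient on the client before it is sent."""
    # Two clipped d-dimensional vectors differ by at most 2 * bound * d in L1.
    scale = 2.0 * bound * len(grad) / epsilon
    return [g + laplace_noise(rng, scale) for g in clip(grad, bound)]


def momentum_step(weights: Sequence[float], velocity: Sequence[float],
                  grad: Sequence[float], lr: float = 0.1,
                  beta: float = 0.9) -> Tuple[List[float], List[float]]:
    """Classical momentum: v = beta * v + g, then w = w - lr * v."""
    new_velocity = [beta * v + g for v, g in zip(velocity, grad)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


rng = random.Random(0)  # fixed seed so the sketch is repeatable
client_grads = [[0.4, -2.0], [0.6, -1.5], [0.5, -3.0]]  # made-up gradients
noisy = [privatize(g, bound=1.0, epsilon=2.0, rng=rng) for g in client_grads]
averaged = [sum(col) / len(noisy) for col in zip(*noisy)]
weights, velocity = momentum_step([0.0, 0.0], [0.0, 0.0], averaged)
print(weights)
```

Because the noise is added on the client, the server never sees an exact gradient, which removes the need for the trusted curator that centralized differential privacy assumes. Averaging over many clients is what keeps the noisy signal usable.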
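Recommender systems are proving to be an invaluable tool for extracting user-relevant content and helping users in their daily activities (e.g., finding relevant places to visit, content to consume, items to purchase). However, to be effective, these systems need to collect and analyze large volumes of personal data (e.g., location check-ins, movie ratings, click rates), which exposes users to numerous privacy threats. In this context, recommender systems based on federated learning (FL) appear to be a promising solution, since they compute accurate recommendations while keeping personal data on the users' devices. However, FL, and therefore FL-based recommender systems, rely on a central server that can experience scalability issues in addition to being vulnerable to attacks. To remedy this, we propose Pepper, a decentralized recommender system based on gossip learning principles. In Pepper, users gossip model updates asynchronously. At the heart of Pepper are two key components: a personalized peer-sampling protocol that keeps in the neighborhood of each node a proportion of nodes with interests similar to its own, and a simple yet effective model aggregation function that builds a model better suited to each user. Through experiments on three datasets implementing two use cases, location check-in recommendation and movie recommendation, we demonstrate that our solution converges up to 42% faster than other decentralized solutions and improves hit ratio and long-tail performance by up to 21% compared with decentralized competitors.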
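The gossip principle behind such a decentralized design can be sketched in a few lines. The example below uses uniform random peer selection and plain pairwise averaging, which are simplifying assumptions: the personalized peer sampling and the custom aggregation function of the abstract are not reproduced, and `gossip_round` and `spread` are hypothetical helper names.

```python
import random
from typing import List


def gossip_round(models: List[List[float]], rng: random.Random) -> None:
    """One asynchronous exchange: two random nodes average their models."""
    i, j = rng.sample(range(len(models)), 2)
    merged = [(a + b) / 2 for a, b in zip(models[i], models[j])]
    models[i] = merged
    models[j] = list(merged)


def spread(models: List[List[float]]) -> float:
    """Largest disagreement between any two nodes on the first parameter."""
    firsts = [m[0] for m in models]
    return max(firsts) - min(firsts)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
models = [[0.0], [4.0], [8.0], [12.0]]  # made-up one-parameter local models
for _ in range(200):
    gossip_round(models, rng)
print(spread(models), sum(m[0] for m in models) / len(models))
```

No node ever acts as a server: every exchange involves only two peers, and pairwise averaging keeps the network-wide mean unchanged while the local models drift toward agreement.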
Federated Learning (FL) has emerged as a promising distributed learning paradigm with the added advantage of data privacy. With growing interest in collaboration among data owners, FL has gained significant attention from organizations. The idea of FL is to enable collaborating participants to train machine learning (ML) models on decentralized data without breaching privacy. In simpler words, federated learning is the approach of "bringing the model to the data, instead of bringing the data to the model". Federated learning, when applied to data that is partitioned vertically across participants, is able to build a complete ML model by combining local models, each trained only on the distinct features held at its local site. This architecture of FL is referred to as vertical federated learning (VFL), which differs from conventional FL on horizontally partitioned data. As VFL is different from conventional FL, it comes with its own issues and challenges. In this paper, we present a structured literature review discussing the state-of-the-art approaches in VFL. Additionally, the literature review highlights the existing solutions to challenges in VFL and provides potential research directions in this domain.
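As a rough illustration of vertical partitioning, the sketch below has two parties hold different feature columns for the same sample. Each party computes a partial score with its own local weights, and only these partial scores are combined into a prediction. It assumes a simple logistic model and omits the protection (for example, encryption of the exchanged scores) that a real VFL system would add; the party names, features, and weights are hypothetical.

```python
import math
from typing import Sequence


def partial_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Score computed locally by one party from its own feature columns."""
    return sum(f * w for f, w in zip(features, weights))


def predict(partials: Sequence[float]) -> float:
    """Coordinator combines the partial scores and applies a sigmoid."""
    return 1.0 / (1.0 + math.exp(-sum(partials)))


# Same sample, different features held by each party (made-up values).
bank_features, bank_weights = [1.0, 2.0], [0.5, -0.25]
shop_features, shop_weights = [2.0], [0.5]

score_a = partial_score(bank_features, bank_weights)
score_b = partial_score(shop_features, shop_weights)
probability = predict([score_a, score_b])

# Reference: the same model evaluated on the concatenated features.
centralized = predict([partial_score(bank_features + shop_features,
                                     bank_weights + shop_weights)])
print(probability, centralized)
```

The combined prediction matches the centralized one because a linear score decomposes into a sum over feature blocks, so neither party needs to see the other's raw features.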
In artificial intelligence, the traditional approach of training machine learning models centrally on a server has several security and privacy deficiencies. To address this limitation, federated learning (FL) has been proposed and is known for breaking down "data silos" and protecting the privacy of users. However, FL has not yet gained popularity in industry, mainly because of concerns about its security and privacy and its high communication cost. To advance research in this field, build robust FL systems, and enable the wide application of FL, this paper systematically sorts out the possible attacks on current FL systems and the corresponding defenses. First, this paper briefly introduces the basic workflow of FL and related knowledge of attacks and defenses. It reviews a large body of recent research on privacy theft and malicious attacks. Most importantly, in view of the three current classification criteria, namely the three stages of machine learning, the three different roles in federated learning, and the CIA (Confidentiality, Integrity, and Availability) guidelines on privacy protection, we divide attack approaches into two categories according to the training stage and the prediction stage in machine learning. Furthermore, we identify the CIA property violated by each attack method and the potential attack role. Various defense mechanisms are then analyzed separately at the level of privacy and of security. Finally, we summarize the possible challenges in the application of FL from the perspective of attacks and defenses and discuss the future development direction of FL systems. In this way, a designed FL system can resist different attacks and be more secure and stable.
In recent years the applications of machine learning models have increased rapidly, due to the large amount of available data and technological progress. While some domains like web analysis can benefit from this with only minor restrictions, other fields, such as medicine with its patient data, are more strongly regulated. In particular, data privacy plays an important role, as recently highlighted by the trustworthy AI initiative of the EU and by general privacy regulations in legislation. Another major challenge is that the required training data is often distributed in terms of features or samples and unavailable for classical batch learning approaches. In 2016 Google came up with a framework called Federated Learning to solve both of these problems. We provide a brief overview of existing methods and applications in the field of vertical and horizontal Federated Learning, as well as Federated Transfer Learning.
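Wider coverage and better latency reduction in 5G require its combination with multi-access edge computing (MEC) technology. Decentralized deep learning (DDL), such as federated learning and swarm learning, is a promising solution for privacy-preserving data processing across millions of smart edge devices. It leverages distributed computing of multi-layer neural networks within the network of local clients, without disclosing the original local training data. Notably, in industries such as finance and healthcare, where sensitive data such as transactions and personal medical records is cautiously maintained, DDL can facilitate collaboration among these institutions to improve the performance of trained models while protecting the data privacy of participating clients. In this survey paper, we present the technical fundamentals of DDL, which benefits many walks of society through decentralized learning. Furthermore, we offer a comprehensive overview of the current state of the art in the field by outlining the challenges of DDL and the most relevant solutions from the novel perspectives of communication efficiency and reliability.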
In recent years, deep learning (DL) models have demonstrated remarkable achievements on non-trivial tasks such as speech recognition and natural language understanding. One of the significant contributors to its success is the proliferation of end devices that acted as a catalyst to provide data for data-hungry DL models. However, computing DL training and inference is the main challenge. Usually, central cloud servers are used for the computation, but it opens up other significant challenges, such as high latency, increased communication costs, and privacy concerns. To mitigate these drawbacks, considerable efforts have been made to push the processing of DL models to edge servers. Moreover, the confluence point of DL and edge has given rise to edge intelligence (EI). This survey paper focuses primarily on the fifth level of EI, called all in-edge level, where DL training and inference (deployment) are performed solely by edge servers. All in-edge is suitable when the end devices have low computing resources, e.g., Internet-of-Things, and other requirements such as latency and communication cost are important in mission-critical applications, e.g., health care. Firstly, this paper presents all in-edge computing architectures, including centralized, decentralized, and distributed. Secondly, this paper presents enabling technologies, such as model parallelism and split learning, which facilitate DL training and deployment at edge servers. Thirdly, model adaptation techniques based on model compression and conditional computation are described because the standard cloud-based DL deployment cannot be directly applied to all in-edge due to its limited computational resources. Fourthly, this paper discusses eleven key performance metrics to evaluate the performance of DL at all in-edge efficiently. Finally, several open research challenges in the area of all in-edge are presented.
Today's AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.
In recent years, mobile devices have been equipped with increasingly advanced sensing and computing capabilities. Coupled with advancements in Deep Learning (DL), this opens up countless possibilities for meaningful applications, e.g., for medical purposes and in vehicular networks. Traditional cloud-based Machine Learning (ML) approaches require the data to be centralized in a cloud server or data center. However, this results in critical issues related to unacceptable latency and communication inefficiency. To this end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer to the edge, where data is produced. However, conventional enabling technologies for ML at mobile edge networks still require personal data to be shared with external parties, e.g., edge servers. Recently, in light of increasingly stringent data privacy legislation and growing privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates rather than raw data to the server for aggregation. FL can serve as an enabling technology in mobile edge networks since it enables the collaborative training of an ML model and also enables DL for mobile edge network optimization. However, in a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved. This raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale. In this survey, we begin with an introduction to the background and fundamentals of FL. Then, we highlight the aforementioned challenges of FL implementation and review existing solutions. Furthermore, we present the applications of FL for mobile edge network optimization. Finally, we discuss the important challenges and future research directions in FL.