Smart Paper Notes

A Fair and Efficient Hybrid Federated Learning Framework based on XGBoost for Distributed Power Prediction

Haizhou Liu , Xuan Zhang , Xinwei Shen , Hongbin Sun

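Categories: Machine Learning | Artificial Intelligence

2022-01-08

In a modern power system, real-time data on power generation and consumption, together with the relevant features, are stored by various distributed parties, including household meters, transformer stations, and external organizations. To fully exploit the underlying patterns in these distributed data for accurate power prediction, federated learning is needed as a collaborative yet privacy-preserving training scheme. However, current federated learning frameworks are polarized toward addressing either the horizontal or the vertical separation of data, and tend to overlook the case where both are present. Furthermore, mainstream horizontal federated learning frameworks employ only artificial neural networks to learn the data patterns, even though tree-based models are considered more accurate and interpretable on tabular datasets. To this end, we propose a hybrid federated learning framework based on XGBoost for distributed power prediction from real-time external features. In addition to introducing boosted trees to improve accuracy and interpretability, we combine horizontal and vertical federated learning to address the scenario where features are scattered across heterogeneous local parties and samples are scattered across various local districts. Moreover, we design a dynamic task allocation scheme so that each party receives a fair share of information and the computing power of each party can be fully leveraged to boost training efficiency. A follow-up case study is presented to justify the necessity of adopting the proposed framework. The advantages of the proposed framework in fairness, efficiency, and accuracy are also confirmed.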

A Federated Learning Framework for Smart Grids: Securing Power Traces in Collaborative Learning

Haizhou Liu , Xuan Zhang , Xinwei Shen , Hongbin Sun

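Categories: Machine Learning

2021-03-22

With the deployment of smart sensors and advances in communication technologies, big data analytics has become vastly popular in the smart grid domain, informing stakeholders of the best power utilization strategy. However, these power-related data are stored and owned by different parties. For example, power consumption data are stored in numerous transformer stations across cities, while population mobility data, an important indicator of power consumption, are held by mobile companies. Direct data sharing might compromise party benefits, individual privacy, and even national security. Inspired by the federated learning scheme from Google AI, we propose a federated learning framework for smart grids that enables collaborative learning of power consumption patterns without leaking individual power traces. Horizontal federated learning is employed when data are scattered in the sample space; vertical federated learning, on the other hand, is designed for the case where data are scattered in the feature space. Case studies show that, with proper encryption schemes such as Paillier encryption, the machine learning models constructed with the proposed framework are lossless, privacy-preserving, and effective. Finally, the promising future of federated learning in other facets of the smart grid is discussed, including electric vehicles, distributed generation/consumption, and integrated energy systems.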
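
The abstract above names Paillier encryption as a suitable scheme. The sketch below illustrates the property that makes it useful here: ciphertexts can be combined so that an aggregator learns only the sum of the readings, not the individual values. It is a toy illustration, not the authors' implementation; the function names, the readings, and the two small primes are assumptions chosen for readability, and the key size is far too small for real use.

```python
import math
import random

def keygen(p, q):
    """Build a Paillier key pair from two distinct primes (toy sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, private_key, c):
    """Recover the plaintext with the private key (lam, mu)."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

# Hypothetical readings (in Wh) from three transformer stations.
n, private_key = keygen(104729, 1299709)  # toy primes, not secure
readings_wh = [1520, 980, 2310]
ciphertexts = [encrypt(n, r) for r in readings_wh]

encrypted_total = ciphertexts[0]
for ct in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, ct)

total_wh = decrypt(n, private_key, encrypted_total)
```

Only the holder of the private key can decrypt, and in this sketch it decrypts just the aggregated ciphertext, so individual power traces stay hidden from the party doing the addition.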

Fed-EINI: An Efficient and Interpretable Inference Framework for Decision Tree Ensembles in Federated Learning

Xiaolin Chen , Shuai Zhou , Bei Guan , Kai Yang , Hao Fan , Zejin Feng , Zhong Chen , Hu Wang , Yongji Wang

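Categories: Machine Learning | Artificial Intelligence

2021-05-20

Increasing concerns about data privacy and security drive an emerging field that studies privacy-preserving machine learning from isolated data sources, i.e., federated learning. One class of federated learning, vertical federated learning, in which different parties hold different features for common users, has great potential to drive a wide variety of business cooperation among enterprises in many fields. In machine learning, decision tree ensembles such as gradient boosting decision trees (GBDT) and random forests are widely applied, powerful models with high interpretability and modeling efficiency. However, state-of-the-art vertical federated learning frameworks adopt anonymous features to avoid possible data breaches, which compromises the interpretability of the model. To address this issue in the inference process, we first analyze the necessity of disclosing the meaning of features to the Guest Party in vertical federated learning. We then find that the prediction result of a tree can be expressed as the intersection of the results of the sub-models of the tree held by all parties. With this key observation, we protect data privacy and allow the disclosure of feature meanings by concealing decision paths, and we adapt a communication-efficient secure computation method for inference outputs. The advantages of Fed-EINI are demonstrated through both theoretical analysis and extensive numerical results. We improve the interpretability of the model by disclosing the meaning of features while ensuring efficiency and accuracy.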
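
The key observation in the abstract above, that a tree's prediction is the intersection of the results each party can compute from its own features, can be sketched as follows. This is a plain-text illustration only: the tree layout, party names, and feature names are hypothetical, and the secure computation that Fed-EINI uses to conceal decision paths is left out entirely.

```python
# A tiny tree split across two parties. Each internal node records which
# party owns the split feature; leaves carry an id and a prediction weight.
TREE = {
    "party": "A", "feature": "age", "threshold": 30,
    "left": {
        "party": "B", "feature": "income", "threshold": 40,
        "left": {"leaf": 0, "weight": -0.4},
        "right": {"leaf": 1, "weight": 0.7},
    },
    "right": {"leaf": 2, "weight": 0.1},
}

def candidate_leaves(node, party, local_features):
    """Leaves a sample could reach, judged only from one party's features."""
    if "leaf" in node:
        return {node["leaf"]}
    if node["party"] == party:
        # The party owns this split, so it can pick the real branch.
        go_left = local_features[node["feature"]] < node["threshold"]
        branch = node["left"] if go_left else node["right"]
        return candidate_leaves(branch, party, local_features)
    # The split belongs to someone else: both branches remain possible.
    return (candidate_leaves(node["left"], party, local_features)
            | candidate_leaves(node["right"], party, local_features))

def leaf_weights(node, table=None):
    """Collect {leaf id: weight} for the whole tree."""
    table = {} if table is None else table
    if "leaf" in node:
        table[node["leaf"]] = node["weight"]
    else:
        leaf_weights(node["left"], table)
        leaf_weights(node["right"], table)
    return table

# One user, whose features are held by two different parties.
from_a = candidate_leaves(TREE, "A", {"age": 25})
from_b = candidate_leaves(TREE, "B", {"income": 50})
reached = from_a & from_b
prediction = leaf_weights(TREE)[next(iter(reached))]
```

Party A narrows the sample down to leaves 0 and 1, party B to leaves 1 and 2, and the intersection leaves exactly the leaf that ordinary top-down traversal would reach.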

Federated XGBoost on Sample-Wise Non-IID Data

Katelinn Jones , Yuya Jeremy Ong , Yi Zhou , Nathalie Baracaldo

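Categories: Machine Learning | Artificial Intelligence

2022-09-03

Federated learning (FL) is a paradigm for jointly training machine learning algorithms in a decentralized manner. Most research in FL has focused on neural network-based approaches; tree-based methods such as XGBoost remain underexplored in federated learning because of the challenges posed by the iterative and additive nature of the algorithm. Decision tree-based models, XGBoost in particular, can handle non-IID data, which matters for algorithms used in federated learning frameworks because the underlying data are decentralized and at risk of being non-IID by nature. In this paper, we investigate how federated XGBoost is affected by non-IID distributions, by running experiments on various sample-size-based data skew scenarios and examining how the models perform under each of them. We conduct an extensive set of experiments across multiple datasets and different data skew partitions. Our experimental results show that, despite the various partition ratios, the performance of the models stays consistent and is close to or as good as that of models trained in a centralized manner.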

A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection

Qinbin Li , Zeyi Wen , Zhaomin Wu , Sixu Hu , Naibo Wang , Yuan Li , Xu Liu , Bingsheng He

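Categories: Machine Learning | Machine Learning (Statistics)

2019-07-23

Federated learning has been a hot research topic, enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, systems and infrastructure are needed to ease the development of various federated learning algorithms. Just as deep learning systems such as PyTorch and TensorFlow boost the development of deep learning, federated learning systems (FLSs) are equally important, and they face challenges in aspects such as effectiveness, efficiency, and privacy. In this survey, we conduct a comprehensive review of federated learning systems. To achieve a smooth flow and guide future research, we introduce the definition of federated learning systems and analyze the system components. Moreover, we provide a thorough categorization of federated learning systems according to six different aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. The categorization can help in the design of federated learning systems, as shown in our case studies. By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.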

Federated Learning in Mobile Edge Networks: A Comprehensive Survey

Wei Yang Bryan Lim , Nguyen Cong Luong , Dinh Thai Hoang , Yutao Jiao , Ying-Chang Liang , Qiang Yang , Dusit Niyato , Chunyan Miao

Categories:

2019-09-26

In recent years, mobile devices have been equipped with increasingly advanced sensing and computing capabilities. Coupled with advancements in Deep Learning (DL), this opens up countless possibilities for meaningful applications, e.g., for medical purposes and in vehicular networks. Traditional cloud-based Machine Learning (ML) approaches require the data to be centralized in a cloud server or data center. However, this results in critical issues related to unacceptable latency and communication inefficiency. To this end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer to the edge, where data is produced. However, conventional enabling technologies for ML at mobile edge networks still require personal data to be shared with external parties, e.g., edge servers. Recently, in light of increasingly stringent data privacy legislation and growing privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates rather than raw data to the server for aggregation. FL can serve as an enabling technology in mobile edge networks since it enables the collaborative training of an ML model and also enables DL for mobile edge network optimization. However, in a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved. This raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale. In this survey, we begin with an introduction to the background and fundamentals of FL. Then, we highlight the aforementioned challenges of FL implementation and review existing solutions. Furthermore, we present the applications of FL for mobile edge network optimization. Finally, we discuss the important challenges and future research directions in FL.
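
The training loop described in the abstract above (devices train on local data, send model updates instead of raw data, and the server aggregates) can be sketched with a weighted-average aggregation rule in the style of federated averaging. This is a minimal illustration on a one-dimensional linear model; the function names, learning rate, and the two toy clients are assumptions, and no networking, privacy mechanism, or client sampling is modeled.

```python
from typing import List, Tuple

Weights = Tuple[float, float]        # (slope, intercept) of y = w * x + b
Dataset = List[Tuple[float, float]]  # local (x, y) pairs that stay on the device

def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.5, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one device's local data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def aggregate(updates: List[Weights], sizes: List[int]) -> Weights:
    """Server step: average the client models, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b

def federated_training(clients: List[Dataset], rounds: int = 200) -> Weights:
    """Alternate local training and server aggregation for a number of rounds."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = aggregate(updates, [len(d) for d in clients])
    return global_weights

# Two hypothetical devices whose data come from y = 2x + 1 but cover
# different ranges of x, so neither sees the full picture on its own.
clients = [
    [(0.0, 1.0), (0.2, 1.4), (0.4, 1.8)],
    [(0.6, 2.2), (0.8, 2.6), (1.0, 3.0)],
]
w, b = federated_training(clients)
```

Only the pair `(w, b)` crosses the device boundary in each round; the `(x, y)` samples are never passed to `aggregate`.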

Combined Federated and Split Learning in Edge Computing for Ubiquitous Intelligence in Internet of Things: State of the Art and Future Directions

Qiang Duan , Shijing Hu , Ruijun Deng , Zhihui Lu

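Categories: Machine Learning

2022-07-20

Federated learning (FL) and split learning (SL) are two emerging collaborative learning methods that may greatly facilitate ubiquitous intelligence in the Internet of Things (IoT). Federated learning enables machine learning (ML) models trained locally on private data to be aggregated into a global model. Split learning allows different portions of an ML model to be trained collaboratively on different workers in a learning framework. Federated learning and split learning each have unique advantages and respective limitations, and they may complement each other in achieving ubiquitous intelligence in IoT. The combination of federated learning and split learning has therefore recently become an active research area attracting extensive interest. In this article, we review the latest developments in federated learning and split learning and present a survey of state-of-the-art technologies for combining these two learning methods in an edge computing-based IoT environment. We also identify some open problems and discuss possible directions for future research in this area, in the hope of further arousing the research community's interest in this emerging field.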

Federated Machine Learning: Concept and Applications

Qiang Yang , Yang Liu , Tianjian Chen , Yongxin Tong

Categories:

2019-02-13

Today's AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.

Federated Learning for 5G Base Station Traffic Forecasting

Vasileios Perifanis , Nikolaos Pavlidis , Remous-Aris Koutsiamanis , Pavlos S. Efraimidis

Categories: Machine Learning | Artificial Intelligence

2022-11-28

Mobile traffic prediction is of great importance on the path of enabling 5G mobile networks to perform smart and efficient infrastructure planning and management. However, available data are limited to base station logging information. Hence, training methods for generating high-quality predictions that can generalize to new observations on different parties are in demand. Traditional approaches require collecting measurements from different base stations and sending them to a central entity, followed by performing machine learning operations using the received data. The dissemination of local observations raises privacy, confidentiality, and performance concerns, hindering the applicability of machine learning techniques. Various distributed learning methods have been proposed to address this issue, but their application to traffic prediction has yet to be explored. In this work, we study the effectiveness of federated learning applied to raw base station aggregated LTE data for time-series forecasting. We evaluate one-step predictions using 5 different neural network architectures trained with a federated setting on non-iid data. The presented algorithms have been submitted to the Global Federated Traffic Prediction for 5G and Beyond Challenge. Our results show that the learning architectures adapted to the federated setting achieve equivalent prediction error to the centralized setting, pre-processing techniques on base stations lead to higher forecasting accuracy, while state-of-the-art aggregators do not outperform simple approaches.

Vertical Federated Learning: A Structured Literature Review

Afsana Khan , Marijn ten Thij , Anna Wilbik

Categories: Machine Learning | Artificial Intelligence

2022-12-01

Federated Learning (FL) has emerged as a promising distributed learning paradigm with the added advantage of data privacy. With the growing interest in collaboration among data owners, FL has gained significant attention from organizations. The idea of FL is to enable collaborating participants to train machine learning (ML) models on decentralized data without breaching privacy. In simpler words, federated learning is the approach of "bringing the model to the data, instead of bringing the data to the model". Federated learning, when applied to data which is partitioned vertically across participants, is able to build a complete ML model by combining local models trained only using the data with distinct features at the local sites. This architecture of FL is referred to as vertical federated learning (VFL), which differs from the conventional FL on horizontally partitioned data. As VFL is different from conventional FL, it comes with its own issues and challenges. In this paper, we present a structured literature review discussing the state-of-the-art approaches in VFL. Additionally, the literature review highlights the existing solutions to challenges in VFL and provides potential research directions in this domain.

Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost

Wuxing Xu , Hao Fan , Kaixin Li , Kai Yang

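Categories: Artificial Intelligence | Machine Learning

2021-12-08

More and more organizations and institutions are making efforts to use external data to improve the performance of AI services. To address data privacy and security concerns, federated learning has attracted increasing attention from both academia and industry as a way to securely build AI models across multiple isolated data providers. In this paper, we study the efficiency problem of adapting the widely used XGBoost model to the vertical federated learning setting in real-world applications. State-of-the-art vertical federated XGBoost frameworks require a large number of encryption operations and ciphertext transmissions, which makes model training much less efficient than training XGBoost models locally. To bridge this gap, we propose a novel batch homomorphic encryption method that reduces the cost of encryption-related computation and transmission. This is achieved by encoding the first-order derivative and the second-order derivative into a single number for encryption, ciphertext transmission, and homomorphic addition. The sum of multiple first-order derivatives and the sum of multiple second-order derivatives can be decoded simultaneously from the sum of the encoded values. We are motivated by the batching idea in the BatchCrypt work on federated learning, and we design a novel batching method that addresses its limitation of allowing only a few negative numbers. The encoding procedure of the proposed method consists of four steps (shifting, truncating, quantizing, and batching), while the decoding procedure consists of de-quantization and shifting back. The advantages of our method are demonstrated through theoretical analysis and extensive numerical experiments.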
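
The encoding idea in the abstract above, packing a first-order and a second-order derivative into one number so that a single addition sums both, can be sketched on plain integers. This is a simplified illustration and not the paper's exact scheme: the bit widths, value ranges, and function names are assumptions, and the homomorphic encryption layer is omitted (the integer additions below stand in for homomorphic additions on ciphertexts).

```python
SCALE_BITS = 16            # quantization precision (assumed)
SCALE = 1 << SCALE_BITS
SLOT_BITS = 40             # low slot; holds the sum of up to 2**24 quantized h values
G_SHIFT = 1.0              # shift that makes g non-negative, assuming g in [-1, 1]

def encode(g: float, h: float) -> int:
    """Shift, truncate, quantize, and pack (g, h) into a single integer."""
    g = min(max(g, -1.0), 1.0)            # truncate to the assumed ranges
    h = min(max(h, 0.0), 1.0)
    g_q = round((g + G_SHIFT) * SCALE)    # shifted, so never negative
    h_q = round(h * SCALE)
    return (g_q << SLOT_BITS) | h_q       # g in the high slot, h in the low slot

def decode_sum(total: int, count: int):
    """Recover (sum of g, sum of h) from the sum of `count` encoded values."""
    h_sum_q = total & ((1 << SLOT_BITS) - 1)
    g_sum_q = total >> SLOT_BITS
    g_sum = g_sum_q / SCALE - count * G_SHIFT   # de-quantize, then shift back
    h_sum = h_sum_q / SCALE
    return g_sum, h_sum

# Hypothetical per-sample derivatives falling into one histogram bin.
gradients = [0.5, -0.25, 0.125]
hessians = [0.25, 0.1875, 0.0625]

packed_total = sum(encode(g, h) for g, h in zip(gradients, hessians))
g_sum, h_sum = decode_sum(packed_total, len(gradients))
```

Because the shift is applied once per sample, the decoder has to know how many values were summed in order to undo it; the slot width bounds how many values can be added before the low slot overflows into the high one.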

The OARF Benchmark Suite: Characterization and Implications for Federated Learning Systems

Sixu Hu , Yuan Li , Xu Liu , Qinbin Li , Zhaomin Wu , Bingsheng He

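Categories: Machine Learning | Machine Learning (Statistics)

2020-06-14

This paper presents and characterizes the Open Application Repository for Federated Learning (OARF), a benchmark suite for federated machine learning systems. Previously available benchmarks for federated learning have focused mainly on synthetic datasets and use a limited number of applications. OARF mimics more realistic application scenarios, using publicly available datasets as different data silos in image, text, and structured data. Our characterization shows that the benchmark suite is diverse in data size, distribution, feature distribution, and learning task complexity. Extensive evaluations with reference implementations reveal future research opportunities for important aspects of federated learning systems. We have developed reference implementations and evaluated important aspects of federated learning, including model accuracy, communication cost, throughput, and convergence time. Through these evaluations, we made some interesting findings, for example that federated learning can effectively increase end-to-end throughput.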

Federated learning: Applications, challenges and future directions

Subrato Bharati , M. Rubaiyat Hossain Mondal , Prajoy Podder , V. B. Surya Prasath

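Categories: Machine Learning | Artificial Intelligence

2022-05-18

Federated learning (FL) is a system in which a central aggregator coordinates the efforts of multiple clients to solve machine learning problems. This setting allows training data to be dispersed in order to protect privacy. The purpose of this paper is to provide an overview of FL systems with a focus on healthcare. FL is evaluated here in terms of its frameworks, architectures, and applications. It is shown that FL addresses the preceding issues through a shared global deep learning (DL) model coordinated by a central aggregator server. Inspired by the rapid growth of FL research, this paper examines recent developments and provides a list of unresolved issues. In the context of FL, several privacy methods are described, including secure multiparty computation, homomorphic encryption, differential privacy, and stochastic gradient descent. Furthermore, a review of the various classes of FL, such as horizontal and vertical FL and federated transfer learning, is provided. FL has applications in wireless communication, service recommendation, intelligent medical diagnosis systems, and healthcare, all of which are discussed in this paper. We also present a thorough review of existing FL challenges, such as privacy protection, communication cost, system heterogeneity, and unreliable model upload, followed by future research directions.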

Edge-Native Intelligence for 6G Communications Driven by Federated Learning: A Survey of Trends and Challenges

Mohammad Al-Quraan , Lina Mohjazi , Lina Bariah , Anthony Centeno , Ahmed Zoha , Sami Muhaidat , Mérouane Debbah , Muhammad Ali Imran

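Categories: Artificial Intelligence

2021-11-14

The unprecedented surge in data volume in wireless networks empowered with artificial intelligence (AI) opens up new horizons for providing ubiquitous data-driven intelligent services. Traditional cloud-centric machine learning (ML)-based services are implemented by collecting datasets and training models centrally. However, this conventional training technique faces two challenges: (i) high communication and energy costs due to increased data communication, and (ii) threats to data privacy from allowing untrusted parties to use this information. Recently, in light of these limitations, a new technique known as federated learning (FL) has emerged to bring ML to the edge of wireless networks. FL can extract the benefits of data silos by training a global model in a distributed manner, orchestrated by the FL server. FL exploits both the decentralized datasets and the computing resources of participating clients to develop a generalized ML model without compromising data privacy. In this article, we present a comprehensive survey of the fundamentals and enabling technologies of FL. Moreover, an extensive study is presented detailing various applications of FL in wireless networks and highlighting their challenges and limitations. The efficacy of FL is further explored for emerging beyond-fifth-generation (B5G) and sixth-generation (6G) communication systems. The purpose of this survey is to provide an overview of the state of the art of FL applications in key wireless technologies, which will serve as a foundation for a firm understanding of the topic. Finally, we offer a road forward for future research directions.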

Federated Learning: Challenges, Methods, and Future Directions

Tian Li , Anit Kumar Sahu , Ameet Talwalkar , Virginia Smith

Categories:

2019-08-21

Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving data analysis. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.

Federated learning and next generation wireless communications: A survey on bidirectional relationship

Debaditya Shome , Omer Waqar , Wali Ullah Khan

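Categories: Machine Learning

2021-10-14

To meet the extremely heterogeneous requirements of next-generation wireless communication networks, the research community increasingly depends on machine learning solutions for real-time decision-making and radio resource management. Traditional machine learning employs a fully centralized architecture in which the entire training data is collected at one node, e.g., a cloud server, which significantly increases the communication overhead and raises severe privacy concerns. To this end, a distributed machine learning paradigm termed federated learning (FL) has recently been proposed. In FL, each participating edge device trains its local model using its own training data. The weights or parameters of the locally trained models are then sent over wireless channels to the central parameter server (PS), which aggregates them and updates the global model. On the one hand, FL plays an important role in optimizing the resources of wireless communication networks; on the other hand, wireless communication is crucial for FL. Thus, a "bidirectional" relationship exists between FL and wireless communications. Although FL is an emerging concept, many publications have already appeared on FL and its applications to next-generation wireless networks. Nevertheless, we noticed that none of these works highlights the bidirectional relationship between FL and wireless communications. The purpose of this survey paper is therefore to bridge this gap in the literature by providing a timely and comprehensive discussion of the interdependency between FL and wireless communications.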

A Hybrid Approach to Privacy-Preserving Federated Learning

Stacey Truex , Nathalie Baracaldo , Ali Anwar , Thomas Steinke , Heiko Ludwig , Rui Zhang , Yi Zhou

Categories:

2018-12-07

Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training processes does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model while ensuring the resulting model also has acceptable predictive accuracy. Existing federated learning approaches either use secure multiparty computation (SMC), which is vulnerable to inference, or differential privacy, which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on 3 different machine learning algorithms. Our experiments demonstrate that our approach outperforms state-of-the-art solutions. CCS Concepts: Security and privacy → Privacy-preserving protocols; Trust frameworks. Computing methodologies → Learning settings.
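
One way to read the claim above about reducing noise growth: if a secure aggregation step reveals only the sum of the parties' reports, each party can add a smaller share of the noise so that the total still reaches a target level. The sketch below illustrates that arithmetic under strong simplifying assumptions. It ignores the paper's trust parameter, uses zero-sum masks generated in one place as a stand-in for real SMC, and all names and constants are hypothetical.

```python
import math
import random

def per_party_sigma(target_sigma: float, n_parties: int) -> float:
    """Noise scale per party so that n independent Gaussian noises
    add up to a total standard deviation of target_sigma."""
    return target_sigma / math.sqrt(n_parties)

def zero_sum_masks(n_parties: int, rng: random.Random):
    """Stand-in for SMC: large masks that cancel when all reports are summed.
    A real protocol would derive these without any single trusted generator."""
    masks = [rng.randrange(-10**9, 10**9) for _ in range(n_parties - 1)]
    masks.append(-sum(masks))
    return masks

def noisy_masked_reports(values, target_sigma: float, rng: random.Random):
    """Each party reports value + its small noise share + its mask."""
    n = len(values)
    sigma_i = per_party_sigma(target_sigma, n)
    masks = zero_sum_masks(n, rng)
    return [v + rng.gauss(0.0, sigma_i) + m for v, m in zip(values, masks)]

rng = random.Random(7)
local_values = [3.2, 1.5, 4.1, 2.2]       # hypothetical per-party statistics
reports = noisy_masked_reports(local_values, target_sigma=2.0, rng=rng)
noisy_total = sum(reports)                 # masks cancel; only the noise remains
```

A single report reveals little on its own because of the mask, while the aggregate carries noise at roughly the target scale rather than noise that grows with the number of parties.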

Orchestrating Collaborative Cybersecurity: A Secure Framework for Distributed Privacy-Preserving Threat Intelligence Sharing

Juan R. Troncoso-Pastoriza , Alain Mermoud , Romain Bouyé , Francesco Marino , Jean-Philippe Bossuat , Vincent Lenders , Jean-Pierre Hubaux

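Categories: Artificial Intelligence

2022-09-06

Cyber Threat Intelligence (CTI) sharing is an important activity for reducing information asymmetries between attackers and defenders. However, this activity presents challenges because of the tension between data sharing and confidentiality, which results in information retention and often leads to a free-rider problem. The information that is shared therefore represents only the tip of the iceberg. The current literature assumes access to centralized databases containing all the information, but this is not always feasible because of the aforementioned tension. The result is unbalanced or incomplete datasets, which require techniques to expand them. We show how these techniques lead to biased results and misleading performance expectations. We propose a novel framework for extracting CTI from distributed data on incidents, vulnerabilities, and indicators of compromise, and we demonstrate its use in several practical scenarios in conjunction with the Malware Information Sharing Platform (MISP). Policy implications for CTI sharing are presented and discussed. The proposed system relies on an efficient combination of privacy-enhancing technologies and federated processing. This lets organizations stay in control of their CTI and minimize the risk of exposure or leakage, while gaining the benefits of sharing: more accurate and representative results, and more effective predictive and preventive defenses.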

Federated Learning for Smart Healthcare: A Survey

Dinh C. Nguyen , Quoc-Viet Pham , Pubudu N. Pathirana , Ming Ding , Aruna Seneviratne , Zihuai Lin , Octavia A. Dobre , Won-Joo Hwang

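Categories: Machine Learning

2021-11-16

Recent advances in communication technologies and the Internet of Medical Things have transformed smart healthcare enabled by artificial intelligence (AI). Traditionally, AI techniques require centralized data collection and processing, which may be infeasible in realistic healthcare scenarios because of the large scale of modern healthcare networks and growing data privacy concerns. Federated learning (FL), as an emerging distributed collaborative AI paradigm, is particularly attractive for smart healthcare because it coordinates multiple clients (e.g., hospitals) to perform AI training without sharing raw data. Accordingly, we provide a comprehensive survey of the use of FL in smart healthcare. First, we present the recent advances in FL, and the motivations and requirements for using FL in smart healthcare. Recent FL designs for smart healthcare are then discussed, ranging from resource-aware FL and secure and privacy-aware FL to incentive FL and personalized FL. Subsequently, we provide a state-of-the-art review of the emerging applications of FL in key healthcare domains, including health data management, remote health monitoring, medical imaging, and COVID-19 detection. Several recent FL-based smart healthcare projects are analyzed, and the key lessons learned from the survey are highlighted. Finally, we discuss interesting research challenges and possible directions for future FL research in smart healthcare.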

Federated Learning with Hyperparameter-based Clustering for Electrical Load Forecasting

Nastaran Gholizadeh , Petr Musilek

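Categories: Machine Learning

2021-11-14

Electrical load forecasting has become an integral part of power system operation. Deep learning models have become popular for this purpose. However, to achieve the desired prediction accuracy, they require large amounts of training data. Sharing the electricity consumption data of individual households for load forecasting may compromise user privacy and can be expensive in terms of communication resources. Edge computing methods such as federated learning are therefore gaining importance for this purpose. These methods can make use of the data without storing it centrally. This paper evaluates the performance of federated learning for short-term forecasting of individual house loads as well as the aggregate load. It discusses the advantages and disadvantages of the method by comparing it with centralized and local learning schemes. In addition, a new client clustering method is proposed to reduce the convergence time of federated learning. The results show that federated learning performs well, with a minimum root mean squared error (RMSE) of 0.117 kWh for individual load forecasting.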
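
For readers unfamiliar with the metric quoted above, RMSE is the square root of the mean squared difference between measured and forecast values, so it is expressed in the same unit as the load (kWh here). A minimal sketch with made-up numbers, not data from the paper:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Hypothetical hourly household loads in kWh and a forecast for the same hours.
measured_kwh = [0.42, 0.55, 0.61, 0.38]
forecast_kwh = [0.40, 0.60, 0.55, 0.45]
error_kwh = rmse(measured_kwh, forecast_kwh)
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.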
