Smart Paper Notes

Knowledge Distillation for Federated Learning: a Practical Guide

Alessio Mora , Irene Tenison , Paolo Bellavista , Irina Rish

Category: Machine Learning

2022-11-09

Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. This paves the way for stronger privacy guarantees when building predictive models. The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well-known limits: (i) clients must implement the same model architecture; (ii) transmitting model weights and model updates implies a high communication cost, which scales up with the number of model parameters; (iii) in the presence of non-IID data distributions, parameter-averaging aggregation schemes perform poorly due to client model drift. Federated adaptations of regular Knowledge Distillation (KD) can solve and/or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we provide a review of KD-based algorithms tailored for specific FL issues.
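
The parameter-averaging step that this abstract contrasts with KD can be illustrated with a small sketch. It assumes each client model is flattened into a plain list of floats and that clients are weighted by local dataset size; the function name `fedavg` and the toy numbers are illustrative and do not come from the paper.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    # Limit (i) from the abstract: every client must share one architecture,
    # which here means every parameter vector has the same length.
    if any(len(w) != dim for w in client_weights):
        raise ValueError("parameter averaging requires identical model shapes")
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two toy clients holding 1 and 3 samples respectively (hypothetical values).
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

The shape check makes limit (i) concrete, and the size of each uploaded vector is what drives the communication cost in limit (ii).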

FedDTG:Federated Data-Free Knowledge Distillation via Three-Player Generative Adversarial Networks

Zhenyuan Zhang

Category: Machine Learning

2022-01-10

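Applying knowledge distillation to personalized cross-silo federated learning can effectively alleviate the problem of user heterogeneity. However, this approach requires a proxy dataset, which is difficult to obtain in the real world. Moreover, a global model based on parameter averaging can leak user privacy. We introduce a distributed three-player GAN to implement data-free co-distillation between clients. This technique mitigates the user heterogeneity problem and better protects user privacy. We confirm that the samples generated by the GAN can make federated distillation more efficient and robust, and that co-distillation, building on the acquired global knowledge, can achieve good performance for individual clients. Our extensive experiments on benchmark datasets demonstrate superior generalization performance compared with state-of-the-art methods.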

Parameterized Knowledge Transfer for Personalized Federated Learning

Jie Zhang , Song Guo , Xiaosong Ma , Haozhao Wang , Wencao Xu , Feijie Wu

Category: Machine Learning

2021-11-04

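In recent years, personalized federated learning (PFL) has attracted increasing attention for its potential to handle statistical heterogeneity among clients. However, state-of-the-art PFL methods rely on model parameter aggregation at the server side, which requires all models to have the same structure and size and thus limits their application to more heterogeneous scenarios. To deal with such model constraints, we exploit the potential of heterogeneous model settings and propose a novel training framework that employs personalized models for different clients. Specifically, we formulate the aggregation procedure of the original PFL as a personalized group knowledge transfer training algorithm, namely KT-PFL, which enables each client to maintain a personalized soft prediction at the server side to guide the others' local training. KT-PFL updates the personalized soft prediction of each client by a linear combination of all local soft predictions using a knowledge coefficient matrix, which can adaptively reinforce the collaboration among clients that own similar data distributions. Furthermore, to quantify the contribution of each client to the others' personalized training, the knowledge coefficient matrix is parameterized so that it can be trained simultaneously with the models. The knowledge coefficient matrix and the model parameters are updated alternately in each round by gradient descent. Extensive experiments are conducted on various datasets (EMNIST, Fashion_MNIST, CIFAR-10) under different settings (heterogeneous models and data distributions). The proposed framework is shown to be the first federated learning paradigm that realizes personalized model training via parameterized group knowledge transfer, while achieving significant performance gains compared with state-of-the-art algorithms.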
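
The linear combination of local soft predictions through a knowledge coefficient matrix can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a single shared sample, one soft-prediction vector per client, and a fixed coefficient matrix (in the abstract the matrix is parameterized and trained). The name `personalize_soft_predictions` and all values are hypothetical.

```python
def personalize_soft_predictions(coeff, soft_preds):
    """Mix all clients' soft predictions; row k of `coeff` serves client k."""
    n_clients = len(soft_preds)
    n_classes = len(soft_preds[0])
    return [
        [
            sum(coeff[k][j] * soft_preds[j][c] for j in range(n_clients))
            for c in range(n_classes)
        ]
        for k in range(n_clients)
    ]


# Hypothetical soft predictions of three clients for one sample, two classes.
soft_preds = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
# Clients 0 and 1 are assumed to hold similar data, so they weight each other;
# client 2 relies on its own prediction only.
coeff = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
personalized = personalize_soft_predictions(coeff, soft_preds)
```

Because each row of `coeff` sums to one in this example, every personalized vector remains a valid probability distribution that a client could use as a distillation target.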

Resource-aware Federated Learning using Knowledge Extraction and Multi-model Fusion

Sixing Yu , Wei Qian , Ali Jannesari

Category: Machine Learning

2022-08-16

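With increasing concern about user data privacy, federated learning (FL) has been developed as a unique training paradigm for training machine learning models on edge devices without access to sensitive data. Traditional FL and existing methods directly apply aggregation to the same model on the cloud server and on all edge training devices. Although these methods protect data privacy, they cannot accommodate model heterogeneity, they ignore heterogeneous computing power, and they incur steep communication costs. In this paper, we propose a resource-aware FL that aggregates an ensemble of local knowledge extracted from edge models, instead of aggregating the weights of each local model, and then distills it into robust global knowledge that serves as the server model through knowledge distillation. The local model and the global knowledge are extracted into a tiny knowledge network by deep mutual learning. Such knowledge extraction allows edge clients to deploy resource-aware models and perform multi-model knowledge fusion while maintaining communication efficiency and model heterogeneity. Empirical results show that our approach improves significantly over existing FL algorithms in terms of communication cost and generalization performance under heterogeneous data and models. Our approach reduces the communication cost of VGG-11 by up to 102× and of ResNet-32 by up to 30× when training ResNet-20 as the knowledge network.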

Multi-Level Branched Regularization for Federated Learning

Jinkyu Kim , Geeho Kim , Bohyung Han

Category: Machine Learning

2022-07-14

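A critical challenge of federated learning is data heterogeneity and imbalance across clients, which leads to inconsistency between local networks and unstable convergence of global models. To alleviate these limitations, we propose a novel architectural regularization technique that constructs multiple auxiliary branches in each local model by grafting local and global subnetworks at several different levels, trained via online knowledge distillation. The proposed technique is effective in making the global model more robust even in the non-IID setting, and it is applicable to various federated learning frameworks without incurring extra communication costs. We perform comprehensive empirical studies and demonstrate remarkable performance gains in terms of accuracy and efficiency compared to existing methods. The source code is available on our project page.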

Closing the Gap between Client and Global Model Performance in Heterogeneous Federated Learning

Hongrui Shi , Valentin Radu , Po Yang

Category: Machine Learning

2022-11-07

The heterogeneity of hardware and data is a well-known and well-studied problem in the Federated Learning (FL) community, since FL typically runs under heterogeneous settings. Recently, custom-size client models trained with Knowledge Distillation (KD) have emerged as a viable strategy for tackling the heterogeneity challenge. However, previous efforts in this direction are aimed at client model tuning rather than at the impact on the knowledge aggregation of the global model. Although the performance of the global model is the primary objective of FL systems, client models have received more attention under heterogeneous settings. Here, we provide more insights into how the chosen approach for training custom client models affects the global model, which is essential for any FL application. We show that the global model can fully leverage the strength of KD with heterogeneous data. Driven by empirical observations, we further propose a new approach that combines KD and Learning without Forgetting (LwoF) to produce improved personalised models. We bring heterogeneous FL on par with the mighty FedAvg of homogeneous FL, in realistic deployment scenarios with dropping clients.

FedICT: Federated Multi-task Distillation for Multi-access Edge Computing

Zhiyuan Wu , Sheng Sun , Yuwei Wang , Min Liu , Xuefeng Jiang , Bo Gao

Category: Machine Learning

2023-01-01

The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) is proposed to train related but personalized ML models for different devices, whereas previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, whereas existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT keeps local and global knowledge apart during bi-directional distillation processes between clients and the server, aiming to enable multi-task clients while alleviating client drift derived from divergent optimization directions of client-side local models. Specifically, FedICT includes Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD is proposed to reinforce the clients' fitting of local data by introducing prior knowledge of local data distributions. Moreover, LKA is proposed to correct the distillation loss of the server, making the transferred local knowledge better match the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks in various data-heterogeneity and model-architecture settings, achieving improved accuracy with less than 1.2% of the training communication overhead of FedAvg and no more than 75% of the training communication rounds of FedGKT.

Resource-Aware Heterogeneous Federated Learning using Neural Architecture Search

Sixing Yu , Phuong Nguyen , Waqwoya Abebe , Justin Stanley , Pablo Munoz , Ali Jannesari

Category: Machine Learning | Computer Vision

2022-11-09

Federated Learning (FL) is extensively used to train AI/ML models in distributed and privacy-preserving settings. Participant edge devices in FL systems typically contain non-independent and identically distributed (Non-IID) private data and unevenly distributed computational resources. Preserving user data privacy while optimizing AI/ML models in a heterogeneous federated network requires us to address data heterogeneity and system/resource heterogeneity. Hence, we propose Resource-aware Federated Learning (RaFL) to address these challenges. RaFL allocates resource-aware models to edge devices using Neural Architecture Search (NAS) and allows heterogeneous model architecture deployment by knowledge extraction and fusion. Integrating NAS into FL enables on-demand customized model deployment for resource-diverse edge devices. Furthermore, we propose a multi-model architecture fusion scheme allowing the aggregation of the distributed learning results. Results demonstrate RaFL's superior resource efficiency compared to SoTA.

Handling Data Heterogeneity in Federated Learning via Knowledge Fusion

Xu Zhou , Xinyu Lei , Cong Yang , Yichun Shi , Xiao Zhang , Jingwen Shi

Category: Machine Learning

2022-07-23

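Federated learning (FL) supports the distributed training of a global machine learning model across multiple clients with the help of a central server. The local dataset held by each client is never exchanged in FL, so the privacy of local datasets is protected. Although FL is increasingly popular, data heterogeneity across clients leads to the client model drift issue, which degrades model performance and harms model fairness. To address this issue, we design Federated learning with global-local Knowledge Fusion (FedKF) in this paper. The key idea in FedKF is to let the server return the global knowledge in each training round so that it can be fused with the local knowledge, regularizing the local model towards the global optimum. The client model drift issue can thus be mitigated. In FedKF, we first propose an active model aggregation technique that supports a precise global knowledge representation. Then, we propose a data-free knowledge distillation (KD) approach to facilitate KD from the global model to the local model while the local model can still learn the local knowledge (embedded in the local dataset) at the same time, thereby realizing the global-local knowledge fusion process. Theoretical analysis and intensive experiments demonstrate that FedKF simultaneously achieves high model performance, high fairness, and privacy preservation. The project source code will be released on GitHub after the paper review.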

FedGEMS: Federated Learning of Larger Server Models via Selective Knowledge Fusion

Sijie Cheng , Jingwen Wu , Yanghua Xiao , Yang Liu , Yang Liu

Category: Machine Learning | Artificial Intelligence

2021-10-21

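Today, data is often scattered among billions of resource-constrained edge devices with security and privacy constraints. Federated Learning (FL) has emerged as a viable solution for learning a global model while keeping data private, but the model complexity of FL is impeded by the computation resources of edge nodes. In this work, we investigate a novel paradigm that takes advantage of a powerful server model to break through the model capacity limit in FL. By selectively learning from multiple teacher clients and from itself, the server model develops in-depth knowledge and transfers its knowledge back to the clients to improve their respective performance. Our proposed framework achieves superior performance on both server and client models and provides several advantages in a unified framework, including flexibility for heterogeneous client architectures and communication efficiency between clients and server, on various image classification tasks.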

Scalable Collaborative Learning via Representation Sharing

Frédéric Berdoz , Abhishek Singh , Martin Jaggi , Ramesh Raskar

Category: Machine Learning | Artificial Intelligence

2022-11-20

Privacy-preserving machine learning has become a key conundrum for multi-party artificial intelligence. Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device). In FL, each data holder trains a model locally and releases it to a central server for aggregation. In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation). While relevant in several settings, both of these schemes have a high communication cost, rely on server-level computation algorithms and do not allow for tunable levels of collaboration. In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss (contrastive w.r.t. the labels). The goal is to ensure that the participants learn similar features on similar classes without sharing their input data. To do so, each client releases averaged last hidden layer activations of similar labels to a central server that only acts as a relay (i.e., is not involved in the training or aggregation of the models). Then, the clients download these last layer activations (feature representations) of the ensemble of users and distill their knowledge in their personal model using a contrastive objective. For cross-device applications (i.e., small local datasets and limited computational capacity), this approach increases the utility of the models compared to independent learning and other federated knowledge distillation (FD) schemes, is communication efficient and is scalable with the number of clients. We prove theoretically that our framework is well-posed, and we benchmark its performance against standard FD and FL on various datasets using different model architectures.
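
The per-label averaging of last-hidden-layer activations that each client releases can be sketched as below. This is only an illustration under simplifying assumptions: features are plain lists of floats, the contrastive distillation objective itself is not shown, and the function name `class_mean_features` and the toy values are hypothetical.

```python
def class_mean_features(features, labels):
    """Return {label: mean feature vector} for one client's local batch."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        # Accumulate the feature vector element-wise for this label.
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


# Toy last-layer activations for three samples with labels 0, 0 and 1.
features = [[1.0, 3.0], [3.0, 5.0], [0.0, 2.0]]
labels = [0, 0, 1]
shared = class_mean_features(features, labels)
```

Only `shared` would leave the device in this sketch, so the payload grows with the number of classes and the feature width rather than with the number of model parameters.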

FedMR: Federated Learning via Model Recombination

Ming Hu , Zhihao Yue , Zhiwei Ling , Xian Wei , Mingsong Chen

Category: Machine Learning

2022-08-16

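As a promising privacy-preserving machine learning method, Federated Learning (FL) enables global model training across clients without compromising their confidential local data. However, existing FL methods suffer from low inference performance on unevenly distributed data, since most of them rely on Federated Averaging (FedAvg)-based aggregation. By averaging model parameters in a coarse manner, FedAvg eclipses the individual characteristics of local models, which strongly limits the inference capability of FL. Worse still, in each round of FL training, FedAvg dispatches the same initial local model to all clients, which can easily cause the search for an optimal global model to get stuck in a local region. To address these issues, this paper proposes a novel and effective FL paradigm named FedMR (Federated Model Recombination). Unlike conventional FedAvg-based methods, the cloud server of FedMR shuffles each layer of the collected local models and recombines them into new models for local training on clients. Owing to the fine-grained model recombination and local training in each FL round, FedMR can quickly find one globally optimal model for all clients. Comprehensive experimental results demonstrate that, compared with state-of-the-art FL methods, FedMR can significantly improve inference accuracy without causing extra communication overhead.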
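
The layer-wise shuffling and recombination performed by the server can be illustrated with a short sketch. It assumes every collected model is a list with the same number of layers and uses strings as stand-ins for layer tensors; the function name `recombine` and the seeding are illustrative choices, not details taken from the paper.

```python
import random


def recombine(models, rng):
    """Shuffle each layer index across clients and rebuild the same number of models."""
    n_models = len(models)
    n_layers = len(models[0])
    new_models = [[None] * n_layers for _ in range(n_models)]
    for layer in range(n_layers):
        # Draw an independent permutation of the clients for this layer.
        order = list(range(n_models))
        rng.shuffle(order)
        for target, source in enumerate(order):
            new_models[target][layer] = models[source][layer]
    return new_models


# Three toy clients with three layers each; strings stand in for weights.
models = [[f"client{k}/layer{l}" for l in range(3)] for k in range(3)]
new_models = recombine(models, random.Random(0))
```

Every layer of every collected model is reused exactly once, so nothing is averaged away, and each recombined model can be sent back to a client for the next round of local training.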

Cross-domain Federated Object Detection

Shangchao Su , Bin Li , Chengzhi Zhang , Mingzhao Yang , Xiangyang Xue

Category: Computer Vision

2022-06-30

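Detection models trained by one party (the server) may face severe performance degradation when distributed to other users (clients). For example, in autonomous driving scenarios, different driving environments may bring obvious domain shifts, which lead to biased model predictions. Federated learning, which has emerged in recent years, enables multi-party collaborative training without leaking client data. In this paper, we focus on a special cross-domain scenario in which the server holds large-scale data and multiple clients each hold only a small amount of data. At the same time, the data distributions differ among the clients. In this case, traditional federated learning techniques cannot account for both the global knowledge of all participants and the personalized knowledge of a specific client. To make up for this limitation, we propose a cross-domain federated object detection framework named FedOD. To learn both the global knowledge and the personalized knowledge of different domains, the proposed framework first performs federated training to obtain a public global aggregated model through multi-teacher distillation, and then sends the aggregated model to each client for fine-tuning its personalized local model. After a few rounds of communication, each client can perform weighted ensemble inference with the public global model and the personalized local model. With the ensemble, the generalization performance of the client-side model can exceed that of a single model with the same parameter scale. We establish a federated object detection dataset with significant background and instance differences, based on multiple public autonomous driving datasets, and then conduct extensive experiments on it. The experimental results validate the effectiveness of the proposed method.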

Preservation of the Global Knowledge by Not-True Distillation in Federated Learning

Gihun Lee , Minchan Jeong , Yongjin Shin , Sangmin Bae , Se-Young Yun

Category: Machine Learning | Artificial Intelligence | Computer Vision

2021-06-06

In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models. Although this precludes the need to access clients' data directly, the global model's convergence often suffers from data heterogeneity. This study starts from an analogy to continual learning and suggests that forgetting could be the bottleneck of federated learning. We observe that the global model forgets the knowledge from previous rounds, and that local training induces forgetting of the knowledge outside the local distribution. Based on our findings, we hypothesize that tackling forgetting will relieve the data heterogeneity problem. To this end, we propose a novel and effective algorithm, Federated Not-True Distillation (FedNTD), which preserves the global perspective on locally available data only for the not-true classes. In the experiments, FedNTD shows state-of-the-art performance on various setups without compromising data privacy or incurring additional communication costs.
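
One plausible reading of distilling "only for the not-true classes" is to drop the true-class logit, renormalize the remaining logits with a softmax, and penalize the divergence between the global and the local model on that restricted distribution. The sketch below follows that reading; the temperature `tau`, the direction of the KL divergence, and the function names are assumptions made for illustration and are not taken from the abstract.

```python
import math


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax with max-subtraction for numerical stability."""
    peak = max(logits)
    exps = [math.exp((z - peak) / tau) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def not_true_kl(local_logits, global_logits, true_class, tau=1.0):
    """KL(global || local) computed over the not-true classes only."""
    keep = [i for i in range(len(local_logits)) if i != true_class]
    p_global = softmax([global_logits[i] for i in keep], tau)
    p_local = softmax([local_logits[i] for i in keep], tau)
    return sum(g * math.log(g / l) for g, l in zip(p_global, p_local))


# Hypothetical logits for a three-class problem whose true class is 0.
global_logits = [2.0, 1.0, 0.5]
same_not_true = not_true_kl([9.0, 1.0, 0.5], global_logits, true_class=0)
changed_not_true = not_true_kl([2.0, 0.0, 3.0], global_logits, true_class=0)
```

In this sketch the local model may change its true-class logit freely (`same_not_true` stays at zero), while drifting on the not-true classes is penalized (`changed_not_true` is positive). In training, such a term would be added to the usual cross-entropy loss with a weighting factor.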

A Practical Data-Free Approach to One-shot Federated Learning with Heterogeneity

Jie Zhang , Chen Chen , Bo Li , Lingjuan Lyu , Shuang Wu , Jianghe Xu , Shouhong Ding , Chao Wu

Category: Machine Learning | Computer Vision

2021-12-23

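One-shot federated learning (FL) has recently emerged as a promising approach that allows the central server to learn a model in a single communication round. Despite the low communication cost, existing one-shot FL methods are mostly impractical or face inherent limitations, e.g., a public dataset is required, clients' models must be homogeneous, or additional data/model information needs to be uploaded. To overcome these issues, we propose a more practical data-free approach, named FedSyn, for a one-shot FL framework with heterogeneity. FedSyn trains the global model through a data generation stage and a model distillation stage. To the best of our knowledge, FedSyn is the first method that can be practically applied to various real-world applications, owing to the following advantages: (1) FedSyn requires no additional information (other than the model parameters) to be transferred between clients and the server; (2) FedSyn does not require any auxiliary dataset for training; (3) FedSyn is the first to consider both model and statistical heterogeneity in FL, i.e., the clients' data are non-IID and different clients may have different model architectures. Experiments on a variety of real-world datasets demonstrate the superiority of FedSyn. For example, FedSyn outperforms the best baseline method, Fed-ADI, on the CIFAR10 dataset when the data are non-IID.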

Towards Federated Learning against Noisy Labels via Local Self-Regularization

Xuefeng Jiang , Sheng Sun , Yuwei Wang , Min Liu

Category: Machine Learning | Artificial Intelligence

2022-08-25

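Federated learning (FL) aims to learn joint knowledge from a large number of decentralized devices in a privacy-preserving manner. However, since high-quality labeled data require expensive human intelligence and effort, data with incorrect labels (called noisy labels) are ubiquitous in practice and inevitably cause performance degradation. Although many methods have been proposed to deal with noisy labels directly, they either require excessive computation overhead or violate the privacy protection principle of FL. To this end, we focus on this issue in FL, with the purpose of alleviating the performance degradation caused by noisy labels while guaranteeing data privacy. Specifically, we propose a Local Self-Regularization method, which effectively regularizes the local training process by implicitly hindering the model from memorizing noisy labels and by explicitly narrowing the model output discrepancy between original and augmented instances using self-distillation. Experimental results demonstrate that our proposed method achieves notable resistance against noisy labels at various noise levels on three benchmark datasets. In addition, we integrate our approach with existing state-of-the-art methods and achieve superior performance on the real-world dataset Clothing1M. The code is available at https://github.com/sprinter1999/fedlsr.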

FedRAD: Federated Robust Adaptive Distillation

Stefán Páll Sturluson , Samuel Trew , Luis Muñoz-González , Matei Grama , Jonathan Passerat-Palmbach , Daniel Rueckert , Amir Alansary

Category: Machine Learning | Artificial Intelligence

2021-12-02

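The robustness of federated learning (FL) is vital for the distributed training of an accurate global model. The collaborative learning framework, which typically aggregates model updates, is vulnerable to poisoning attacks from adversarial clients. Since the information shared between the global server and the participants is limited to model parameters, detecting bad model updates is challenging. Moreover, real-world datasets are usually heterogeneous and not independent and identically distributed (non-IID) among participants, which makes designing such a robust pipeline more difficult. In this work, we propose a novel robust aggregation method, Federated Robust Adaptive Distillation (FedRAD), which detects adversaries and robustly aggregates local models based on properties of the median statistic, and then performs an adapted version of ensemble knowledge distillation. We run extensive experiments to evaluate the proposed method against recently published works. The results show that FedRAD outperforms all other aggregators in the presence of adversaries, as well as under heterogeneous data distributions.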

Label driven Knowledge Distillation for Federated Learning with non-IID Data

Minh-Duong Nguyen , Quoc-Viet Pham , Dinh Thai Hoang , Long Tran-Thanh , Diep N. Nguyen , Won-Joo Hwang

Category: Machine Learning | Artificial Intelligence

2022-09-29

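In real-world applications, Federated Learning (FL) meets two challenges: (1) scalability, especially when applied to massive IoT networks; and (2) robustness in environments with heterogeneous data. Recognizing the first problem, we aim to design a novel FL framework named Full-stack FL (F2L). More specifically, F2L uses a hierarchical network architecture, which makes it possible to extend the FL network without reconstructing the whole network system. Moreover, leveraging the advantages of the hierarchical network design, we propose a new label-driven knowledge distillation (LKD) technique at the global server to address the second problem. In contrast to current knowledge distillation techniques, LKD is capable of training a student model that consists of the good knowledge from all teachers' models. Therefore, our proposed algorithm can effectively extract the knowledge of the regions' data distributions (i.e., the regional aggregated models) to reduce the divergence between clients' models when operating in an FL system with non-independent and identically distributed data. Extensive experimental results reveal that: (i) our F2L method can significantly improve the overall FL efficiency in all global distillations, and (ii) F2L rapidly achieves convergence as global distillation stages occur, instead of improving gradually in each communication cycle.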

Federated Learning for Non-IID Data via Client Variance Reduction and Adaptive Server Update

Hiep Nguyen , Lam Phan , Harikrishna Warrier , Yogesh Gupta

Category: Machine Learning

2022-07-18

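Federated learning (FL) is an emerging technique for collaboratively training a global machine learning model while keeping the data localized on user devices. The main obstacle to the practical implementation of FL is the non-independent and identically distributed (non-IID) data across users, which slows convergence and degrades performance. To tackle this fundamental issue, we propose a method (ComFed) that enhances the whole training process on both the client and the server sides. The key idea of ComFed is to simultaneously use client-side variance reduction techniques to facilitate server aggregation and global adaptive update techniques to accelerate learning. Our experiments on the CIFAR-10 classification task show that ComFed can improve state-of-the-art algorithms dedicated to non-IID data.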

Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning

Matias Mendieta , Taojiannan Yang , Pu Wang , Minwoo Lee , Zhengming Ding , Chen Chen

Category: Machine Learning | Computer Vision

2021-11-28

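Federated learning (FL) is a promising strategy for privacy-preserving, distributed learning with a network of clients (i.e., edge devices). However, the data distribution among clients is often non-IID, which makes efficient optimization difficult. To alleviate this issue, many popular algorithms focus on mitigating the effects of data heterogeneity across clients by introducing a variety of proximal terms, some of which incur considerable compute and/or memory overhead, to restrain local updates with respect to the global model. Instead, we rethink the solution with a focus on the generality of local learning rather than on proximal restriction. To this end, we first present a systematic study, informed by second-order indicators, to better understand algorithm effectiveness in FL. Interestingly, we find that standard regularization methods are surprisingly strong performers in mitigating data heterogeneity effects. Based on our findings, we further propose a simple and effective method to overcome data heterogeneity and the pitfalls of previous methods. FedAlign achieves accuracy competitive with state-of-the-art FL methods across a variety of settings while minimizing computation and memory overhead. Code will be made publicly available.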
