Smart Paper Notes

Transferability-Guided Cross-Domain Cross-Task Transfer Learning

Yang Tan , Yang Li , Shao-Lun Huang , Xiao-Ping Zhang

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-12

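We propose two novel transferability metrics, F-OTCE (Fast Optimal Transport based Conditional Entropy) and JC-OTCE (Joint Correspondence OTCE), to evaluate how much a source model (task) can benefit the learning of a target task and to learn more transferable representations for cross-domain cross-task transfer learning. Unlike the existing metric, which requires evaluating the empirical transferability on auxiliary tasks, our metrics are auxiliary-free and can therefore be computed much more efficiently. Specifically, F-OTCE estimates transferability by first solving an optimal transport (OT) problem between the source and target distributions, and then using the optimal coupling to compute the negative conditional entropy between the source and target labels. It can also serve as a loss function to maximize the transferability of the source model before fine-tuning on the target task. Meanwhile, JC-OTCE improves the transferability robustness of F-OTCE by including label distances in the OT problem, though it may incur additional computational cost. Extensive experiments demonstrate that F-OTCE and JC-OTCE outperform state-of-the-art auxiliary-free metrics by 18.85% and 28.88%, respectively, in correlation coefficient with the ground-truth transfer accuracy. By eliminating the training cost of auxiliary tasks, the two metrics reduce the total computation time of the previous method from 43 minutes to 9.32 s and 10.78 s, respectively, for a pair of tasks. When used as a loss function, F-OTCE shows consistent improvements in the transfer accuracy of the source model in few-shot classification experiments, with accuracy gains of up to 4.41%.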

translated by Google Translate
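The two steps named in the F-OTCE abstract (solve an OT problem, then compute the negative conditional entropy of the labels under the coupling) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Sinkhorn solver, the squared Euclidean cost normalized by its maximum, the regularization strength `reg`, and the function names are all assumptions made for the example.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.05, n_iter=500):
    """Entropic-regularized OT (assumed solver): coupling with marginals close to a and b."""
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def negative_conditional_entropy(coupling, y_src, y_tgt):
    """Return -H(Y_t | Y_s) under the joint label distribution induced by the coupling."""
    src_onehot = (y_src[:, None] == np.unique(y_src)[None, :]).astype(float)
    tgt_onehot = (y_tgt[:, None] == np.unique(y_tgt)[None, :]).astype(float)
    joint = src_onehot.T @ coupling @ tgt_onehot      # P(y_s, y_t)
    joint = joint / joint.sum()
    p_src = joint.sum(axis=1, keepdims=True)          # P(y_s)
    cond = joint / p_src                              # P(y_t | y_s)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(cond[mask])))

def f_otce_sketch(x_src, y_src, x_tgt, y_tgt, reg=0.05):
    """Hypothetical score: OT between feature sets, then negative conditional entropy."""
    cost = ((x_src[:, None, :] - x_tgt[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()                          # normalization is an assumption
    a = np.full(len(x_src), 1.0 / len(x_src))
    b = np.full(len(x_tgt), 1.0 / len(x_tgt))
    coupling = sinkhorn(a, b, cost, reg=reg)
    return negative_conditional_entropy(coupling, y_src, y_tgt)
```

The score is at most zero. It approaches zero when each source class maps to a single target class under the coupling, and becomes more negative as the target labels become less predictable from the source labels.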

Transferability Estimation using Bhattacharyya Class Separability

Michal Pándy , Andrea Agostinelli , Jasper Uijlings , Vittorio Ferrari , Thomas Mensink

Categories: Computer Vision

2021-11-24

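Transfer learning has become a popular method for leveraging pre-trained models in computer vision. However, without performing computationally expensive fine-tuning, it is difficult to quantify which pre-trained source models are suitable for a specific target task or, conversely, to which tasks a pre-trained source model can be easily adapted. In this work, we propose the Gaussian Bhattacharyya Coefficient (GBC), a novel method for quantifying transferability between a source model and a target dataset. In a first step, we embed all target images in the feature space defined by the source model and represent them with per-class Gaussians. Then, we estimate their pairwise class separability using the Bhattacharyya coefficient, yielding a simple and effective measure of how well the source model transfers to the target task. We evaluate GBC on image classification tasks in the context of dataset and architecture selection. Further, we also perform experiments on the more difficult semantic segmentation transferability estimation task. We demonstrate that GBC outperforms state-of-the-art transferability metrics on most evaluation criteria in the semantic segmentation setting, matches the performance of the top methods for dataset transferability in image classification, and performs best on architecture selection problems for image classification.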
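The per-class Gaussian and Bhattacharyya coefficient steps described in the GBC abstract can be sketched as below. This is a rough illustration under stated assumptions: diagonal covariances, a small variance floor `eps`, and aggregation as the negated sum of coefficients over ordered class pairs are choices made for the example, and the function names are hypothetical.

```python
import numpy as np

def class_gaussians(features, labels, eps=1e-6):
    """Fit one diagonal Gaussian (mean, variance) per class; diagonal form is an assumption."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]
        stats[c] = (x.mean(axis=0), x.var(axis=0) + eps)
    return stats

def bhattacharyya_coefficient(g1, g2):
    """Bhattacharyya coefficient exp(-D_B) between two diagonal Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    v = 0.5 * (v1 + v2)
    d_mean = 0.125 * np.sum((m1 - m2) ** 2 / v)
    d_cov = 0.5 * np.sum(np.log(v) - 0.5 * (np.log(v1) + np.log(v2)))
    return float(np.exp(-(d_mean + d_cov)))

def gbc_sketch(features, labels):
    """Hypothetical score: negated class overlap, so higher means better separated classes."""
    stats = list(class_gaussians(features, labels).values())
    total = 0.0
    for i in range(len(stats)):
        for j in range(i + 1, len(stats)):
            total += bhattacharyya_coefficient(stats[i], stats[j])
    return -2.0 * total  # each unordered pair counted twice, i.e. a sum over i != j
```

Here `features` would be the target images embedded by the source model, and a score closer to zero indicates less overlap between the class Gaussians.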

Wasserstein Distance Guided Representation Learning for Domain Adaptation

Jian Shen , Yanru Qu , Weinan Zhang , Yong Yu

Categories:

2017-07-05

Domain adaptation aims at generalizing a high-performance learner on a target domain via utilizing the knowledge distilled from a source domain which has a different but related data distribution. One solution to domain adaptation is to learn domain invariant feature representations while the learned representations should also be discriminative in prediction. To learn such representations, domain adaptation frameworks usually include a domain invariant representation learning approach to measure and reduce the domain discrepancy, as well as a discriminator for classification. Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by the domain critic, to estimate the empirical Wasserstein distance between the source and target samples and optimizes the feature extractor network to minimize the estimated Wasserstein distance in an adversarial manner. The theoretical advantages of Wasserstein distance for domain adaptation lie in its gradient property and promising generalization bound. Empirical studies on common sentiment and image classification adaptation datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art domain invariant representation learning approaches.

Generalizing to Unseen Domains with Wasserstein Distributional Robustness under Limited Source Knowledge

Jingge Wang , Liyan Xie , Yao Xie , Shao-Lun Huang , Yang Li

Categories: Machine Learning | Computer Vision

2022-07-11

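Domain generalization aims at learning a universal model that performs well on unseen target domains by incorporating knowledge from multiple source domains. In this research, we consider the scenario where different domain shifts occur among the conditional distributions of different classes across domains. When labeled samples in the source domains are limited, existing approaches are not sufficiently robust. To address this problem, we propose a novel domain generalization framework called Wasserstein Distributionally Robust Domain Generalization (WDRDG), inspired by the concept of distributionally robust optimization. We encourage robustness over conditional distributions within class-specific Wasserstein uncertainty sets and optimize the worst-case performance of a classifier over these uncertainty sets. We further develop a test-time adaptation module that leverages optimal transport to quantify the relationship between the unseen target domain and the source domains, enabling adaptive inference for target data. Experiments on the Rotated MNIST, PACS, and VLCS datasets demonstrate that our method can effectively balance robustness and discriminability in challenging generalization scenarios.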

Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning

Yang Shu , Zhangjie Cao , Jinghan Gao , Jianmin Wang , Philip S. Yu , Mingsheng Long

Categories: Machine Learning

2021-10-14

Few-shot learning aims to quickly adapt a deep model from a few examples. While pre-training and meta-training can create deep models powerful for few-shot generalization, we find that pre-training and meta-training focus respectively on cross-domain transferability and cross-task transferability, which restricts their data efficiency in the entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is a tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability respectively. Omni-Net further coordinates the parallel flows by routing their representations via the joint-flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which introduces a self-distillation strategy separately on the pre-training and meta-training objectives for boosting knowledge transfer throughout different training stages. Omni-Training is a general framework to accommodate many existing algorithms. Evaluations justify that our single framework consistently and clearly outperforms the individual state-of-the-art methods on both cross-task and cross-domain settings in a variety of classification, regression and reinforcement learning problems.

Revisiting Deep Subspace Alignment for Unsupervised Domain Adaptation

Kowshik Thopalli , Jayaraman J Thiagarajan , Rushil Anirudh , Pavan K Turaga

Categories: Machine Learning | Computer Vision

2022-01-05

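Unsupervised domain adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Traditionally, subspace-based methods have formed an important class of solutions to this problem. Despite their mathematical elegance and tractability, these methods are often found to be ineffective at producing domain-invariant features on complex, real-world datasets. Motivated by recent advances in representation learning with deep networks, this paper revisits the use of subspace alignment for UDA and proposes a novel adaptation algorithm that consistently leads to improved generalization. In contrast to existing adversarial-training-based DA methods, our approach isolates the feature learning and distribution alignment steps, and utilizes a primary-auxiliary optimization strategy to effectively balance the objectives of domain invariance and model fidelity. While providing a significant reduction in target data and computational requirements, our subspace-based DA performs competitively with, and sometimes even outperforms, state-of-the-art approaches on several standard UDA benchmarks. Furthermore, subspace alignment leads to intrinsically well-regularized models that demonstrate strong generalization even in the challenging partial DA setting. Finally, the design of our UDA framework inherently supports progressive adaptation to new target domains at test time, without requiring the model to be retrained from scratch. In summary, powered by strong feature learners and an effective optimization strategy, we establish subspace-based DA as a highly effective approach for visual recognition.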

Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

Mengya Han , Yibing Zhan , Yong Luo , Bo Du , Han Hu , Yonggang Wen , Dacheng Tao

Categories: Computer Vision

2022-09-07

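Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm, comparing the query representation with class representations to predict the category of the query instance. However, current metric-based methods generally treat all instances equally and consequently often obtain a biased class representation, since not all instances are equally significant when summarizing instance-level representations into a class-level representation. For example, some instances may contain unrepresentative information, such as too much background or information about unrelated concepts, which skews the results. To address these issues, we propose a novel metric-based meta-learning framework termed the instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network that addresses the biased representation issue when generating the class representation, by learning and assigning adaptive weights to different instances according to their relative significance in the support set of the corresponding class. Additionally, we design an improved bilinear instance representation and incorporate two novel structural losses, namely an intra-class instance clustering loss and an inter-class representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: the miniImageNet, tieredImageNet, CIFAR-FS, and FC100 datasets. The experimental results, compared with state-of-the-art approaches, demonstrate the superiority of our ICRL-Net.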

Optimal transport meets noisy label robust loss and MixUp regularization for domain adaptation

Kilian Fatras , Hiroki Naganuma , Ioannis Mitliagkas

Categories: Computer Vision | Machine Learning | Machine Learning (Statistics)

2022-06-22

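It is common in computer vision to be confronted with domain shift: images that share the same classes but differ in acquisition conditions. In domain adaptation (DA), one wants to classify unlabeled target images using labeled source images. Unfortunately, deep neural networks trained on a source training set perform poorly on target images that do not belong to the training domain. One strategy to improve performance is to align the source and target image distributions in an embedded space using optimal transport (OT). However, OT can cause negative transfer, i.e., aligning samples with different labels, which leads to overfitting, especially in the presence of label shift between domains. In this work, we mitigate negative alignment by interpreting it as a noisy label assignment to target images. We then mitigate its effect through appropriate regularization. We propose to couple MixUp regularization (Zhang et al., 2018) with a loss that is robust to noisy labels in order to improve domain adaptation performance. We show in an extensive ablation study that the combination of the two techniques is critical to achieving improved performance. Finally, we evaluate our method, called mixunbot, on several benchmarks and real-world DA problems.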
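The MixUp regularization mentioned in this abstract forms convex combinations of pairs of inputs and of their labels. The sketch below shows only that generic step, not the paper's full method or its noise-robust loss; the Beta parameter `alpha` and the function name are assumptions chosen for illustration.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Mix each sample with a randomly chosen partner from the same batch.

    x: (n, d) inputs, y_onehot: (n, k) one-hot or soft labels.
    Returns mixed inputs, mixed labels, and the mixing weight lam.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(x))        # partner index for every sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Because the labels are mixed with the same weight as the inputs, each mixed label row still sums to one and can be used directly with a soft-label loss.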

Applications of Unsupervised Deep Transfer Learning to Intelligent Fault Diagnosis: A Survey and Comparative Study

Zhibin Zhao , Qiyang Zhang , Xiaolei Yu , Chuang Sun , Shibin Wang , Ruqiang Yan , Xuefeng Chen

Categories: Machine Learning

2019-12-28

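Recent progress in intelligent fault diagnosis (IFD) has depended heavily on deep representation learning and large amounts of labeled data. However, machines often operate under various working conditions, or the target task has a distribution different from that of the data collected for training (the domain shift problem). Moreover, newly collected test data in the target domain are usually unlabeled, which leads to the unsupervised deep transfer learning based (UDTL-based) IFD problem. Although the field has developed rapidly, a standard and open source code framework, as well as a comparative study of UDTL-based IFD, has not yet been established. In this paper, we construct a new taxonomy and perform a comprehensive review of UDTL-based IFD according to different tasks. A comparative analysis of some typical methods and datasets reveals several open and essential issues in UDTL-based IFD that are rarely studied, including the transferability of features, the influence of backbones, negative transfer, and physical priors. To emphasize the importance and reproducibility of UDTL-based IFD, the whole test framework will be released to the research community to facilitate future research. In summary, the released framework and comparative study can serve as an extended interface and as baseline results for new studies on UDTL-based IFD. The code framework is available at https://github.com/zhaozhibin/udtl.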

A Comprehensive Survey on Transfer Learning

Fuzhen Zhuang , Zhiyuan Qi , Keyu Duan , Dongbo Xi , Yongchun Zhu , Hengshu Zhu , Hui Xiong , Qing He

Categories:

2019-11-07

Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey paper reviews more than forty representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over twenty representative transfer learning models are used for experiments. The models are evaluated on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. The experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.

Finding the Most Transferable Tasks for Brain Image Segmentation

Yicong Li , Yang Tan , Jingyun Yang , Yang Li , Xiao-Ping Zhang

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2023-01-03

Although many studies have successfully applied transfer learning to medical image segmentation, very few of them have investigated the selection strategy when multiple source tasks are available for transfer. In this paper, we propose a prior knowledge guided and transferability based framework to select the best source tasks among a collection of brain image segmentation tasks, to improve the transfer learning performance on the given target task. The framework consists of modality analysis, RoI (region of interest) analysis, and transferability estimation, such that the source task selection can be refined step by step. Specifically, we adapt the state-of-the-art analytical transferability estimation metrics to medical image segmentation tasks and further show that their performance can be significantly boosted by filtering candidate source tasks based on modality and RoI characteristics. Our experiments on brain matter, brain tumor, and white matter hyperintensities segmentation datasets reveal that transferring from different tasks under the same modality is often more successful than transferring from the same task under different modalities. Furthermore, within the same modality, transferring from the source task that has stronger RoI shape similarity with the target task can significantly improve the final transfer performance. And such similarity can be captured using the Structural Similarity index in the label space.

Wasserstein Task Embedding for Measuring Task Similarities

Xinran Liu , Yikun Bai , Yuzhe Lu , Andrea Soltoggio , Soheil Kolouri

Categories: Machine Learning

2022-08-24

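Measuring similarities between different tasks is critical in a broad spectrum of machine learning problems, including transfer, multi-task, continual, and meta-learning. Most current approaches to measuring task similarity are architecture-dependent: they either 1) rely on pre-trained models, or 2) train networks on tasks and use forward transfer as a proxy for task similarity. In this paper, we leverage optimal transport theory and define a novel task embedding for supervised classification that is model-agnostic, training-free, and capable of handling (partially) disjoint label sets. In short, given a dataset with ground-truth labels, we perform a label embedding through multi-dimensional scaling and concatenate the dataset samples with their corresponding label embeddings. Then, we define the distance between two datasets as the 2-Wasserstein distance between their updated samples. Lastly, we leverage the 2-Wasserstein embedding framework to embed tasks into a vector space in which the Euclidean distance between the embedded points approximates the proposed 2-Wasserstein distance between tasks. We show that the proposed embedding leads to a significantly faster comparison of tasks than related approaches such as the Optimal Transport Dataset Distance (OTDD). Furthermore, we demonstrate the effectiveness of our proposed embedding through various numerical experiments and show statistically significant correlations between our proposed distance and the forward and backward transfer between tasks.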

Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer

Jian Liang , Dapeng Hu , Yunbo Wang , Ran He , Jiashi Feng

Categories: Computer Vision | Machine Learning

2020-12-14

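Unsupervised domain adaptation (UDA) aims to transfer knowledge from a related but different well-labeled source domain to a new unlabeled target domain. Most existing UDA methods require access to the source data and are therefore not applicable when the data are confidential and cannot be shared due to privacy concerns. This paper tackles a realistic setting in which only a classification model trained on the source data is available, rather than the source data itself. To effectively utilize the source model for adaptation, we propose a novel approach called Source HypOthesis Transfer (SHOT), which learns the feature extraction module for the target domain by fitting the target data features to the frozen source classification module (representing the classification hypothesis). Specifically, SHOT exploits both information maximization and self-supervised learning for learning the feature extraction module, to ensure that the target features are implicitly aligned with the features of the unseen source data via the same hypothesis. Furthermore, we propose a new labeling transfer strategy, which separates the target data into two splits based on the confidence of the predictions (labeling information) and then employs semi-supervised learning to improve the accuracy of the less confident predictions in the target domain. We denote labeling transfer as SHOT++ when the predictions are obtained by SHOT. Extensive experiments on both digit classification and object recognition tasks show that SHOT and SHOT++ achieve results surpassing or comparable to the state of the art, demonstrating the effectiveness of our approaches for various visual domain adaptation problems. Code is available at https://github.com/tim-learn/shot-plus.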
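The information maximization idea named in this abstract is commonly written as two terms: make each target prediction confident (low per-sample entropy) while keeping the predictions diverse across classes (high entropy of the mean prediction). The sketch below follows that common formulation; equal weighting of the two terms, the `eps` constant, and the function names are assumptions, and the self-supervised component is not shown.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def information_maximization_loss(logits, eps=1e-8):
    """Mean per-sample entropy minus the entropy of the mean prediction (lower is better)."""
    p = softmax(logits)
    entropy_per_sample = -np.sum(p * np.log(p + eps), axis=1)
    p_mean = p.mean(axis=0)
    neg_marginal_entropy = np.sum(p_mean * np.log(p_mean + eps))
    return float(entropy_per_sample.mean() + neg_marginal_entropy)
```

The loss is lowest when predictions are both confident and spread evenly over the classes. Predictions that all collapse onto one class, or that are all uniform, give a value near zero.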

A Survey of Unsupervised Domain Adaptation for Visual Recognition

Youshan Zhang

Categories: Computer Vision

2021-12-13

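While huge volumes of unlabeled data are generated and made available in many domains, the demand for automated understanding of visual data is higher than ever before. Most existing machine learning models rely on massive amounts of labeled training data to achieve high performance. Unfortunately, this requirement cannot be met in real-world applications: the number of labels is limited, and manually annotating data is expensive and time-consuming. It is often necessary to transfer knowledge from an existing labeled domain to a new domain. However, model performance degrades because of the differences between domains (domain shift or dataset bias). To overcome the burden of annotation, domain adaptation (DA) aims to mitigate the domain shift problem when transferring knowledge from one domain to another similar but different domain. Unsupervised DA (UDA) deals with a labeled source domain and an unlabeled target domain. The principal objective of UDA is to reduce the domain discrepancy between the labeled source data and the unlabeled target data, and to learn domain-invariant representations across the two domains during training. In this paper, we first define the UDA problem. Second, we give an overview of state-of-the-art methods for the different categories of UDA, covering both traditional methods and deep learning based methods. Finally, we collect frequently used benchmark datasets and report the results of state-of-the-art UDA methods on visual recognition problems.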

Discriminative Radial Domain Adaptation

Zenan Huang , Jun Wen , Siheng Chen , Linchao Zhu , Nenggan Zheng

Categories: Machine Learning | Computer Vision

2023-01-01

Domain adaptation methods reduce domain shift typically by learning domain-invariant features. Most existing methods are built on distribution matching, e.g., adversarial domain adaptation, which tends to corrupt feature discriminability. In this paper, we propose Discriminative Radial Domain Adaptation (DRDA), which bridges source and target domains via a shared radial structure. It is motivated by the observation that as the model is trained to be progressively discriminative, features of different categories expand outwards in different directions, forming a radial structure. We show that transferring such an inherently discriminative structure makes it possible to enhance feature transferability and discriminability simultaneously. Specifically, we represent each domain with a global anchor and each category with a local anchor to form a radial structure, and reduce domain shift via structure matching. Structure matching consists of two parts, namely an isometric transformation to align the structure globally and a local refinement to match each category. To enhance the discriminability of the structure, we further encourage samples to cluster close to the corresponding local anchors based on optimal-transport assignment. In extensive experiments on multiple benchmarks, our method consistently outperforms state-of-the-art approaches on varied tasks, including the typical unsupervised domain adaptation, multi-source domain adaptation, domain-agnostic learning, and domain generalization.

Frustratingly Easy Transferability Estimation

Long-Kai Huang , Ying Wei , Yu Rong , Qiang Yang , Junzhou Huang

Categories: Machine Learning

2021-06-17

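Transferability estimation is an essential tool for selecting a pre-trained model, and the layers within it, to transfer, so as to maximize performance on a target task and prevent negative transfer. Existing estimation algorithms either require intensive training on target tasks or have difficulty evaluating the transferability between layers. To this end, we propose a simple, efficient, and effective transferability measure named TransRate. Through a single pass over the examples of a target task, TransRate measures transferability as the mutual information between the features of target examples extracted by a pre-trained model and their labels. We overcome the challenge of efficient mutual information estimation by resorting to the coding rate, which serves as an effective alternative to entropy. From the perspective of feature representation, the resulting TransRate evaluates both the completeness (whether the features contain sufficient information about the target task) and the compactness (whether the features of each class are compact enough for good generalization) of pre-trained features. Theoretically, we have analyzed the close connection between TransRate and the performance after transfer learning. Despite its extraordinary simplicity, at 10 lines of code, TransRate performs remarkably well in extensive evaluations on 32 pre-trained models and 16 downstream tasks.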
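A mutual-information-style score between features and labels, with an entropy surrogate in place of entropy itself, can be sketched as follows. The example uses a log-determinant coding rate as the surrogate; the exact normalization inside the determinant, the default `eps`, and the function names are assumptions for illustration and may differ from the published measure.

```python
import numpy as np

def coding_rate(z, eps=1e-4):
    """Log-determinant surrogate for the entropy of the rows of z (assumed normalization)."""
    n, d = z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (z.T @ z) / (n * eps))
    return 0.5 * logdet

def transrate_sketch(features, labels, eps=1e-4):
    """Rate of all features minus the average rate within each class."""
    z = features - features.mean(axis=0, keepdims=True)   # center once, globally
    classes = np.unique(labels)
    rate_all = coding_rate(z, eps)
    rate_given_label = sum(coding_rate(z[labels == c], eps) for c in classes) / len(classes)
    return float(rate_all - rate_given_label)
```

The first term grows when the features as a whole span many directions, and the second term shrinks when each class is compact, so a higher score corresponds to features that are both informative and tightly clustered per class.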

Semi-supervised Domain Adaptive Structure Learning

Can Qin , Lichen Wang , Qianqian Ma , Yu Yin , Huan Wang , Yun Fu

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2021-12-12

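Semi-supervised domain adaptation (SSDA) is a challenging problem that requires methods to overcome both 1) overfitting to poorly annotated data and 2) distribution shift across domains. Unfortunately, a simple combination of domain adaptation (DA) and semi-supervised learning (SSL) methods often fails to address both objectives, because the training data are biased towards the labeled samples. In this paper, we introduce an adaptive structure learning method to regularize the cooperation of SSL and DA. Inspired by multi-view learning, our proposed framework is composed of a shared feature encoder network and two classifier networks trained for contradictory purposes. One of the classifiers is applied to group target features to improve intra-class density, enlarging the gap between categorical clusters for robust representation learning. Meanwhile, the other classifier, serving as a regularizer, attempts to scatter the source features to enhance the smoothness of the decision boundary. Iterating target clustering and source expansion makes the target features well enclosed inside the dilated boundary of the corresponding source points. To jointly address cross-domain feature alignment and learning from partially labeled data, we apply maximum mean discrepancy (MMD) distance minimization and self-training (ST) to project the contradictory structures into a shared view and make a reliable final decision. Experimental results on the standard SSDA benchmarks, including DomainNet and Office-Home, demonstrate both the accuracy and the robustness of our method compared with state-of-the-art approaches.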
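The maximum mean discrepancy (MMD) used for feature alignment in this abstract compares two sample sets through kernel averages. Below is a minimal sketch of the standard biased estimator of squared MMD; the RBF kernel, the bandwidth parameter `gamma`, and the function names are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel exp(-gamma * ||a_i - b_j||^2)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clip tiny negatives from rounding

def mmd2_biased(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)
```

Minimizing this quantity over the encoder parameters pulls the source and target feature distributions together; it is zero when the two sample sets coincide.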

Self-supervised Autoregressive Domain Adaptation for Time Series Data

Mohamed Ragab , Emadeldeen Eldele , Zhenghua Chen , Min Wu , Chee-Keong Kwoh , Xiaoli Li

Categories: Machine Learning

2021-11-29

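Unsupervised domain adaptation (UDA) has successfully addressed the domain shift problem for visual applications. Yet these approaches may have limited performance on time series data, for the following reasons. First, they mainly rely on a large-scale dataset (i.e., ImageNet) for source pretraining, which is not applicable to time series data. Second, they ignore the temporal dimension of the feature space of the source and target domains during the domain alignment step. Last, most prior UDA methods can only align global features without considering the fine-grained class distribution of the target domain. To address these limitations, we propose a Self-supervised Autoregressive Domain Adaptation (SLARDA) framework. In particular, we first design a self-supervised learning module that uses forecasting as an auxiliary task to improve the transferability of the source features. Second, we propose a novel autoregressive domain adaptation technique that incorporates the temporal dependency of both source and target features during domain alignment. Finally, we develop an ensemble teacher model to align the class-wise distribution in the target domain via a confident pseudo-labeling approach. Extensive experiments have been conducted on three real-world time series applications with 30 cross-domain scenarios. The results demonstrate that our proposed SLARDA method significantly outperforms state-of-the-art approaches for time series domain adaptation.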

Domain-adversarial training of neural networks

Categories:

We introduce a new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains. The approach implements this idea in the context of neural network architectures that are trained on labeled data from the source domain and unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of features that are (i) discriminative for the main learning task on the source domain and (ii) indiscriminate with respect to the shift between the domains. We show that this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer. The resulting augmented architecture can be trained using standard backpropagation and stochastic gradient descent, and can thus be implemented with little effort using any of the deep learning packages. We demonstrate the success of our approach for two distinct classification problems (document sentiment analysis and image classification), where state-of-the-art domain adaptation performance on standard benchmarks is achieved. We also validate the approach for a descriptor learning task in the context of a person re-identification application.
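The gradient reversal layer mentioned in this abstract acts as the identity in the forward pass and flips the sign of the gradient in the backward pass, so the feature extractor is pushed to confuse the domain classifier. The sketch below writes the two passes by hand in NumPy rather than with a deep learning package; the class name, the method names, and the scaling factor `lam` are illustrative assumptions.

```python
import numpy as np

class GradientReversal:
    """Identity on the forward pass; multiplies incoming gradients by -lam on the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # strength of the reversed gradient (hypothetical default)

    def forward(self, x):
        return x

    def backward(self, grad_output):
        return -self.lam * grad_output
```

Placed between the feature extractor and the domain classifier, such a layer lets ordinary gradient descent on the domain loss train the classifier to separate domains while training the features to do the opposite.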

Transferability Estimation Based On Principal Gradient Expectation

Huiyan Qi , Lechao Cheng , Jingjing Chen , Yue Yu , Zunlei Feng , Yu-Gang Jiang

Categories: Computer Vision

2022-11-29

Deep transfer learning has been widely used for knowledge transmission in recent years. The standard approach of pre-training and subsequently fine-tuning, or linear probing, has shown itself to be effective in many down-stream tasks. Therefore, a challenging and ongoing question arises: how to quantify cross-task transferability that is compatible with transferred results while keeping self-consistency? Existing transferability metrics are estimated on the particular model by conversing source and target tasks. They must be recalculated with all existing source tasks whenever a novel unknown target task is encountered, which is extremely computationally expensive. In this work, we highlight what properties should be satisfied and evaluate existing metrics in light of these characteristics. Building upon this, we propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks. Specifically, we use a restart scheme to calculate every batch gradient over each weight unit more than once, and then we take the average of all the gradients to get the expectation. Thus, the transferability between the source and target task is estimated by computing the distance of normalized principal gradients. Extensive experiments show that the proposed transferability metric is more stable, reliable and efficient than SOTA methods.
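The restart-and-average idea in this abstract can be illustrated on a toy problem. The sketch below is only loosely modeled on the description: the linear model with squared loss, the random re-initialization at each restart, the batch size, and the Euclidean distance between normalized averaged gradients are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np

def mse_gradient(w, x, y):
    """Gradient of the mean squared error of a linear model x @ w with respect to w."""
    return 2.0 * x.T @ (x @ w - y) / len(x)

def principal_gradient(x, y, n_restarts=8, batch_size=32, seed=0):
    """Average batch gradients over several fresh initializations, then normalize."""
    rng = np.random.default_rng(seed)
    grads = []
    for _ in range(n_restarts):
        w = rng.normal(scale=0.1, size=x.shape[1])   # fresh initialization per restart
        for start in range(0, len(x), batch_size):
            xb = x[start:start + batch_size]
            yb = y[start:start + batch_size]
            grads.append(mse_gradient(w, xb, yb))
    g = np.mean(grads, axis=0)
    return g / (np.linalg.norm(g) + 1e-12)

def gradient_distance(task_a, task_b, **kwargs):
    """Distance between the normalized averaged gradients of two (x, y) tasks."""
    g_a = principal_gradient(*task_a, **kwargs)
    g_b = principal_gradient(*task_b, **kwargs)
    return float(np.linalg.norm(g_a - g_b))
```

Under these assumptions, two tasks whose optimal weights point in similar directions yield a small distance, while tasks with opposing optima yield a distance close to 2, the maximum for unit vectors.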
