归纳转移学习旨在通过利用源任务中的预训练模型来从少量培训数据中学习目标任务。大多数涉及大规模深度学习模型的策略采用预先培训的模型和进行目标任务进行初始化。但是,当使用过度参数化模型时,我们通常可以在不牺牲源任务的准确性的情况下修剪模型。这促使我们采用模型修剪来通过深度学习模型进行转移学习。在本文中,我们提出了PAC-NET,这是一种简单而有效的方法,用于基于修剪的转移学习。 PAC-NET由三个步骤组成:修剪,分配和校准(PAC)。这些步骤背后的主要思想是确定源任务的基本权重,通过更新基本权重来微调源任务,然后通过更新剩余的冗余权重来校准目标任务。在各种广泛的感应转移学习实验集中,我们表明我们的方法通过很大的边距实现了最先进的性能。
translated by 谷歌翻译
Deep domain adaptation has emerged as a new learning technique to address the lack of massive amounts of labeled data. Compared to conventional methods, which learn shared feature subspaces or reuse important source instances with shallow representations, deep domain adaptation methods leverage deep networks to learn more transferable representations by embedding domain adaptation in the pipeline of deep learning. There have been comprehensive surveys for shallow domain adaptation, but few timely reviews the emerging deep learning based methods. In this paper, we provide a comprehensive survey of deep domain adaptation methods for computer vision applications with four major contributions. First, we present a taxonomy of different deep domain adaptation scenarios according to the properties of data that define how two domains are diverged. Second, we summarize deep domain adaptation approaches into several categories based on training loss, and analyze and compare briefly the state-of-the-art methods under these categories. Third, we overview the computer vision applications that go beyond image classification, such as face recognition, semantic segmentation and object detection. Fourth, some potential deficiencies of current methods and several future directions are highlighted.
translated by 谷歌翻译
物理信息的神经网络(PINN)是神经网络(NNS),它们作为神经网络本身的组成部分编码模型方程,例如部分微分方程(PDE)。如今,PINN是用于求解PDE,分数方程,积分分化方程和随机PDE的。这种新颖的方法已成为一个多任务学习框架,在该框架中,NN必须在减少PDE残差的同时拟合观察到的数据。本文对PINNS的文献进行了全面的综述:虽然该研究的主要目标是表征这些网络及其相关的优势和缺点。该综述还试图将出版物纳入更广泛的基于搭配的物理知识的神经网络,这些神经网络构成了香草·皮恩(Vanilla Pinn)以及许多其他变体,例如物理受限的神经网络(PCNN),各种HP-VPINN,变量HP-VPINN,VPINN,VPINN,变体。和保守的Pinn(CPINN)。该研究表明,大多数研究都集中在通过不同的激活功能,梯度优化技术,神经网络结构和损耗功能结构来定制PINN。尽管使用PINN的应用范围广泛,但通过证明其在某些情况下比有限元方法(FEM)等经典数值技术更可行的能力,但仍有可能的进步,最著名的是尚未解决的理论问题。
translated by 谷歌翻译
最近的智能故障诊断(IFD)的进展大大依赖于深度代表学习和大量标记数据。然而,机器通常以各种工作条件操作,或者目标任务具有不同的分布,其中包含用于训练的收集数据(域移位问题)。此外,目标域中的新收集的测试数据通常是未标记的,导致基于无监督的深度转移学习(基于UDTL为基础的)IFD问题。虽然它已经实现了巨大的发展,但标准和开放的源代码框架以及基于UDTL的IFD的比较研究尚未建立。在本文中,我们根据不同的任务,构建新的分类系统并对基于UDTL的IFD进行全面审查。对一些典型方法和数据集的比较分析显示了基于UDTL的IFD中的一些开放和基本问题,这很少研究,包括特征,骨干,负转移,物理前导等的可转移性,强调UDTL的重要性和再现性 - 基于IFD,整个测试框架将发布给研究界以促进未来的研究。总之,发布的框架和比较研究可以作为扩展界面和基本结果,以便对基于UDTL的IFD进行新的研究。代码框架可用于\ url {https:/github.com/zhaozhibin/udtl}。
translated by 谷歌翻译
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks. As promising surrogate solvers of partial differential equations (PDEs) for real-time prediction, deep neural operators such as deep operator networks (DeepONets) provide a new simulation paradigm in science and engineering. Pure data-driven neural operators and deep learning models, in general, are usually limited to interpolation scenarios, where new predictions utilize inputs within the support of the training set. However, in the inference stage of real-world applications, the input may lie outside the support, i.e., extrapolation is required, which may result to large errors and unavoidable failure of deep learning models. Here, we address this challenge of extrapolation for deep neural operators. First, we systematically investigate the extrapolation behavior of DeepONets by quantifying the extrapolation complexity via the 2-Wasserstein distance between two function spaces and propose a new behavior of bias-variance trade-off for extrapolation with respect to model capacity. Subsequently, we develop a complete workflow, including extrapolation determination, and we propose five reliable learning methods that guarantee a safe prediction under extrapolation by requiring additional information -- the governing PDEs of the system or sparse new observations. The proposed methods are based on either fine-tuning a pre-trained DeepONet or multifidelity learning. We demonstrate the effectiveness of the proposed framework for various types of parametric PDEs. Our systematic comparisons provide practical guidelines for selecting a proper extrapolation method depending on the available information, desired accuracy, and required inference speed.
translated by 谷歌翻译
Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning researches, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey paper reviews more than forty representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over twenty representative transfer learning models are used for experiments. The models are performed on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. And the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.
translated by 谷歌翻译
大多数现有的多源域适配(MSDA)方法通过特征分布对准最小化多个源 - 目标域对之间的距离,从单个源设置借用的方法。但是,对于不同的源极域,对齐成对特征分布是具有挑战性的,甚至可以对MSDA进行反效率。在本文中,我们介绍了一种新颖的方法:可转让的属性学习。动机很简单:虽然不同的域可以具有急剧不同的视野,但它们包含相同的类类,其特征在一起相同的属性;因此,MSDA模型应该专注于学习目标域的最可转换的属性。采用这种方法,我们提出了域名关注一致性网络,称为DAC网。关键设计是一个特征通道注意模块,旨在识别可转移功能(属性)。重要的是,注意模块受到一致性损失的监督,这对源极和目标域之间的信道注意权重的分布施加。此外,为了促进对目标数据的鉴别特征学习,我们将伪标记与类紧凑性丢失相结合,以最小化目标特征和分类器的权重向量之间的距离。在三个MSDA基准测试中进行了广泛的实验表明,我们的DAC-NET在所有这些中实现了新的最新性能。
translated by 谷歌翻译
Deep operator network (DeepONet) has demonstrated great success in various learning tasks, including learning solution operators of partial differential equations. In particular, it provides an efficient approach to predict the evolution equations in a finite time horizon. Nevertheless, the vanilla DeepONet suffers from the issue of stability degradation in the long-time prediction. This paper proposes a {\em transfer-learning} aided DeepONet to enhance the stability. Our idea is to use transfer learning to sequentially update the DeepONets as the surrogates for propagators learned in different time frames. The evolving DeepONets can better track the varying complexities of the evolution equations, while only need to be updated by efficient training of a tiny fraction of the operator networks. Through systematic experiments, we show that the proposed method not only improves the long-time accuracy of DeepONet while maintaining similar computational cost but also substantially reduces the sample size of the training set.
translated by 谷歌翻译
在过去的十年中,许多深入学习模型都受到了良好的培训,并在各种机器智能领域取得了巨大成功,特别是对于计算机视觉和自然语言处理。为了更好地利用这些训练有素的模型在域内或跨域转移学习情况下,提出了知识蒸馏(KD)和域适应(DA)并成为研究亮点。他们旨在通过原始培训数据从训练有素的模型转移有用的信息。但是,由于隐私,版权或机密性,原始数据并不总是可用的。最近,无数据知识转移范式吸引了吸引人的关注,因为它涉及从训练有素的模型中蒸馏宝贵的知识,而无需访问培训数据。特别是,它主要包括无数据知识蒸馏(DFKD)和源无数据域适应(SFDA)。一方面,DFKD旨在将域名域内知识从一个麻烦的教师网络转移到一个紧凑的学生网络,以进行模型压缩和有效推论。另一方面,SFDA的目标是重用存储在训练有素的源模型中的跨域知识并将其调整为目标域。在本文中,我们对知识蒸馏和无监督域适应的视角提供了全面的数据知识转移,以帮助读者更好地了解目前的研究状况和想法。分别简要审查了这两个领域的应用和挑战。此外,我们对未来研究的主题提供了一些见解。
translated by 谷歌翻译
采用基于数据的方法会导致许多石油和天然气记录数据处理问题的模型改进。由于深度学习提供的新功能,这些改进变得更加合理。但是,深度学习的使用仅限于研究人员拥有大量高质量数据的领域。我们提出了一种提供通用数据表示的方法,适用于针对不同油田的不同问题的解决方案,而少量数据。我们的方法依赖于从井的间隔内进行连续记录数据的自我监督方法,因此从一开始就不需要标记的数据。为了验证收到的表示形式,我们考虑分类和聚类问题。我们还考虑转移学习方案。我们发现,使用变异自动编码器会导致最可靠,最准确的模型。方法我们还发现,研究人员只需要一个针对目标油田的微小单独的数据集即可在通用表示之上解决特定问题。
translated by 谷歌翻译
In continual learning (CL), the goal is to design models that can learn a sequence of tasks without catastrophic forgetting. While there is a rich set of techniques for CL, relatively little understanding exists on how representations built by previous tasks benefit new tasks that are added to the network. To address this, we study the problem of continual representation learning (CRL) where we learn an evolving representation as new tasks arrive. Focusing on zero-forgetting methods where tasks are embedded in subnetworks (e.g., PackNet), we first provide experiments demonstrating CRL can significantly boost sample efficiency when learning new tasks. To explain this, we establish theoretical guarantees for CRL by providing sample complexity and generalization error bounds for new tasks by formalizing the statistical benefits of previously-learned representations. Our analysis and experiments also highlight the importance of the order in which we learn the tasks. Specifically, we show that CL benefits if the initial tasks have large sample size and high "representation diversity". Diversity ensures that adding new tasks incurs small representation mismatch and can be learned with few samples while training only few additional nonzero weights. Finally, we ask whether one can ensure each task subnetwork to be efficient during inference time while retaining the benefits of representation learning. To this end, we propose an inference-efficient variation of PackNet called Efficient Sparse PackNet (ESPN) which employs joint channel & weight pruning. ESPN embeds tasks in channel-sparse subnets requiring up to 80% less FLOPs to compute while approximately retaining accuracy and is very competitive with a variety of baselines. In summary, this work takes a step towards data and compute-efficient CL with a representation learning perspective. GitHub page: https://github.com/ucr-optml/CtRL
translated by 谷歌翻译
虽然深入学习在固定的大型数据集中取得了重大进展,但它通常遇到关于在开放世界场景,过度参数化和过度拟合小型样本中检测到未知/看不见的课程的挑战。由于生物系统可以很好地克服上述困难,因此个体从集体生物中继承了一个先天基因,这些生物已经进化了数十亿多年,然后通过少数例子学习新技能。灵感来自这一点,我们提出了一个实用的集体个人范式,其中进化(可扩展)网络在顺序任务上培训,然后识别现实世界中的未知课程。此外,提出了学习者,即用于学习目标模型的初始化规则的基因,从集体模型继承了元知识,并在目标任务中重建轻量级各个模型。特别地,根据梯度信息,提出了一种新的标准来发现集体模型中的学习者。最后,只有在目标学习任务上的少量样本才接受培训。我们在广泛的实证研究和理论分析中展示了我们方法的有效性。
translated by 谷歌翻译
While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully chosen trade-off between performance and resource consumption in terms of computation and energy. The development of such approaches is among the major challenges in current machine learning research and key to ensure a smooth transition of machine learning technology from a scientific environment with virtually unlimited computing resources into everyday's applications. In this article, we provide an overview of the current state of the art of machine learning techniques facilitating these real-world requirements. In particular, we focus on deep neural networks (DNNs), the predominant machine learning models of the past decade. We give a comprehensive overview of the vast literature that can be mainly split into three non-mutually exclusive categories: (i) quantized neural networks, (ii) network pruning, and (iii) structural efficiency. These techniques can be applied during training or as post-processing, and they are widely used to reduce the computational demands in terms of memory footprint, inference speed, and energy efficiency. We also briefly discuss different concepts of embedded hardware for DNNs and their compatibility with machine learning techniques as well as potential for energy and latency reduction. We substantiate our discussion with experiments on well-known benchmark datasets using compression techniques (quantization, pruning) for a set of resource-constrained embedded systems, such as CPUs, GPUs and FPGAs. The obtained results highlight the difficulty of finding good trade-offs between resource efficiency and predictive performance.
translated by 谷歌翻译
Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. As a complement to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain with the goal of performing well at test-time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially-costly target data labels. This survey will compare these approaches by examining alternative methods, the unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
translated by 谷歌翻译
我们介绍了SubGD,这是一种新颖的几声学习方法,基于最近的发现,即随机梯度下降更新往往生活在低维参数子空间中。在实验和理论分析中,我们表明模型局限于合适的预定义子空间,可以很好地推广用于几次学习。合适的子空间符合给定任务的三个标准:IT(a)允许通过梯度流量减少训练误差,(b)导致模型良好的模型,并且(c)可以通过随机梯度下降来识别。 SUBGD从不同任务的更新说明的自动相关矩阵的特征组合中标识了这些子空间。明确的是,我们可以识别出低维合适的子空间,用于对动态系统的几次学习,而动态系统具有不同的属性,这些属性由分析系统描述的一个或几个参数描述。这种系统在科学和工程领域的现实应用程序中无处不在。我们在实验中证实了SubGD在三个不同的动态系统问题设置上的优势,在样本效率和性能方面,均超过了流行的几次学习方法。
translated by 谷歌翻译
最近,面部生物识别是对传统认证系统的方便替代的巨大关注。因此,检测恶意尝试已经发现具有重要意义,导致面部抗欺骗〜(FAS),即面部呈现攻击检测。与手工制作的功能相反,深度特色学习和技术已经承诺急剧增加FAS系统的准确性,解决了实现这种系统的真实应用的关键挑战。因此,处理更广泛的发展以及准确的模型的新研究区越来越多地引起了研究界和行业的关注。在本文中,我们为自2017年以来对与基于深度特征的FAS方法相关的文献综合调查。在这一主题上阐明,基于各种特征和学习方法的语义分类。此外,我们以时间顺序排列,其进化进展和评估标准(数据集内集和数据集互联集合中集)覆盖了FAS的主要公共数据集。最后,我们讨论了开放的研究挑战和未来方向。
translated by 谷歌翻译
当在具有不同分布的数据集上不断学习时,神经网络往往会忘记以前学习的知识,这一现象被称为灾难性遗忘。数据集之间的分配更改会导致更多的遗忘。最近,基于参数 - 隔离的方法在克服遗忘时具有巨大的潜力。但是,当他们在培训过程中修复每个数据集的神经路径时,他们的概括不佳,并且在推断过程中需要数据集标签。此外,他们不支持向后的知识转移,因为它们优先于过去的数据。在本文中,我们提出了一种名为ADAPTCL的新的自适应学习方法,该方法完全重复使用并在学习的参数上生长,以克服灾难性的遗忘,并允许在不需要数据集标签的情况下进行积极的向后传输。我们提出的技术通过允许最佳的冷冻参数重复使用在相同的神经路径上生长。此外,它使用参数级数据驱动的修剪来为数据分配同等优先级。我们对MNIST变体,域和食物新鲜度检测数据集进行了广泛的实验,而无需数据集标签。结果表明,我们所提出的方法优于替代基线,可以最大程度地减少遗忘和实现积极的向后知识转移。
translated by 谷歌翻译
在许多增强学习(RL)应用中,观察空间由人类开发人员指定并受到物理实现的限制,因此可能会随时间的巨大变化(例如,观察特征的数量增加)。然而,当观察空间发生变化时,前一项策略可能由于输入特征不匹配而失败,并且另一个策略必须从头开始培训,这在计算和采样复杂性方面效率低。在理论上见解之后,我们提出了一种新颖的算法,该算法提取源任务中的潜在空间动态,并将动态模型传送到目标任务用作基于模型的常规程序。我们的算法适用于观察空间的彻底变化(例如,从向量的基于矢量的观察到图像的观察),没有任何任务映射或目标任务的任何先前知识。实证结果表明,我们的算法显着提高了目标任务中学习的效率和稳定性。
translated by 谷歌翻译
物联网中的智能汽车,智能手机和其他设备(物联网)通常具有多个传感器,会产生多模式数据。联合学习支持从不同设备收集大量多模式数据,而无需共享原始数据。转移学习方法有助于将知识从某些设备传输到其他设备。联合转移学习方法受益于联合学习和转移学习。这个新提出的联合转移学习框架旨在将数据岛与隐私保护联系起来。我们的构建基于联合学习和转移学习。与以前的联合转移学习相比,每个用户应具有相同模式的数据(所有单峰或全模式),我们的新框架更为通用,它允许使用用户数据的混合分布。核心策略是为我们的两种用户使用两种不同但固有连接的培训方法。仅对单峰数据(类型1)的用户采用监督学习,而自我监督的学习则用于使用多模式数据(类型2)的用户,以适用于每种模式的功能及其之间的连接。类型2的这种联系知识将在培训的后期阶段有助于1键入1。新框架中的培训可以分为三个步骤。在第一步中,将具有相同模式的数据的用户分组在一起。例如,仅具有声音信号的用户在第一组中,只有图像的用户在第二组中,并且具有多模式数据的用户在第三组中,依此类推。在第二步中,在小组内执行联合学习,在该小组中,根据小组的性质,使用监督的学习和自学学习。大多数转移学习发生在第三步中,从前步骤获得的网络中的相关部分是汇总的(联合)。
translated by 谷歌翻译
Deep learning has been the answer to many machine learning problems during the past two decades. However, it comes with two major constraints: dependency on extensive labeled data and training costs. Transfer learning in deep learning, known as Deep Transfer Learning (DTL), attempts to reduce such dependency and costs by reusing an obtained knowledge from a source data/task in training on a target data/task. Most applied DTL techniques are network/model-based approaches. These methods reduce the dependency of deep learning models on extensive training data and drastically decrease training costs. As a result, researchers detected Covid-19 infection on chest X-Rays with high accuracy at the beginning of the pandemic with minimal data using DTL techniques. Also, the training cost reduction makes DTL viable on edge devices with limited resources. Like any new advancement, DTL methods have their own limitations, and a successful transfer depends on some adjustments for different scenarios. In this paper, we review the definition and taxonomy of deep transfer learning and well-known methods. Then we investigate the DTL approaches by reviewing recent applied DTL techniques in the past five years. Further, we review some experimental analyses of DTLs to learn the best practice for applying DTL in different scenarios. Moreover, the limitations of DTLs (catastrophic forgetting dilemma and overly biased pre-trained models) are discussed, along with possible solutions and research trends.
translated by 谷歌翻译