Human adaptability relies crucially on the ability to learn and merge knowledge both from supervised and unsupervised learning: the parents point out few important concepts, but then the children fill in the gaps on their own. This is particularly effective, because supervised learning can never be exhaustive and thus learning autonomously allows to discover invariances and regularities that help to generalize. In this paper we propose to apply a similar approach to the task of object recognition across domains: our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals how to solve a jigsaw puzzle on the same images. This secondary task helps the network to learn the concepts of spatial correlation while acting as a regularizer for the classification task. Multiple experiments on the PACS, VLCS, Office-Home and digits datasets confirm our intuition and show that this simple method outperforms previous domain generalization and adaptation solutions. An ablation study further illustrates the inner workings of our approach.
translated by 谷歌翻译
Domain Adaptation is an actively researched problem in Computer Vision. In this work, we propose an approach that leverages unsupervised data to bring the source and target distributions closer in a learned joint feature space. We accomplish this by inducing a symbiotic relationship between the learned embedding and a generative adversarial network. This is in contrast to methods which use the adversarial framework for realistic data generation and retraining deep models with such data. We demonstrate the strength and generality of our approach by performing experiments on three different tasks with varying levels of difficulty: (1) Digit classification (MNIST, SVHN and USPS datasets) (2) Object recognition using OFFICE dataset and (3) Domain adaptation from synthetic to real data. Our method achieves state-of-the art performance in most experimental settings and by far the only GAN-based method that has been shown to work well across different datasets such as OFFICE and DIGITS.
translated by 谷歌翻译
对分布(OOD)数据的概括是人类自然的能力,但对于机器而言挑战。这是因为大多数学习算法强烈依赖于i.i.d.〜对源/目标数据的假设,这在域转移导致的实践中通常会违反。域的概括(DG)旨在通过仅使用源数据进行模型学习来实现OOD的概括。在过去的十年中,DG的研究取得了长足的进步,导致了广泛的方法论,例如,基于域的一致性,元学习,数据增强或合奏学习的方法,仅举几例;还在各个应用领域进行了研究,包括计算机视觉,语音识别,自然语言处理,医学成像和强化学习。在本文中,首次提供了DG中的全面文献综述,以总结过去十年来的发展。具体而言,我们首先通过正式定义DG并将其与其他相关领域(如域适应和转移学习)联系起来来涵盖背景。然后,我们对现有方法和理论进行了彻底的审查。最后,我们通过有关未来研究方向的见解和讨论来总结这项调查。
translated by 谷歌翻译
This work provides a unified framework for addressing the problem of visual supervised domain adaptation and generalization with deep models. The main idea is to exploit the Siamese architecture to learn an embedding subspace that is discriminative, and where mapped visual domains are semantically aligned and yet maximally separated. The supervised setting becomes attractive especially when only few target data samples need to be labeled. In this scenario, alignment and separation of semantic probability distributions is difficult because of the lack of data. We found that by reverting to point-wise surrogates of distribution distances and similarities provides an effective solution. In addition, the approach has a high "speed" of adaptation, which requires an extremely low number of labeled target training samples, even one per category can be effective. The approach is extended to domain generalization. For both applications the experiments show very promising results.
translated by 谷歌翻译
语义分割在广泛的计算机视觉应用中起着基本作用,提供了全球对图像​​的理解的关键信息。然而,最先进的模型依赖于大量的注释样本,其比在诸如图像分类的任务中获得更昂贵的昂贵的样本。由于未标记的数据替代地获得更便宜,因此无监督的域适应达到了语义分割社区的广泛成功并不令人惊讶。本调查致力于总结这一令人难以置信的快速增长的领域的五年,这包含了语义细分本身的重要性,以及将分段模型适应新环境的关键需求。我们提出了最重要的语义分割方法;我们对语义分割的域适应技术提供了全面的调查;我们揭示了多域学习,域泛化,测试时间适应或无源域适应等较新的趋势;我们通过描述在语义细分研究中最广泛使用的数据集和基准测试来结束本调查。我们希望本调查将在学术界和工业中提供具有全面参考指导的研究人员,并有助于他们培养现场的新研究方向。
translated by 谷歌翻译
Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias. Fine-tuning deep models in a new domain can require a significant amount of labeled data, which for many applications is simply not available. We propose a new CNN architecture to exploit unlabeled and sparsely labeled target domain data. Our approach simultaneously optimizes for domain invariance to facilitate domain transfer and uses a soft label distribution matching loss to transfer information between tasks. Our proposed adaptation method offers empirical performance which exceeds previously published results on two standard benchmark visual domain adaptation tasks, evaluated across supervised and semi-supervised adaptation settings.
translated by 谷歌翻译
Domain generalization (DG) is the challenging and topical problem of learning models that generalize to novel testing domains with different statistics than a set of known training domains. The simple approach of aggregating data from all source domains and training a single deep neural network end-to-end on all the data provides a surprisingly strong baseline that surpasses many prior published methods. In this paper we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime. Specifically, we decompose a deep network into feature extractor and classifier components, and then train each component by simulating it interacting with a partner who is badly tuned for the current domain. This makes both components more robust, ultimately leading to our networks producing state-of-the-art performance on three DG benchmarks. Furthermore, we consider the pervasive workflow of using an ImageNet trained CNN as a fixed feature extractor for downstream recognition tasks. Using the Visual Decathlon benchmark, we demonstrate that our episodic-DG training improves the performance of such a general purpose feature extractor by explicitly training a feature for robustness to novel problems. This shows that DG training can benefit standard practice in computer vision.
translated by 谷歌翻译
虽然在许多域内生成并提供了大量的未标记数据,但对视觉数据的自动理解的需求高于以往任何时候。大多数现有机器学习模型通常依赖于大量标记的训练数据来实现高性能。不幸的是,在现实世界的应用中,不能满足这种要求。标签的数量有限,手动注释数据昂贵且耗时。通常需要将知识从现有标记域传输到新域。但是,模型性能因域之间的差异(域移位或数据集偏差)而劣化。为了克服注释的负担,域适应(DA)旨在在将知识从一个域转移到另一个类似但不同的域中时减轻域移位问题。无监督的DA(UDA)处理标记的源域和未标记的目标域。 UDA的主要目标是减少标记的源数据和未标记的目标数据之间的域差异,并在培训期间在两个域中学习域不变的表示。在本文中,我们首先定义UDA问题。其次,我们从传统方法和基于深度学习的方法中概述了不同类别的UDA的最先进的方法。最后,我们收集常用的基准数据集和UDA最先进方法的报告结果对视觉识别问题。
translated by 谷歌翻译
无监督域适应(UDA)旨在将知识从相关但不同的良好标记的源域转移到新的未标记的目标域。大多数现有的UDA方法需要访问源数据,因此当数据保密而不相配在隐私问题时,不适用。本文旨在仅使用培训的分类模型来解决现实设置,而不是访问源数据。为了有效地利用适应源模型,我们提出了一种新颖的方法,称为源假设转移(拍摄),其通过将目标数据特征拟合到冻结源分类模块(表示分类假设)来学习目标域的特征提取模块。具体而言,拍摄挖掘出于特征提取模块的信息最大化和自我监督学习,以确保目标特征通过同一假设与看不见的源数据的特征隐式对齐。此外,我们提出了一种新的标签转移策略,它基于预测的置信度(标签信息),然后采用半监督学习来将目标数据分成两个分裂,然后提高目标域中的较为自信预测的准确性。如果通过拍摄获得预测,我们表示标记转移为拍摄++。关于两位数分类和对象识别任务的广泛实验表明,拍摄和射击++实现了与最先进的结果超越或相当的结果,展示了我们对各种视域适应问题的方法的有效性。代码可用于\ url {https://github.com/tim-learn/shot-plus}。
translated by 谷歌翻译
在图像分类中,获得足够的标签通常昂贵且耗时。为了解决这个问题,域适应通常提供有吸引力的选择,给出了来自类似性质但不同域的大量标记数据。现有方法主要对准单个结构提取的表示的分布,并且表示可以仅包含部分信息,例如,仅包含部分饱和度,亮度和色调信息。在这一行中,我们提出了多代表性适应,这可以大大提高跨域图像分类的分类精度,并且特别旨在对准由名为Inception Adaption Adationation模块(IAM)提取的多个表示的分布。基于此,我们呈现多色自适应网络(MRAN)来通过多表示对准完成跨域图像分类任务,该任向性可以捕获来自不同方面的信息。此外,我们扩展了最大的平均差异(MMD)来计算适应损耗。我们的方法可以通过扩展具有IAM的大多数前进模型来轻松实现,并且网络可以通过反向传播有效地培训。在三个基准图像数据集上进行的实验证明了备的有效性。代码已在https://github.com/easezyc/deep-transfer -learning上获得。
translated by 谷歌翻译
深度学习(DL)是各种计算机视觉任务中使用的主要方法,因为它在许多任务上取得了相关结果。但是,在具有部分或没有标记数据的实际情况下,DL方法也容易出现众所周知的域移位问题。多源无监督的域适应性(MSDA)旨在通过从一袋源模型中分配弱知识来学习未标记域的预测指标。但是,大多数作品进行域适应性仅利用提取的特征并从损失函数设计的角度降低其域的转移。在本文中,我们认为仅基于域级特征处理域移动不足,但是在功能空间上对此类信息进行对齐也是必不可少的。与以前的工作不同,我们专注于网络设计,并建议将多源版本的域对齐层(MS-DIAL)嵌入预测变量的不同级别。这些层旨在匹配不同域之间的特征分布,并且可以轻松地应用于各种MSDA方法。为了显示我们方法的鲁棒性,我们考虑了两个具有挑战性的情况:数字识别和对象分类,进行了广泛的实验评估。实验结果表明,我们的方法可以改善最新的MSDA方法,从而在其分类精度上获得 +30.64%的相对增长。
translated by 谷歌翻译
无监督域适应(UDA)旨在将知识从标记的源域传输到未标记的目标域。传统上,基于子空间的方法为此问题形成了一类重要的解决方案。尽管他们的数学优雅和易腐烂性,但这些方法通常被发现在产生具有复杂的现实世界数据集的领域不变的功能时无效。由于近期具有深度网络的代表学习的最新进展,本文重新访问了UDA的子空间对齐,提出了一种新的适应算法,始终如一地导致改进的泛化。与现有的基于对抗培训的DA方法相比,我们的方法隔离了特征学习和分配对准步骤,并利用主要辅助优化策略来有效地平衡域不契约的目标和模型保真度。在提供目标数据和计算要求的显着降低的同时,基于子空间的DA竞争性,有时甚至优于几种标准UDA基准测试的最先进的方法。此外,子空间对准导致本质上定期的模型,即使在具有挑战性的部分DA设置中,也表现出强大的泛化。最后,我们的UDA框架的设计本身支持对测试时间的新目标域的逐步适应,而无需从头开始重新检测模型。总之,由强大的特征学习者和有效的优化策略提供支持,我们将基于子空间的DA建立为可视识别的高效方法。
translated by 谷歌翻译
Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They also can improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. We propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new more difficult cross-modality object classification task.
translated by 谷歌翻译
无监督的域适应性(DA)中的主要挑战是减轻源域和目标域之间的域移动。先前的DA工作表明,可以使用借口任务来通过学习域不变表示来减轻此域的转移。但是,实际上,我们发现大多数现有的借口任务对其他已建立的技术无效。因此,我们从理论上分析了如何以及何时可以利用子公司借口任务来协助给定DA问题的目标任务并制定客观的子公司任务适用性标准。基于此标准,我们设计了一个新颖的贴纸干预过程和铸造贴纸分类的过程,作为监督的子公司DA问题,该问题与目标任务无监督的DA同时发生。我们的方法不仅改善了目标任务适应性能,而且还促进了面向隐私的无源DA,即没有并发源目标访问。标准Office-31,Office-Home,Domainnet和Visda基准的实验证明了我们对单源和多源无源DA的优势。我们的方法还补充了现有的无源作品,从而实现了领先的绩效。
translated by 谷歌翻译
Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. As a complement to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain with the goal of performing well at test-time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially-costly target data labels. This survey will compare these approaches by examining alternative methods, the unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
translated by 谷歌翻译
大多数现有的多源域适配(MSDA)方法通过特征分布对准最小化多个源 - 目标域对之间的距离,从单个源设置借用的方法。但是,对于不同的源极域,对齐成对特征分布是具有挑战性的,甚至可以对MSDA进行反效率。在本文中,我们介绍了一种新颖的方法:可转让的属性学习。动机很简单:虽然不同的域可以具有急剧不同的视野,但它们包含相同的类类,其特征在一起相同的属性;因此,MSDA模型应该专注于学习目标域的最可转换的属性。采用这种方法,我们提出了域名关注一致性网络,称为DAC网。关键设计是一个特征通道注意模块,旨在识别可转移功能(属性)。重要的是,注意模块受到一致性损失的监督,这对源极和目标域之间的信道注意权重的分布施加。此外,为了促进对目标数据的鉴别特征学习,我们将伪标记与类紧凑性丢失相结合,以最小化目标特征和分类器的权重向量之间的距离。在三个MSDA基准测试中进行了广泛的实验表明,我们的DAC-NET在所有这些中实现了新的最新性能。
translated by 谷歌翻译
域的概括(DG)旨在在一个或多个不同但相关的源域上学习一个模型,这些模型可以推广到看不见的目标域。现有的DG方法试图提示模型的概括能力的源域的多样性,同时他们可能必须引入辅助网络或达到计算成本。相反,这项工作应用了特征空间中的隐式语义增强来捕获源域的多样性。具体来说,包括距离度量学习(DML)的附加损失函数,以优化数据分布的局部几何形状。此外,采用跨熵损失的逻辑被无限增强作为DML损失的输入特征,以代替深度特征。我们还提供了理论分析,以表明逻辑可以近似于原始特征上定义的距离。此外,我们对方法背后的机制和理性进行了深入的分析,这使我们可以更好地了解为什么要代替特征的杠杆逻辑可以帮助域的概括。拟议的DML损失与隐式增强作用纳入了最近的DG方法中,即傅立叶增强联合老师框架(FACT)。同时,我们的方法也可以轻松地插入各种DG方法中。对三个基准测试(Digits-DG,PAC和办公室家庭)进行的广泛实验表明,该建议的方法能够实现最新的性能。
translated by 谷歌翻译
主流最先进的域泛化算法倾向于优先考虑跨域语义不变性的假设。同时,固有的域内风格不变性通常被低估并放在架子上。在本文中,我们揭示了利用域内风格的不变性,在提高域泛化效率方面也具有关键重要性。我们验证了网络对域功能不变并在实例之间共享的内容至关重要,以便网络锐化其理解并提高其语义判别能力。相应地,我们还提出了一种新颖的“陪审团”机制,在域之间学习有用的语义特征共性特别有效。我们的完整型号称为Steam可以被解释为新颖的概率图形模型,该图形模型需要方便的两种内存库的方便结构:语义特征银行和风格的功能库。经验结果表明,我们的拟议框架通过清晰的边缘超越了最先进的方法。
translated by 谷歌翻译
The problem of domain generalization is to learn from multiple training domains, and extract a domain-agnostic model that can then be applied to an unseen domain. Domain generalization (DG) has a clear motivation in contexts where there are target domains with distinct characteristics, yet sparse data for training. For example recognition in sketch images, which are distinctly more abstract and rarer than photos. Nevertheless, DG methods have primarily been evaluated on photo-only benchmarks focusing on alleviating the dataset bias where both problems of domain distinctiveness and data sparsity can be minimal. We argue that these benchmarks are overly straightforward, and show that simple deep learning baselines perform surprisingly well on them.In this paper, we make two main contributions: Firstly, we build upon the favorable domain shift-robust properties of deep learning methods, and develop a low-rank parameterized CNN model for end-to-end DG learning. Secondly, we develop a DG benchmark dataset covering photo, sketch, cartoon and painting domains. This is both more practically relevant, and harder (bigger domain shift) than existing benchmarks. The results show that our method outperforms existing DG alternatives, and our dataset provides a more significant DG challenge to drive future research.
translated by 谷歌翻译
Deep domain adaptation has emerged as a new learning technique to address the lack of massive amounts of labeled data. Compared to conventional methods, which learn shared feature subspaces or reuse important source instances with shallow representations, deep domain adaptation methods leverage deep networks to learn more transferable representations by embedding domain adaptation in the pipeline of deep learning. There have been comprehensive surveys for shallow domain adaptation, but few timely reviews the emerging deep learning based methods. In this paper, we provide a comprehensive survey of deep domain adaptation methods for computer vision applications with four major contributions. First, we present a taxonomy of different deep domain adaptation scenarios according to the properties of data that define how two domains are diverged. Second, we summarize deep domain adaptation approaches into several categories based on training loss, and analyze and compare briefly the state-of-the-art methods under these categories. Third, we overview the computer vision applications that go beyond image classification, such as face recognition, semantic segmentation and object detection. Fourth, some potential deficiencies of current methods and several future directions are highlighted.
translated by 谷歌翻译