Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models applied in feature spaces discover domain-invariant representations, but they are difficult to visualize and sometimes fail to capture pixel-level and low-level domain shifts. Recent work has shown that generative adversarial networks combined with cycle-consistency constraints are surprisingly effective at mapping images between domains, even without the use of aligned image pairs. We propose a novel discriminatively trained Cycle-Consistent Adversarial Domain Adaptation model. CyCADA adapts representations at both the pixel level and feature level, enforces cycle-consistency while leveraging a task loss, and does not require aligned pairs. Our model can be applied in a variety of visual recognition and prediction settings. We show new state-of-the-art results across multiple adaptation tasks, including digit classification and semantic segmentation of road scenes, demonstrating transfer from synthetic to real-world domains.
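To make the combination of objectives concrete, here is a minimal PyTorch sketch of the source-side generator losses. The module names (G_st, G_ts, D_t, f_task) and loss weights are illustrative assumptions rather than the paper's exact formulation, and the feature-level adversarial term is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def cycada_losses(G_st, G_ts, D_t, f_task, x_s, y_s,
                  lam_cyc=10.0, lam_task=1.0):
    """One source-side step of a CyCADA-style objective (simplified sketch).

    G_st, G_ts : pixel-level generators (source->target, target->source)
    D_t        : discriminator on target-styled images
    f_task     : task network (e.g., a classifier) supervised by source labels
    x_s, y_s   : batch of source images and their labels
    """
    x_st = G_st(x_s)                     # restyle source images as target
    logits_fake = D_t(x_st)
    # GAN loss: the generator tries to make D_t score x_st as real target
    loss_gan = F.binary_cross_entropy_with_logits(
        logits_fake, torch.ones_like(logits_fake))
    # Cycle consistency: source -> target -> source must reconstruct x_s
    loss_cyc = F.l1_loss(G_ts(x_st), x_s)
    # Task (semantic) loss: translated images must keep their source labels
    loss_task = F.cross_entropy(f_task(x_st), y_s)
    return loss_gan + lam_cyc * loss_cyc + lam_task * loss_task
```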
We propose a general framework for unsupervised domain adaptation, which allows deep neural networks trained on a source domain to be tested on a different target domain without requiring any training annotations in the target domain. This is achieved by adding extra networks and losses that help regularize the features extracted by the backbone encoder network. To this end, we propose the novel use of the recently proposed unpaired image-to-image translation framework to constrain the features extracted by the encoder network. Specifically, we require that the extracted features are able to reconstruct the images in both domains. In addition, we require that the distributions of features extracted from images in the two domains are indistinguishable. Many recent works can be seen as specific cases of our general framework. We apply our method for domain adaptation between the MNIST, USPS, and SVHN datasets, and the Amazon, Webcam, and DSLR Office datasets in classification tasks, and also between the GTA5 and Cityscapes datasets for a segmentation task. We demonstrate state-of-the-art performance on each of these datasets.
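A hedged sketch of the two constraints, assuming an encoder E, a decoder Dec, and a feature-level domain discriminator D_feat (all names hypothetical):

```python
import torch
import torch.nn.functional as F

def regularizer_losses(E, Dec, D_feat, x_src, x_tgt):
    """Sketch of the two auxiliary constraints described above:
    (1) encoder features must reconstruct the images of both domains;
    (2) source/target feature distributions must be indistinguishable."""
    f_src, f_tgt = E(x_src), E(x_tgt)
    # (1) pixel-wise reconstruction from features, in both domains
    loss_rec = F.l1_loss(Dec(f_src), x_src) + F.l1_loss(Dec(f_tgt), x_tgt)
    # (2) the encoder tries to fool a feature-level domain discriminator
    # (convention: label 1 = source, so target features are pushed toward 1)
    logits_tgt = D_feat(f_tgt)
    loss_adv = F.binary_cross_entropy_with_logits(
        logits_tgt, torch.ones_like(logits_tgt))
    return loss_rec, loss_adv
```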
Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They can also improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. We propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new, more difficult cross-modality object classification task.
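The following is a minimal sketch of the alternation this setup implies, with one discriminator update followed by one target-encoder update; the source pretraining stage and optimizer construction are assumed, and the names are illustrative.

```python
import torch
import torch.nn.functional as F

def adda_step(E_src, E_tgt, D, opt_d, opt_e, x_src, x_tgt):
    """One ADDA-style alternation (illustrative sketch, not the exact recipe).
    E_src is pretrained on labeled source data and stays frozen;
    E_tgt starts from E_src's weights but is *untied* and updated here."""
    bce = F.binary_cross_entropy_with_logits
    # --- discriminator step: source features = 1, target features = 0 ---
    with torch.no_grad():
        f_src, f_tgt = E_src(x_src), E_tgt(x_tgt)
    d_src, d_tgt = D(f_src), D(f_tgt)
    loss_d = bce(d_src, torch.ones_like(d_src)) + \
             bce(d_tgt, torch.zeros_like(d_tgt))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # --- encoder step: inverted labels (standard GAN loss, no gradient flip) ---
    d_tgt = D(E_tgt(x_tgt))
    loss_e = bce(d_tgt, torch.ones_like(d_tgt))
    opt_e.zero_grad()
    loss_e.backward()
    opt_e.step()
```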
Semantic segmentation plays a fundamental role in a broad variety of computer vision applications, providing key information for the global understanding of an image. However, state-of-the-art models rely on large quantities of annotated samples, which are more expensive to obtain than in tasks such as image classification. Since unlabeled data is instead significantly cheaper to obtain, it is not surprising that unsupervised domain adaptation has reached broad success within the semantic segmentation community. This survey is an effort to summarize five years of this incredibly rapidly growing field, which embraces the importance of semantic segmentation itself and the critical need for adapting segmentation models to new environments. We present the most important semantic segmentation methods; we provide a comprehensive survey of domain adaptation techniques for semantic segmentation; we unveil newer trends such as multi-domain learning, domain generalization, test-time adaptation, and source-free domain adaptation; and we conclude this survey by describing the datasets and benchmarks most widely used in semantic segmentation research. We hope that this survey will provide researchers across academia and industry with a comprehensive reference guide and will help them foster new research directions in the field.
Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. To address this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain, with the goal of performing well at test time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels. This survey will compare these approaches by examining alternative methods, their unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
Classification models trained on biased datasets usually perform poorly on out-of-distribution samples because biased representations are embedded in the model. Recently, various debiasing methods have been proposed to disentangle biased representations, but discarding only the biased features is challenging without also altering other relevant information. In this paper, we propose a novel augmentation method that explicitly generates additional images, using the texture representations of differently labeled images, to enlarge the training dataset and mitigate the effect of biases when training a classifier. Each newly generated image contains similar content information from the source image while transferring the texture from a target image with a different label. Our model includes a texture co-occurrence loss, which determines whether the texture of a generated image is similar to that of the target, and a spatial self-similarity loss, which determines whether the content details between the generated and source images are preserved. Both the generated and original training images are further used to train a classifier, improving the robustness of the debiased representations. We demonstrate our method's ability to mitigate bias information using five artificially designed datasets with known biases. For all cases, our method outperforms existing state-of-the-art methods. Code is available at: https://github.com/myeongkyunkang/I2I4Debias
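As an illustration of one of the two losses, here is a rough sketch of a spatial self-similarity loss; the paper's exact patchwise formulation may differ, and the feature extractor producing the inputs is assumed to be given.

```python
import torch
import torch.nn.functional as F

def self_similarity(feat):
    """Cosine-similarity matrix between all spatial positions of a feature
    map (B x C x H x W -> B x HW x HW); a proxy for content layout."""
    f = F.normalize(feat.flatten(2), dim=1)   # B x C x HW, unit channel norm
    return torch.bmm(f.transpose(1, 2), f)    # B x HW x HW

def spatial_self_similarity_loss(feat_src, feat_gen):
    """Sketch: the generated image should preserve the source image's
    internal spatial structure even though its texture changes."""
    return F.l1_loss(self_similarity(feat_gen), self_similarity(feat_src))
```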
Using synthetic data to train neural networks that achieve good performance on real-world data is an important task, as it can reduce the need for costly data annotation. However, synthetic and real-world data exhibit a domain gap, which has been studied extensively in recent years under the name of domain adaptation. Closing the gap between the source (synthetic) and target data by performing the adaptation directly between the two is challenging. In this work, we propose a novel two-stage framework for improving domain adaptation techniques on image data. In the first stage, we progressively train a multi-scale neural network to perform image translation from the source domain to the target domain. We denote the newly transformed data as "Source in Target" (SIT). The generated SIT data is then fed as the input to any standard UDA approach. This new data has a reduced domain gap from the desired target domain, which helps the applied UDA approach close the remaining gap further. We highlight the effectiveness of our method by comparing it to other leading UDA and image-to-image translation techniques when used as SIT generators. Moreover, we demonstrate the improvement our framework brings to three state-of-the-art UDA methods for semantic segmentation (HRDA, DAFormer, and ProDA) on two UDA tasks, GTA5 to Cityscapes and SYNTHIA to Cityscapes.
Collecting well-annotated image datasets to train modern machine learning algorithms is prohibitively expensive for many tasks. An appealing alternative is to render synthetic data where ground-truth annotations are generated automatically. Unfortunately, models trained purely on rendered images often fail to generalize to real images. To address this shortcoming, prior work introduced unsupervised domain adaptation algorithms that attempt to map representations between the two domains or learn to extract features that are domain-invariant. In this work, we present a new approach that learns, in an unsupervised manner, a transformation in the pixel space from one domain to the other. Our generative adversarial network (GAN)-based model adapts source-domain images to appear as if drawn from the target domain. Our approach not only produces plausible samples, but also outperforms the state-of-the-art on a number of unsupervised domain adaptation scenarios by large margins. Finally, we demonstrate that the adaptation process generalizes to object classes unseen during training.
Convolutional neural network-based approaches for semantic segmentation rely on supervision with pixel-level ground truth, but may not generalize well to unseen image domains. As the labeling process is tedious and labor intensive, developing algorithms that can adapt source ground truth labels to the target domain is of great interest. In this paper, we propose an adversarial learning method for domain adaptation in the context of semantic segmentation. Considering semantic segmentations as structured outputs that contain spatial similarities between the source and target domains, we adopt adversarial learning in the output space. To further enhance the adapted model, we construct a multi-level adversarial network to effectively perform output-space domain adaptation at different feature levels. Extensive experiments and an ablation study are conducted under various domain adaptation settings, including synthetic-to-real and cross-city scenarios. We show that the proposed method performs favorably against the state-of-the-art methods in terms of accuracy and visual quality.
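A minimal single-level sketch of the output-space adversarial term; the multi-level variant sums this loss over several feature levels, and the names here are illustrative rather than the paper's implementation.

```python
import torch
import torch.nn.functional as F

def output_space_adv_loss(seg_net, D_out, x_tgt):
    """Adversarial loss in the *output space* (sketch): the discriminator
    sees softmax segmentation maps rather than intermediate features, so
    it can exploit the spatial layout structure shared across road scenes."""
    p_tgt = F.softmax(seg_net(x_tgt), dim=1)   # B x C x H x W class probabilities
    logits = D_out(p_tgt)                      # fully convolutional discriminator
    # The segmenter tries to make target outputs look source-like (label 1)
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```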
In this paper, we investigate a challenging unsupervised domain adaptation setting: unsupervised model adaptation. We aim to explore how to rely only on unlabeled target data to improve the performance of an existing source prediction model on the target domain, since labeled source data may not be available in some real-world scenarios due to data privacy issues. For this purpose, we propose a new framework, referred to as a collaborative class conditional generative adversarial net, to bypass the dependence on the source data. Specifically, the prediction model is improved through generated target-style data, which in turn provides more accurate guidance for the generator. As a result, the generator and the prediction model can collaborate with each other without source data. Furthermore, due to the lack of supervision from source data, we propose a weight constraint that encourages similarity to the source model. A clustering-based regularization is also introduced to produce more discriminative features in the target domain. Compared to conventional domain adaptation methods, our model achieves superior performance on multiple adaptation tasks with only unlabeled target data, which verifies its effectiveness in this challenging setting.
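As a sketch of the weight constraint idea, assuming a simple L2 penalty toward the frozen source model (the paper's exact form may differ):

```python
import torch

def weight_regularizer(model, source_model, lam=1e-3):
    """Source-free anchor (sketch): with no source data available to
    ground the training, penalize drift of the adapted model's parameters
    away from the frozen source model's parameters."""
    reg = sum((p - ps.detach()).pow(2).sum()
              for p, ps in zip(model.parameters(), source_model.parameters()))
    return lam * reg
```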
Semantic segmentation is a key problem for many computer vision tasks. While approaches based on convolutional neural networks constantly break new records on different benchmarks, generalizing well to diverse testing environments remains a major challenge. In numerous real-world applications, there is indeed a large gap between data distributions in train and test domains, which results in severe performance loss at run-time. In this work, we address the task of unsupervised domain adaptation in semantic segmentation with losses based on the entropy of the pixel-wise predictions. To this end, we propose two novel, complementary methods using (i) an entropy loss and (ii) an adversarial loss, respectively. We demonstrate state-of-the-art performance in semantic segmentation on two challenging "synthetic-2-real" set-ups and show that the approach can also be used for detection.
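The direct entropy loss can be sketched in a few lines; normalizing by log C follows the common convention of bounding the per-pixel entropy in [0, 1], and the details should be read as illustrative.

```python
import math
import torch
import torch.nn.functional as F

def entropy_loss(logits, eps=1e-12):
    """Pixel-wise prediction-entropy loss (sketch of the direct variant).
    Minimizing it on unlabeled target images pushes the segmenter toward
    the confident, low-entropy outputs it produces on source data.
    logits: B x C x H x W segmentation scores."""
    p = F.softmax(logits, dim=1)
    ent = -(p * torch.log(p + eps)).sum(dim=1)       # per-pixel entropy, B x H x W
    return (ent / math.log(logits.shape[1])).mean()  # normalized to [0, 1]
```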
Domain adaptation has attracted great interest because labeling is a costly and error-prone task, especially when pixel-level labels are required, as in semantic segmentation. It is therefore desirable to be able to train neural networks on synthetic domains where data is abundant and labels are precise. However, these models often perform poorly on out-of-domain images. To mitigate the shift in the inputs, image-to-image approaches can be used. Yet standard image-to-image approaches that bridge the deployment domain with the synthetic training domain do not focus on the downstream task, but only on the level of visual inspection. We therefore propose a "task aware" version of a GAN in an image-to-image domain adaptation approach. With the help of a small amount of labeled ground-truth data, we guide the image-to-image translation toward input images more suitable for a semantic segmentation network trained on synthetic data (the synthetic-domain expert). The main contributions of this work are: 1) a modular semi-supervised domain adaptation method that trains a downstream-task-aware CycleGAN while avoiding adapting the synthetic semantic segmentation expert; 2) the demonstration that the method is applicable to complex domain adaptation tasks; and 3) a less biased domain gap analysis using from-scratch networks. We evaluate our method on a classification task as well as on semantic segmentation. Our experiments show that on the classification task, our method outperforms standard image-to-image approaches by 7% in accuracy while using only 70 (10%) ground-truth images. For semantic segmentation, we can show an improvement of about 4% to 7% in mean intersection over union on the Cityscapes evaluation dataset, using only 14 ground-truth images during training.
We consider the problem of unsupervised domain adaptation in semantic segmentation. A key step in this task consists in reducing the domain shift, i.e., enforcing the data distributions of the two domains to be similar. One of the common strategies is to align the marginal distribution in the feature space through adversarial learning. However, this global alignment strategy does not consider the category-level joint distribution. A possible consequence of such global movement is that some categories which are originally well aligned between the source and target may be incorrectly mapped, thus leading to worse segmentation results in the target domain. To address this problem, we introduce a category-level adversarial network, aiming to enforce local semantic consistency during the trend of global alignment. Our idea is to take a close look at the category-level joint distribution and align each class with an adaptive adversarial loss. Specifically, we reduce the weight of the adversarial loss for category-level aligned features while increasing the adversarial force for those poorly aligned. In this process, we decide how well a feature is category-level aligned between source and target by a co-training approach. On two domain adaptation tasks, i.e., GTA5 → Cityscapes and SYNTHIA → Cityscapes, we validate that the proposed method matches the state of the art in segmentation accuracy.
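A rough sketch of the adaptive reweighting idea, using disagreement between two co-trained classifier heads as the alignment signal; the constant lam and the exact weighting function are assumptions, not the paper's formula.

```python
import torch
import torch.nn.functional as F

def weighted_adv_loss(p1, p2, d_logits, lam=0.4):
    """Category-level adaptive reweighting (illustrative sketch).
    p1, p2   : softmax maps (B x C x H x W) from two diverse classifier heads
    d_logits : per-pixel discriminator logits (B x 1 x H x W) on a target image
    Where the two heads disagree, the feature is likely poorly aligned, so
    the local adversarial force is amplified; where they agree it is damped."""
    # Per-pixel disagreement in [0, 2]: cosine distance between the heads
    disagreement = 1.0 - F.cosine_similarity(p1, p2, dim=1)   # B x H x W
    weight = 1.0 + lam * disagreement
    bce = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits), reduction="none")
    return (weight.unsqueeze(1) * bce).mean()
```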
Unsupervised domain adaptation (UDA) aims to adapt a model trained on a source domain to a new target domain where no labeled data is available. In this work, we investigate the problem of UDA from a synthetic, computer-generated domain to a similar but real-world domain for learning semantic segmentation. We propose a semantically consistent image-to-image translation method in combination with a consistency regularization method for UDA. We overcome previous limitations on transferring synthetic images to real-looking images. We leverage pseudo-labels in order to learn a generative image-to-image translation model that receives additional feedback from semantic labels on both domains. Our method outperforms state-of-the-art methods that combine image-to-image translation and semi-supervised learning on relevant domain adaptation benchmarks built on Cityscapes and SYNTHIA.
Deep domain adaptation has emerged as a new learning technique to address the lack of massive amounts of labeled data. Compared to conventional methods, which learn shared feature subspaces or reuse important source instances with shallow representations, deep domain adaptation methods leverage deep networks to learn more transferable representations by embedding domain adaptation in the pipeline of deep learning. There have been comprehensive surveys of shallow domain adaptation, but few timely reviews of the emerging deep-learning-based methods. In this paper, we provide a comprehensive survey of deep domain adaptation methods for computer vision applications, with four major contributions. First, we present a taxonomy of different deep domain adaptation scenarios according to the properties of the data that define how the two domains diverge. Second, we summarize deep domain adaptation approaches into several categories based on training loss, and briefly analyze and compare the state-of-the-art methods within these categories. Third, we overview the computer vision applications that go beyond image classification, such as face recognition, semantic segmentation, and object detection. Fourth, some potential deficiencies of current methods and several future directions are highlighted.
In recent years, deep learning has become one of the most effective computer vision tools for remote sensing scientists. However, remote sensing datasets lack training labels, which means scientists must address the domain adaptation problem to narrow the discrepancy between satellite image datasets. As a result, image segmentation models trained afterwards generalize better and can reuse an existing set of labels instead of requiring new ones. This work proposes an unsupervised domain adaptation model that preserves the semantic consistency and per-pixel quality of the images during the style transfer phase. The first contribution of this paper is an improved architecture of the SemI2I model, which significantly boosts the performance of the proposed model and makes it competitive with the state-of-the-art CyCADA model. The second contribution is testing the CyCADA model on multi-band remote sensing datasets such as WorldView-2 and SPOT-6. Because the proposed model preserves semantic consistency and per-pixel quality during style transfer, the semantic segmentation model trained on the adapted images shows substantial performance gains compared to the SemI2I model and reaches results similar to those of the state-of-the-art CyCADA model. Future development of the proposed method could include ecological domain transfer, a priori quality assessment of data distributions, or exploration of the inner architecture of the domain adaptation model.
In this work, we present a method for unsupervised domain adaptation. Many adversarial learning methods train domain classifier networks to distinguish the features as either source or target, and train a feature generator network to mimic the discriminator. Two problems exist with these methods. First, the domain classifier only tries to distinguish the features as source or target, and thus does not consider task-specific decision boundaries between classes. Therefore, a trained generator can generate ambiguous features near class boundaries. Second, these methods aim to completely match the feature distributions between different domains, which is difficult because of each domain's characteristics. To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries. We propose to maximize the discrepancy between two classifiers' outputs to detect target samples that are far from the support of the source. A feature generator learns to generate target features near the support to minimize the discrepancy. Our method outperforms other methods on several datasets of image classification and semantic segmentation. The code is available at https://github.com/mil-tokyo/MCD_DA
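The core quantity is small enough to sketch directly; the surrounding three-step training loop is summarized in comments, with generic names G, F1, F2 for the feature generator and the two classifiers.

```python
import torch
import torch.nn.functional as F

def discrepancy(logits1, logits2):
    """L1 distance between two classifiers' probability outputs: the
    quantity maximized by the classifiers and minimized by the feature
    generator on target samples (sketch)."""
    return (F.softmax(logits1, dim=1) - F.softmax(logits2, dim=1)).abs().mean()

# Illustrative alternation:
# 1. train G, F1, F2 on labeled source data with cross-entropy;
# 2. fix G, update F1/F2 to MAXIMIZE discrepancy(F1(G(x_t)), F2(G(x_t)))
#    while keeping the source loss low (this exposes target samples that
#    lie outside the source support);
# 3. fix F1/F2, update G to MINIMIZE the same discrepancy.
```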
Domain adaptation is a technique to address the lack of large amounts of labeled data in unseen environments. Unsupervised domain adaptation has been proposed to adapt models to new modalities using only labeled source data and unlabeled target-domain data. Although many image-space domain adaptation methods have been proposed to capture pixel-level domain shift, such techniques may fail to maintain high-level semantic information for segmentation tasks. In the case of biomedical images, fine details such as blood vessels can be lost during the image translation operations between domains. In this work, we propose a model that adapts between domains using a cycle-consistent loss while maintaining the edge details of the original images by enforcing an edge-based loss during the adaptation process. We demonstrate the effectiveness of our algorithm by comparing it to other approaches on two eye fundus vessel segmentation datasets. We achieve an improvement of 1.1 to 9.2 points in Dice score compared to the state of the art, and of about 5.2 points compared to our baseline.
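A minimal sketch of what an edge-preserving term could look like, using Sobel gradients; the paper's exact edge operator is an assumption here, not its published formulation.

```python
import torch
import torch.nn.functional as F

def sobel_edges(img):
    """Per-channel Sobel gradient magnitude (used to compare edge maps)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    c = img.shape[1]
    kx = kx.view(1, 1, 3, 3).repeat(c, 1, 1, 1).to(img)
    ky = ky.view(1, 1, 3, 3).repeat(c, 1, 1, 1).to(img)
    gx = F.conv2d(img, kx, padding=1, groups=c)
    gy = F.conv2d(img, ky, padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def edge_loss(x, x_translated):
    """Sketch: penalize vessel/edge detail lost during the translation
    by matching the edge maps of the original and translated images."""
    return F.l1_loss(sobel_edges(x_translated), sobel_edges(x))
```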
Domain adaptation for semantic image segmentation is very necessary, since manually labeling large datasets with pixel-level labels is expensive and time consuming. Existing domain adaptation techniques either work on limited datasets or yield performance that falls short of supervised learning. In this paper, we propose a novel bidirectional learning framework for domain adaptation of segmentation. Using bidirectional learning, the image translation model and the segmentation adaptation model can be learned alternately and promote each other. Furthermore, we propose a self-supervised learning algorithm to learn a better segmentation adaptation model and, in return, improve the image translation model. Experiments show that our method is superior to the state-of-the-art methods in domain adaptation of segmentation by a large margin. The source code is available at https://github.com/liyunsheng13/BDL.
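The self-supervised step can be sketched as confidence-thresholded pseudo-labeling; the 0.9 threshold is an assumption, and BDL's actual recipe may differ.

```python
import torch
import torch.nn.functional as F

def make_pseudo_labels(seg_logits, threshold=0.9, ignore_index=255):
    """Confidence-thresholded pseudo-labels for the self-supervised step
    (a common recipe; threshold value assumed). Pixels whose max class
    probability falls below the threshold are marked ignore_index and
    excluded from the segmentation loss."""
    probs = F.softmax(seg_logits, dim=1)
    conf, labels = probs.max(dim=1)             # both B x H x W
    labels[conf < threshold] = ignore_index     # drop low-confidence pixels
    return labels

# The pseudo-labels then supervise the segmenter on target images, e.g.:
# loss = F.cross_entropy(new_seg_logits, labels, ignore_index=255)
```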