We propose a novel method for unsupervised domain adaptation. Traditional machine learning algorithms often fail to generalize to new input distributions, causing reduced accuracy. Domain adaptation attempts to compensate for this performance degradation by transferring and adapting source knowledge to the target domain. Existing unsupervised methods project domains into a lower-dimensional space and attempt to align the subspace bases, effectively learning a mapping from source to target points or vice versa. However, they fail to account for the difference between the two distributions in the subspaces, resulting in misalignment even after adaptation. We present a unified view of existing subspace-mapping-based methods and develop a generalized approach that aligns the distributions as well as the subspace bases. We provide a detailed evaluation of our approach on benchmark datasets and show improved results over published approaches.
When designing classifiers for classification tasks, one is often confronted with situations where data distributions in the source domain differ from those in the target domain. This problem of domain adaptation is an important one that has received a lot of attention in recent years. In this paper, we study the challenging problem of unsupervised domain adaptation, where no labels are available in the target domain. In contrast to earlier works, which assume a single domain shift between the source and target domains, we allow for multiple domain shifts. Towards this, we develop a novel framework based on the parallel transport of the union of source subspaces on the Grassmann manifold. Various recognition experiments show that modeling data with a union of subspaces instead of a single subspace improves recognition performance.
In this paper, we introduce a new domain adaptation (DA) algorithm where the source and target domains are represented by subspaces spanned by eigenvectors. Our method seeks a domain-invariant feature space by learning a mapping function which aligns the source subspace with the target one. We show that the solution of the corresponding optimization problem can be obtained in a simple closed form, leading to an extremely fast algorithm. We present two approaches to determine the only hyper-parameter in our method, the size of the subspaces. In the first approach we tune the subspace size using a theoretical bound on the stability of the obtained result. In the second approach, we use maximum likelihood estimation to determine the subspace size, which is particularly useful for high-dimensional data. Apart from PCA, we propose a subspace creation method that outperforms partial least squares (PLS) and linear discriminant analysis (LDA) in domain adaptation. We test our method on various datasets and show that, despite its intrinsic simplicity, it outperforms state-of-the-art DA methods.
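The closed-form alignment this abstract describes is simple enough to sketch. The following is a minimal illustration, not the authors' code: PCA bases are taken as top right singular vectors, the alignment matrix is the product of the two bases, and all data here are synthetic.

```python
import numpy as np

def subspace_alignment(source, target, d):
    """Subspace-alignment sketch: map the source PCA basis onto the
    target PCA basis, then project each domain into the shared space.
    The subspace dimensionality d is the method's only hyper-parameter."""
    def pca_basis(X, d):
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Vt[:d].T                    # (D, d) orthonormal basis

    Bs = pca_basis(source, d)              # source basis
    Bt = pca_basis(target, d)              # target basis
    M = Bs.T @ Bt                          # closed-form alignment matrix
    source_aligned = source @ (Bs @ M)     # source projected through aligned basis
    target_proj = target @ Bt              # target projected onto its own basis
    return source_aligned, target_proj

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 10))            # synthetic source features
Xt = rng.normal(size=(120, 10)) + 1.0      # synthetic, shifted target features
Sa, Tp = subspace_alignment(Xs, Xt, d=3)
```

A classifier would then be trained on `Sa` with the source labels and applied to `Tp`.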
Recent studies have shown that recognition datasets are biased. Learning algorithms that pay no heed to those biases often result in classifiers with poor cross-dataset generalization. We are developing domain adaptation techniques to overcome those biases and yield classifiers with significantly improved performance when generalized to new testing datasets. Our work enables us to continue to harvest the benefits of existing vision datasets for the time being. Moreover, it also offers insights into how to construct new ones: in particular, domain adaptation raises the bar for collecting data, since the most informative data are those which cannot be classified well by learning algorithms that adapt from existing datasets.
In pattern recognition and computer vision, one is often faced with scenarios where the training data used to learn a model has a different distribution from the data on which the model is applied. Regardless of the cause, any distributional change that occurs after learning a classifier can degrade performance at test time. Domain adaptation tries to lessen this degradation. In this paper, we provide an overview of domain adaptation for visual recognition. We discuss the merits and drawbacks of existing domain adaptation approaches and identify promising avenues for research in this rapidly evolving field.
With unconstrained data acquisition scenarios widely prevalent, the ability to handle changes in data distribution across training and testing data sets becomes important. One way to approach this problem is through domain adaptation, and in this paper we primarily focus on the unsupervised scenario where the labeled source domain training data is accompanied by unlabeled target domain test data. We present a two-stage data-driven approach that generates intermediate data representations which provide relevant information on the domain shift. Starting with a linear representation of domains in the form of generative subspaces of the same dimension for the source and target domains, we first utilize the underlying geometry of the space of these subspaces, the Grassmann manifold, to obtain a 'shortest' geodesic path between the two domains. We then sample points along the geodesic to obtain intermediate cross-domain data representations, from which a discriminative classifier is learnt to estimate the labels of the target data. We subsequently incorporate non-linear representations of domains by considering a Reproducing Kernel Hilbert Space representation and a low-dimensional manifold representation using Laplacian Eigenmaps, and also examine other domain adaptation settings such as (i) semi-supervised adaptation, where the target domain is partially labeled, and (ii) multi-domain adaptation, where there could be more than one domain in the source and/or target. Finally, we supplement our adaptation technique with (i) fine-grained reference domains, created by blending samples from the source and the target to provide some evidence on the actual domain shift, and (ii) a multi-class boosting analysis to obtain robustness to the choice of algorithm parameters. We evaluate our approach on object recognition problems and report competitive results on the two widely used Office and Bing adaptation datasets.
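The geodesic-sampling step above can be sketched with the standard principal-angle parameterization of Grassmann geodesics. This is an illustrative reconstruction under that standard formulation, not the paper's code; bases and dimensions below are arbitrary.

```python
import numpy as np

def grassmann_geodesic(B1, B2, t):
    """Subspace at fraction t along the geodesic from span(B1) to span(B2)
    on the Grassmann manifold; B1, B2 are D x d with orthonormal columns.
    Uses the standard principal-angle parameterization of the geodesic."""
    M = B1.T @ B2                                # overlap between the bases
    G = (B2 - B1 @ M) @ np.linalg.inv(M)         # direction orthogonal to B1
    U, sigma, Vt = np.linalg.svd(G, full_matrices=False)
    theta = np.arctan(sigma)                     # principal angles
    return B1 @ Vt.T @ np.diag(np.cos(t * theta)) + U @ np.diag(np.sin(t * theta))

# illustrative source/target generative subspaces from random data
rng = np.random.default_rng(0)
B1, _ = np.linalg.qr(rng.normal(size=(8, 2)))    # source subspace basis
B2, _ = np.linalg.qr(rng.normal(size=(8, 2)))    # target subspace basis
# sample intermediate subspaces along the geodesic
path = [grassmann_geodesic(B1, B2, t) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

Projecting labeled source data onto each sampled basis yields the intermediate cross-domain representations the abstract refers to.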
Adapting a classifier trained on a source domain to recognize instances from a new target domain is an important problem that is receiving recent attention. In this paper, we present one of the first studies on unsupervised domain adaptation in the context of object recognition, where we have labeled data only from the source domain (and therefore do not have correspondences between object categories across domains). Motivated by incremental learning, we create intermediate representations of data between the two domains by viewing the generative subspaces (of the same dimension) created from these domains as points on the Grassmann manifold, and sampling points along the geodesic between them to obtain subspaces that provide a meaningful description of the underlying domain shift. We then obtain the projections of labeled source domain data onto these subspaces, from which a discriminative classifier is learnt to classify projected data from the target domain. We discuss extensions of our approach for semi-supervised adaptation, and for cases with multiple source and target domains, and report competitive results on standard datasets.
Unlike human learning, machine learning often fails to handle changes between training (source) and test (target) input distributions. Such domain shifts, common in practical scenarios, severely damage the performance of conventional machine learning methods. Supervised domain adaptation methods have been proposed for the case when the target data have labels, including some that perform very well despite being "frustratingly easy" to implement. However, in practice, the target domain is often unlabeled, requiring unsupervised adaptation. We propose a simple, effective, and efficient method for unsupervised domain adaptation called CORrelation ALignment (CORAL). CORAL minimizes domain shift by aligning the second-order statistics of the source and target distributions, without requiring any target labels. Even though it is extraordinarily simple — it can be implemented in four lines of Matlab code — CORAL performs remarkably well in extensive evaluations on standard benchmark datasets.
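The "four lines of Matlab" translate almost directly into numpy. Below is a hedged sketch of the second-order alignment (whiten source, re-color with target covariance); the identity regularizer follows the paper's recipe, while the data and function names are illustrative.

```python
import numpy as np

def coral(source, target):
    """CORAL sketch: whiten the source features, then re-color them with
    the target covariance so that second-order statistics match.
    The +I term keeps both covariance matrices well-conditioned."""
    def matrix_power(C, p):  # symmetric PD matrix power via eigendecomposition
        w, V = np.linalg.eigh(C)
        return (V * np.maximum(w, 1e-12) ** p) @ V.T

    Cs = np.cov(source, rowvar=False) + np.eye(source.shape[1])
    Ct = np.cov(target, rowvar=False) + np.eye(target.shape[1])
    return source @ matrix_power(Cs, -0.5) @ matrix_power(Ct, 0.5)

rng = np.random.default_rng(1)
Xs = rng.normal(size=(500, 4))                                   # source features
Xt = rng.normal(size=(600, 4)) @ np.diag([1.0, 2.0, 0.5, 3.0])   # re-scaled target
Xs_adapted = coral(Xs, Xt)
```

After the transform, a classifier trained on `Xs_adapted` with the source labels can be applied directly to the target features.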
Domain adaptation (DA) aims to generalize a learning model across training and testing data despite the mismatch of their data distributions. In light of a theoretical estimation of the upper error bound, we argue in this paper that an effective DA method should 1) search for a shared feature subspace where source and target data are not only aligned in terms of distributions, as most state-of-the-art DA methods do, but also discriminative, in that instances of different classes are well separated; and 2) account for the geometric structure of the underlying data manifold when inferring data labels on the target domain. In comparison with a baseline DA method which only cares about data distribution alignment between source and target, we derive three different DA models, namely CDDA, GA-DA, and DGA-DA, to highlight the contribution of Close yet Discriminative DA (CDDA) based on 1), Geometry Aware DA (GA-DA) based on 2), and finally Discriminative and Geometry Aware DA (DGA-DA) implementing 1) and 2) jointly. Using both synthetic and real data, we show the effectiveness of the proposed approach, which consistently outperforms state-of-the-art DA methods over 36 image classification DA tasks across 6 popular benchmarks. We further carry out an in-depth analysis of the proposed DA method, quantifying the contribution of each term of our DA model, and provide insights into the proposed DA methods by visualizing both real and synthetic data.
Domain-invariant representations are key to addressing the domain shift problem, where the training and test examples follow different distributions. Existing techniques that attempt to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some of the features may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: an unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and target domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a standard domain adaptation benchmark dataset.
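The distance between empirical distributions minimized here is a Maximum Mean Discrepancy. As a minimal sketch of that quantity — kernel choice and bandwidth are illustrative assumptions, and a full method would minimize this over projections rather than just evaluate it:

```python
import numpy as np

def mmd2_rbf(X, Y, gamma=1.0):
    """Biased empirical MMD^2 between samples X and Y with an RBF kernel:
    zero when the two samples coincide, growing with distribution mismatch."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))        # stand-in for projected source examples
Y = rng.normal(size=(50, 3)) + 5.0  # stand-in for a clearly shifted target
```

The adaptation method would then search for a projection matrix under which this statistic between projected source and target samples is small.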
Visual domain adaptation aims to learn robust classifiers for a target domain by leveraging knowledge from a source domain. Existing methods either attempt to align the cross-domain distributions or perform manifold subspace learning. However, two significant challenges remain: (1) degenerate feature transformation, meaning that distribution alignment is often performed in the original feature space, where feature distortions are hard to overcome; subspace learning, on the other hand, is insufficient to reduce the distribution divergence. (2) Unevaluated distribution alignment, meaning that existing distribution alignment methods assign the same importance to the marginal and conditional distributions, failing to assess the different importance these two distributions have in real applications. In this paper, we propose a Manifold Embedded Distribution Alignment (MEDA) approach to address these challenges. MEDA learns a domain-invariant classifier in the Grassmann manifold with structural risk minimization, while performing dynamic distribution alignment to quantitatively account for the relative importance of the marginal and conditional distributions. To the best of our knowledge, MEDA is the first attempt to perform dynamic distribution alignment for manifold domain adaptation. Extensive experiments demonstrate that MEDA significantly improves classification accuracy compared with state-of-the-art traditional and deep methods.
Domain adaptation (DA) addresses the real-world image classification problem of discrepancy between training (source) and testing (target) data distributions. We propose an unsupervised DA method for the setting where only unlabelled data are available in the target domain. Our approach centers on finding matches between samples of the source and target domains. The matches are obtained by treating the source and target domains as hyper-graphs and carrying out a class-regularized hyper-graph matching using first-, second-, and third-order similarities between the graphs. We have also developed a computationally efficient algorithm by initially selecting a subset of the samples to construct a graph and then developing a customized optimization routine for graph matching based on Conditional Gradient and the Alternating Direction Method of Multipliers. This allows the proposed method to be used widely. We also performed a set of experiments on standard object recognition datasets to validate the effectiveness of our framework over previous approaches.
It is expensive to obtain labeled real-world visual data for use in training supervised algorithms. Therefore, it is valuable to leverage existing databases of labeled data. However, the data in the source databases is often obtained under conditions that differ from those in the new task. Transfer learning provides techniques for transferring learned knowledge from a source domain to a target domain by finding a mapping between them. In this paper, we discuss a method for projecting both source and target data to a generalized subspace where each target sample can be represented by some combination of source samples. By employing a low-rank constraint during this transfer, the structure of the source and target domains is preserved. This approach has three benefits. First, good alignment between the domains is ensured, because only relevant data in some subspace of the source domain are used to reconstruct the data in the target domain. Second, the discriminative power of the source domain is naturally passed on to the target domain. Third, noisy information is filtered out during knowledge transfer. Extensive experiments on synthetic data and on important computer vision problems, such as face recognition and visual domain adaptation for object recognition, demonstrate the superiority of the proposed approach over existing, well-established methods.
We study the problem of unsupervised domain adaptation, which aims to adapt classifiers trained on a labeled source domain to an unlabeled target domain. Many existing approaches first learn domain-invariant features and then construct classifiers with them. We propose a novel approach that jointly learns both. Specifically, while the method identifies a feature space where data in the source and the target domains are similarly distributed, it also learns the feature space discriminatively, optimizing an information-theoretic metric as a proxy for the expected misclassification error on the target domain. We show how this optimization can be effectively carried out with simple gradient-based methods and how hyperparameters can be cross-validated without demanding any labeled data from the target domain. Empirical studies on benchmark tasks of object recognition and sentiment analysis validated our modeling assumptions and demonstrated significant improvement of our method over competing ones in classification accuracy.
Deep neural networks are able to learn powerful representations from large quantities of labeled input data, but they cannot always generalize well across changes in input distributions. Domain adaptation algorithms have been proposed to compensate for the performance degradation caused by domain shift. In this paper, we address the case when the target domain is unlabeled, requiring unsupervised adaptation. CORAL is a "frustratingly easy" unsupervised domain adaptation method that aligns the second-order statistics of the source and target distributions with a linear transformation. Here, we extend CORAL to learn a nonlinear transformation that aligns the correlations of layer activations in deep neural networks (Deep CORAL). Experiments on standard benchmark datasets show state-of-the-art performance.
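The nonlinear extension amounts to adding a differentiable "CORAL loss" on layer activations. A numpy stand-in for that loss is below; in a real network it would be computed on minibatch activations and backpropagated alongside the classification loss, and the function name here is illustrative.

```python
import numpy as np

def coral_loss(source_acts, target_acts):
    """Deep CORAL loss sketch: squared Frobenius distance between the
    covariances of source and target layer activations, scaled by
    1/(4 d^2) where d is the activation dimensionality."""
    d = source_acts.shape[1]
    cs = np.cov(source_acts, rowvar=False)
    ct = np.cov(target_acts, rowvar=False)
    return float(np.sum((cs - ct) ** 2) / (4.0 * d * d))

rng = np.random.default_rng(3)
acts = rng.normal(size=(256, 8))   # stand-in for a minibatch of activations
```

Minimizing this term pushes the network toward activations whose second-order statistics match across domains.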
Recent advances in deep domain adaptation show that adversarial learning can be embedded into deep networks to learn transferable features that reduce the distribution discrepancy between the source and target domains. Existing domain adversarial adaptation methods based on a single domain discriminator only align the source and target data distributions as wholes, without exploiting their complex multimode structures. In this paper, we present a multi-adversarial domain adaptation (MADA) approach, which captures multimode structures to enable fine-grained alignment of different data distributions based on multiple domain discriminators. The adaptation can be achieved by stochastic gradient descent, where the gradients are computed by back-propagation in linear time. Empirical evidence demonstrates that the proposed model outperforms state-of-the-art methods on standard domain adaptation datasets.
Domain adaptation addresses the problem where data instances of a source domain have different distributions from those of a target domain, which occurs frequently in many real-life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross-domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to, given multiple source domains. We present experiments on face recognition across pose, illumination, and blur variations, as well as cross-dataset object recognition, and report improved performance over the state of the art.
Domain adaptation builds an effective target classifier or regression model for unlabeled target data by utilizing well-labeled source data that follow a different distribution. Intuitively, to address the domain shift problem, it is crucial to learn domain-invariant features across domains, and most existing approaches have concentrated on this. However, they often do not directly constrain the learned features to be class discriminative for both source and target data, which is of vital importance for the final classification. Therefore, in this paper, we put forward a novel feature learning method for domain adaptation that constructs representations which are both Domain Invariant and Class Discriminative, referred to as DICD. Specifically, DICD learns a latent feature space that preserves important data properties: it reduces the domain difference by jointly matching the marginal and class-conditional distributions of both domains, and simultaneously maximizes the inter-class dispersion and minimizes the intra-class scatter as much as possible. The experiments in this paper demonstrate that the class-discriminative properties dramatically alleviate cross-domain distribution inconsistency, which further boosts classification performance. Moreover, we show that exploring both domain invariance and class discriminativeness of the learned representations can be integrated into one optimization framework, and the optimal solution can be derived effectively by solving a generalized eigen-decomposition problem. Comprehensive experiments on several visual cross-domain classification tasks verify that DICD significantly outperforms the competitors.
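The final solve the abstract mentions is a generalized eigen-decomposition, A w = λ B w. A numpy-only sketch of that step follows; the matrices A and B here are random placeholders standing in for the terms DICD actually builds (the quantities to maximize and to minimize, respectively), so only the solver pattern is illustrated.

```python
import numpy as np

# Placeholder symmetric matrices: A stands in for the quantities being
# maximized (e.g. inter-class dispersion), B for those being minimized
# (distribution mismatch plus intra-class scatter), regularized to be PD.
rng = np.random.default_rng(4)
Q = rng.normal(size=(6, 6))
A = Q @ Q.T                  # symmetric positive semi-definite
B = Q.T @ Q + np.eye(6)      # symmetric positive definite

# Solve A w = lambda B w via the symmetric reduction B^{-1/2} A B^{-1/2}
w, V = np.linalg.eigh(B)
B_inv_half = (V / np.sqrt(w)) @ V.T
evals, evecs = np.linalg.eigh(B_inv_half @ A @ B_inv_half)
top_vals = evals[::-1][:2]                 # largest generalized eigenvalues
W = (B_inv_half @ evecs)[:, ::-1][:, :2]   # projection onto the top-2 directions
```

The columns of `W` would serve as the learned projection; data from both domains are mapped through it before classification.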
Domain adaptation refers to the process of learning a prediction model in a target domain by leveraging data from a source domain. Many classical methods solve the domain adaptation problem by establishing a common latent space, which may cause the loss of many important properties of the two domains. In this manuscript, we develop a new method, Transfer Latent Representation (TLR), to learn a better latent space. Specifically, we design an objective function based on a simple linear autoencoder to derive the latent representations of both domains. The encoder in the autoencoder aims to project the data of both domains into a robust latent space. In addition, the decoder imposes a further constraint to reconstruct the original data, which preserves the common properties of the two domains and reduces the noise that causes domain shift. Experiments on cross-domain tasks demonstrate the advantages of TLR over competing methods.
Learning domain-invariant features is of vital importance to unsupervised domain adaptation, where classifiers trained on the source domain need to be adapted to a different target domain for which no labeled examples are available. In this paper, we propose a novel approach for learning such features. The central idea is to exploit the existence of landmarks, which are a subset of labeled data instances in the source domain that are distributed most similarly to the target domain. Our approach automatically discovers the landmarks and uses them to bridge the source to the target by constructing provably easier auxiliary domain adaptation tasks. The solutions of those auxiliary tasks form the basis for composing invariant features for the original task. We show how this composition can be optimized discriminatively without requiring labels from the target domain. We validate the method on standard benchmark datasets for visual object recognition and sentiment analysis of text. Empirical results show the proposed method outperforms the state of the art significantly.