当深层神经网络过于依赖培训数据集中的虚假相关性以解决下游任务时,就会发生快捷学习。先前的工作表明,这如何损害深度学习模型的组成概括能力。为了解决这个问题,我们提出了一种新的方法来减轻不受控制的目标域中的快捷方式学习。我们的方法使用附加的数据集(源域)扩展了训练集,该数据集(源域)是专门设计的,旨在促进学习基本视觉因素的独立表示。我们基于我们明确控制快捷机会以及现实世界目标域的合成目标域的想法。此外,我们分析了源域的不同规格和网络体系结构对组成概括的影响。我们的主要发现是,从源域中利用数据是减轻快捷方式学习的有效方法。通过促进学习表示的不同因素的独立性,网络可以学会仅考虑预测因素,并忽略推断期间潜在的快捷因素。
translated by 谷歌翻译
通过同时对已知类别进行分类并识别未知类别,将图像分类扩展到开放世界设置。尽管常规的OSR方法可以检测到分布(OOD)样本,但它们无法提供说明,表明哪些基本视觉属性(例如,形状,颜色或背景)导致特定样本未知。在这项工作中,我们介绍了一个新的问题设置,该设置将常规OSR推广到一个多属性设置,其中同时识别了多个视觉属性。在这里,不仅可以识别OOD样本,而且可以按其未知属性进行分类。我们提出了简单的常见OSR基线的扩展,以处理这种新颖的情况。我们表明,当培训数据集中存在虚假相关性时,这些基准很容易受到捷径。这导致了OOD性能差,根据我们的实验,这主要是由于预测的置信度得分的意外交叉分类相关性。我们提供了一个经验证据,表明这种行为在合成和现实世界数据集的不同基准之间是一致的。
translated by 谷歌翻译
许多数据集被指定:给定任务存在多个同样可行的解决方案。对于学习单个假设的方法,指定的指定可能是有问题的,因为实现低训练损失的不同功能可以集中在不同的预测特征上,从而在分布数据的数据上产生明显变化的预测。我们提出了Divdis,这是一个简单的两阶段框架,首先通过利用测试分布中的未标记数据来学习多种假设,以实现任务。然后,我们通过使用其他标签的形式或检查功能可视化的形式选择最小的其他监督来选择一个发现的假设之一来消除歧义。我们证明了Divdis找到在图像分类中使用强大特征的假设和自然语言处理问题的能力。
translated by 谷歌翻译
最近证明,接受SGD训练的神经网络优先依赖线性预测的特征,并且可以忽略复杂的,同样可预测的功能。这种简单性偏见可以解释他们缺乏分布(OOD)的鲁棒性。学习任务越复杂,统计工件(即选择偏见,虚假相关性)的可能性就越大比学习的机制更简单。我们证明可以减轻简单性偏差并改善了OOD的概括。我们使用对其输入梯度对齐的惩罚来训练一组类似的模型以不同的方式拟合数据。我们从理论和经验上展示了这会导致学习更复杂的预测模式的学习。 OOD的概括从根本上需要超出I.I.D.示例,例如多个培训环境,反事实示例或其他侧面信息。我们的方法表明,我们可以将此要求推迟到独立的模型选择阶段。我们获得了SOTA的结果,可以在视觉域偏置数据和概括方面进行视觉识别。该方法 - 第一个逃避简单性偏见的方法 - 突出了需要更好地理解和控制深度学习中的归纳偏见。
translated by 谷歌翻译
A grand goal in deep learning research is to learn representations capable of generalizing across distribution shifts. Disentanglement is one promising direction aimed at aligning a models representations with the underlying factors generating the data (e.g. color or background). Existing disentanglement methods, however, rely on an often unrealistic assumption: that factors are statistically independent. In reality, factors (like object color and shape) are correlated. To address this limitation, we propose a relaxed disentanglement criterion - the Hausdorff Factorized Support (HFS) criterion - that encourages a factorized support, rather than a factorial distribution, by minimizing a Hausdorff distance. This allows for arbitrary distributions of the factors over their support, including correlations between them. We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks, even under severe training correlations and correlation shifts, with in parts over +60% in relative improvement over existing disentanglement methods. In addition, we find that leveraging HFS for representation learning can even facilitate transfer to downstream tasks such as classification under distribution shifts. We hope our original approach and positive empirical results inspire further progress on the open problem of robust generalization.
translated by 谷歌翻译
组成零射击学习(CZSL)旨在使用从训练集中的属性对象组成中学到的知识来识别新的构图。先前的作品主要将图像和组合物投影到共同的嵌入空间中,以衡量其兼容性得分。但是,属性和对象都共享上面学到的视觉表示,导致模型利用虚假的相关性和对可见对的偏见。取而代之的是,我们重新考虑CZSL作为分布的概括问题。如果将对象视为域,我们可以学习对象不变的功能,以识别任何对象附加的属性。同样,当识别具有属性为域的对象时,还可以学习属性不变的功能。具体而言,我们提出了一个不变的特征学习框架,以在表示和梯度级别上对齐不同的域,以捕获与任务相关的内在特征。对两个CZSL基准测试的实验表明,所提出的方法显着优于先前的最新方法。
translated by 谷歌翻译
变异因素之间的相关性在现实数据中普遍存在。机器学习算法可能会受益于利用这种相关性,因为它们可以提高噪声数据的预测性能。然而,通常这种相关性不稳定(例如,它们可能在域,数据集或应用程序之间发生变化),我们希望避免利用它们。解剖学方法旨在学习捕获潜伏子空间变化不同因素的表示。常用方法涉及最小化潜伏子空间之间的相互信息,使得每个潜在的底层属性。但是,当属性相关时,这会失败。我们通过强制执行可用属性上的子空间之间的独立性来解决此问题,这允许我们仅删除不导致的依赖性,这些依赖性是由于训练数据中存在的相关结构。我们通过普发的方法实现这一目标,以最小化关于分类变量的子空间之间的条件互信息(CMI)。我们首先在理论上展示了CMI最小化是对高斯数据线性问题的稳健性解剖的良好目标。然后,我们基于MNIST和Celeba在现实世界数据集上应用我们的方法,并表明它会在相关偏移下产生脱屑和强大的模型,包括弱监督设置。
translated by 谷歌翻译
分发班次的稳健性对于部署现实世界中的机器学习模型至关重要。尽管如此必要的,但在定义导致这些变化的潜在机制以及评估跨多个不同的分发班次的稳健性的潜在机制很少。为此,我们介绍了一种框架,可实现各种分布换档的细粒度分析。我们通过评估在合成和现实世界数据集中分为五个类别的19个不同的方法来提供对当前最先进的方法的整体分析。总的来说,我们训练超过85架模型。我们的实验框架可以很容易地扩展到包括新方法,班次和数据集。我们发现,与以前的工作〜\ citep {gulrajani20}不同,该进度已经通过标准的ERM基线进行;特别是,在许多情况下,预先训练和增强(学习或启发式)提供了大的收益。但是,最好的方法在不同的数据集和班次上不一致。
translated by 谷歌翻译
对分布(OOD)数据的概括是人类自然的能力,但对于机器而言挑战。这是因为大多数学习算法强烈依赖于i.i.d.〜对源/目标数据的假设,这在域转移导致的实践中通常会违反。域的概括(DG)旨在通过仅使用源数据进行模型学习来实现OOD的概括。在过去的十年中,DG的研究取得了长足的进步,导致了广泛的方法论,例如,基于域的一致性,元学习,数据增强或合奏学习的方法,仅举几例;还在各个应用领域进行了研究,包括计算机视觉,语音识别,自然语言处理,医学成像和强化学习。在本文中,首次提供了DG中的全面文献综述,以总结过去十年来的发展。具体而言,我们首先通过正式定义DG并将其与其他相关领域(如域适应和转移学习)联系起来来涵盖背景。然后,我们对现有方法和理论进行了彻底的审查。最后,我们通过有关未来研究方向的见解和讨论来总结这项调查。
translated by 谷歌翻译
Neural image classifiers are known to undergo severe performance degradation when exposed to input that exhibits covariate-shift with respect to the training distribution. Successful hand-crafted augmentation pipelines aim at either approximating the expected test domain conditions or to perturb the features that are specific to the training environment. The development of effective pipelines is typically cumbersome, and produce transformations whose impact on the classifier performance are hard to understand and control. In this paper, we show that recent Text-to-Image (T2I) generators' ability to simulate image interventions via natural-language prompts can be leveraged to train more robust models, offering a more interpretable and controllable alternative to traditional augmentation methods. We find that a variety of prompting mechanisms are effective for producing synthetic training data sufficient to achieve state-of-the-art performance in widely-adopted domain-generalization benchmarks and reduce classifiers' dependency on spurious features. Our work suggests that further progress in T2I generation and a tighter integration with other research fields may represent a significant step towards the development of more robust machine learning systems.
translated by 谷歌翻译
以对象表示的学习背后的想法是,自然场景可以更好地建模为对象的组成及其关系,而不是分布式表示形式。可以将这种归纳偏置注入神经网络中,以可能改善具有多个对象的场景中下游任务的系统概括和性能。在本文中,我们在五个常见的多对象数据集上训练最先进的无监督模型,并评估细分指标和下游对象属性预测。此外,我们通过调查单个对象不超出分布的设置(例如,具有看不见的颜色,质地或形状或场景的全局属性)来研究概括和鲁棒性,例如,通过闭塞来改变,裁剪或增加对象的数量。从我们的实验研究中,我们发现以对象为中心的表示对下游任务很有用,并且通常对影响对象的大多数分布转移有用。但是,当分布转移以较低结构化的方式影响输入时,在模型和分布转移的情况下,分割和下游任务性能的鲁棒性可能会有很大差异。
translated by 谷歌翻译
零拍分类问题的大多数现有算法通常依赖于类别之间基于属性的语义关系,以实现新型类别的分类而不观察其任何实例。但是,训练零拍分类模型仍然需要训练数据集中的每个类(甚至是实例)的属性标记,这也是昂贵的。为此,在本文中,我们提出了一个新的问题场景:“我们是否能够为新颖的属性探测器/分类器获得零射击学习,并使用它们自动注释数据集以进行标记效率?”基本上,仅给予一小组探测器,这些探测器都学会了识别一些手动注释的属性(即,所见属性),我们的目标是以零射学学习方式综合新颖属性的探测器。我们所提出的方法,零拍摄的属性(ZSLA),这是我们最好的知识中的第一个,通过应用SET操作首先将所看到的属性分解为基本属性,然后重新组合地解决这一新的研究问题。这些基本属性进入了新颖的属性。进行广泛的实验以验证我们合成探测器的能力,以便准确地捕获新颖性的语义,并与其他基线方法相比,在检测和定位方面表现出优越的性能。此外,在CALTECH-UCSD鸟类-200-2011 DataSet上使用仅32个属性,我们所提出的方法能够合成其他207个新颖的属性,而在由我们合成重新注释的数据集上培训的各种广义零拍分类算法属性探测器能够提供可比性的性能与手动地理注释有关的那些。
translated by 谷歌翻译
Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness. A promising formulation is domain-invariant learning, which identifies the key issue of learning which features are domain-specific versus domaininvariant. An important assumption in this area is that the training examples are partitioned into "domains" or "environments". Our focus is on the more common setting where such partitions are not provided. We propose EIIL, a general framework for domain-invariant learning that incorporates Environment Inference to directly infer partitions that are maximally informative for downstream Invariant Learning. We show that EIIL outperforms invariant learning methods on the CMNIST benchmark without using environment labels, and significantly outperforms ERM on worst-group performance in the Waterbirds and CivilComments datasets. Finally, we establish connections between EIIL and algorithmic fairness, which enables EIIL to improve accuracy and calibration in a fair prediction problem.
translated by 谷歌翻译
We study the problem of object recognition for categories for which we have no training examples, a task also called zero-data or zero-shot learning. This situation has hardly been studied in computer vision research, even though it occurs frequently; the world contains tens of thousands of different object classes, and image collections have been formed and suitably annotated for only a few of them. To tackle the problem, we introduce attribute-based classification: Objects are identified based on a high-level description that is phrased in terms of semantic attributes, such as the object's color or shape. Because the identification of each such property transcends the specific learning task at hand, the attribute classifiers can be prelearned independently, for example, from existing image data sets unrelated to the current task. Afterward, new classes can be detected based on their attribute representation, without the need for a new training phase. In this paper, we also introduce a new data set, Animals with Attributes, of over 30,000 images of 50 animal classes, annotated with 85 semantic attributes. Extensive experiments on this and two more data sets show that attribute-based classification indeed is able to categorize images without access to any training images of the target classes.
translated by 谷歌翻译
广义零射击学习(GZSL)旨在培训一个模型,以在某些输出类别在监督学习过程中未知的情况下对数据样本进行分类。为了解决这一具有挑战性的任务,GZSL利用可见的(源)和看不见的(目标)类的语义信息来弥合所见类和看不见的类之间的差距。自引入以来,已经制定了许多GZSL模型。在这篇评论论文中,我们介绍了有关GZSL的全面评论。首先,我们提供了GZSL的概述,包括问题和挑战。然后,我们为GZSL方法介绍了分层分类,并讨论了每个类别中的代表性方法。此外,我们讨论了GZSL的可用基准数据集和应用程序,以及有关研究差距和未来研究方向的讨论。
translated by 谷歌翻译
Graph machine learning has been extensively studied in both academia and industry. Although booming with a vast number of emerging methods and techniques, most of the literature is built on the in-distribution hypothesis, i.e., testing and training graph data are identically distributed. However, this in-distribution hypothesis can hardly be satisfied in many real-world graph scenarios where the model performance substantially degrades when there exist distribution shifts between testing and training graph data. To solve this critical problem, out-of-distribution (OOD) generalization on graphs, which goes beyond the in-distribution hypothesis, has made great progress and attracted ever-increasing attention from the research community. In this paper, we comprehensively survey OOD generalization on graphs and present a detailed review of recent advances in this area. First, we provide a formal problem definition of OOD generalization on graphs. Second, we categorize existing methods into three classes from conceptually different perspectives, i.e., data, model, and learning strategy, based on their positions in the graph machine learning pipeline, followed by detailed discussions for each category. We also review the theories related to OOD generalization on graphs and introduce the commonly used graph datasets for thorough evaluations. Finally, we share our insights on future research directions. This paper is the first systematic and comprehensive review of OOD generalization on graphs, to the best of our knowledge.
translated by 谷歌翻译
The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train more than 12 000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.
translated by 谷歌翻译
深度神经网络已经证明了他们可以从数据中自动提取有意义的功能的能力。但是,在监督学习中,特定于用于培训的数据集的信息,但与手头的任务无关,可以在提取的表示中仍然被编码。该剩余信息引入了特定于域的偏差,削弱了泛化性能。在这项工作中,我们建议将信息分成与任务相关的表示及其互补情境表示。我们提出了一种原始方法,结合对抗特征预测器和循环重建,以解开单域监督案例中的这两个表示。然后,我们将该方法适应无监督的域适应问题,包括训练能够在源域和目标域上执行的模型。特别是,尽管没有训练标签,我们的方法促进了目标领域的解剖学。这使得能够将特定于域的特定任务信息隔离为公共表示。任务特定的表示允许有效地将从源域获取的知识转移到目标域。在单一域案中,我们展示了我们关于信息检索任务的陈述的质量以及由锐化的任务特定陈述引起的泛化效益。然后,我们在几个古典域适应基准上验证所提出的方法,并说明了解除域适应的益处。
translated by 谷歌翻译
尽管能够与过度能力网络概括,但深神经网络通常会学会滥用数据中的虚假偏见而不是使用实际的任务相关信息。由于此类快捷方式仅在收集的数据集中有效,因此由此产生的偏置模型在现实世界的投入上表现不佳,或导致意外的社交影响,例如性别歧视。为了抵消偏差的影响,现有方法可以利用辅助信息,这在实践中很少可获得,或者在训练数据中的无偏见样本中筛选,希望能够充分存在清洁样品。但是,这些关于数据的推定并不总是保证。在本文中,我们提出了通过生成偏差变换〜(CDVG)对比下展,该〜(CDVG)能够在现有的方法中经营,其中现有方法由于未偏置的偏差样品而不足的预设而下降。通过我们的观察,不仅如前所述的鉴别模型,而且生成模型倾向于关注偏差,CDVG使用翻译模型来将样本中的偏置转换为另一种偏差模式,同时保留任务相关信息。 。通过对比学习,我们将转化的偏见视图与另一个学习偏见,学习偏见不变的表示。综合和现实世界数据集的实验结果表明,我们的框架优于目前的最先进,并且有效地阻止模型即使在无偏差样本极为稀缺时也会被偏置。
translated by 谷歌翻译
The well-documented presence of texture bias in modern convolutional neural networks has led to a plethora of algorithms that promote an emphasis on shape cues, often to support generalization to new domains. Yet, common datasets, benchmarks and general model selection strategies are missing, and there is no agreed, rigorous evaluation protocol. In this paper, we investigate difficulties and limitations when training networks with reduced texture bias. In particular, we also show that proper evaluation and meaningful comparisons between methods are not trivial. We introduce BiasBed, a testbed for texture- and style-biased training, including multiple datasets and a range of existing algorithms. It comes with an extensive evaluation protocol that includes rigorous hypothesis testing to gauge the significance of the results, despite the considerable training instability of some style bias methods. Our extensive experiments, shed new light on the need for careful, statistically founded evaluation protocols for style bias (and beyond). E.g., we find that some algorithms proposed in the literature do not significantly mitigate the impact of style bias at all. With the release of BiasBed, we hope to foster a common understanding of consistent and meaningful comparisons, and consequently faster progress towards learning methods free of texture bias. Code is available at https://github.com/D1noFuzi/BiasBed
translated by 谷歌翻译