清洁和不同标记的数据的可用性是培训复杂任务(例如视觉问答(VQA))的培训模型的主要障碍。大型视觉和语言模型的广泛工作表明,自我监督的学习对预处理多模式相互作用有效。在此技术报告中,我们专注于视觉表示。我们审查和评估自我监督的方法,以利用未标记的图像并预处理模型,然后我们对其进行了自定义VQA任务,该任务允许进行控制的评估和诊断。我们将基于能量的模型(EBM)与对比度学习(CL)进行比较。尽管EBM越来越受欢迎,但他们缺乏对下游任务的评估。我们发现,EBM和CL都可以从未标记的图像中学习表示形式,这些图像能够在很少的注释数据上训练VQA模型。在类似于CLEVR的简单设置中,我们发现CL表示还可以改善系统的概括,甚至匹配来自较大,监督,预测模型的表示的性能。但是,我们发现EBM由于不稳定性和结果差异很高而难以训练。尽管EBMS被证明对OOD检测有用,但基于监督的基于能量的训练和不确定性校准的其他结果在很大程度上是负面的。总体而言,CL当前似乎比EBM的选项更为可取。
translated by 谷歌翻译
机器学习模型通常会遇到与训练分布不同的样本。无法识别分布(OOD)样本,因此将该样本分配给课堂标签会显着损害模​​型的可靠性。由于其对在开放世界中的安全部署模型的重要性,该问题引起了重大关注。由于对所有可能的未知分布进行建模的棘手性,检测OOD样品是具有挑战性的。迄今为止,一些研究领域解决了检测陌生样本的问题,包括异常检测,新颖性检测,一级学习,开放式识别识别和分布外检测。尽管有相似和共同的概念,但分别分布,开放式检测和异常检测已被独立研究。因此,这些研究途径尚未交叉授粉,创造了研究障碍。尽管某些调查打算概述这些方法,但它们似乎仅关注特定领域,而无需检查不同领域之间的关系。这项调查旨在在确定其共同点的同时,对各个领域的众多著名作品进行跨域和全面的审查。研究人员可以从不同领域的研究进展概述中受益,并协同发展未来的方法。此外,据我们所知,虽然进行异常检测或单级学习进行了调查,但没有关于分布外检测的全面或最新的调查,我们的调查可广泛涵盖。最后,有了统一的跨域视角,我们讨论并阐明了未来的研究线,打算将这些领域更加紧密地融为一体。
translated by 谷歌翻译
Novelty detection, i.e., identifying whether a given sample is drawn from outside the training distribution, is essential for reliable machine learning. To this end, there have been many attempts at learning a representation well-suited for novelty detection and designing a score based on such representation. In this paper, we propose a simple, yet effective method named contrasting shifted instances (CSI), inspired by the recent success on contrastive learning of visual representations. Specifically, in addition to contrasting a given sample with other instances as in conventional contrastive learning methods, our training scheme contrasts the sample with distributionally-shifted augmentations of itself. Based on this, we propose a new detection score that is specific to the proposed training scheme. Our experiments demonstrate the superiority of our method under various novelty detection scenarios, including unlabeled one-class, unlabeled multi-class and labeled multi-class settings, with various image benchmark datasets. Code and pre-trained models are available at https://github.com/alinlab/CSI.
translated by 谷歌翻译
常规监督学习或分类的主要假设是,测试样本是从与训练样本相同的分布中得出的,该样本称为封闭设置学习或分类。在许多实际情况下,事实并非如此,因为测试数据中有未知数或看不见的类样本,这称为“开放式”方案,需要检测到未知数。该问题称为开放式识别问题,在安全至关重要的应用中很重要。我们建议通过学习成对相似性来检测未知数(或看不见的类样本)。提出的方法分为两个步骤。它首先使用培训中出现的所见类学习了一个封闭的集体分类器,然后学习如何将看到的类与伪单人(自动生成的看不见的类样本)进行比较。伪无表情的一代是通过对可见或训练样品进行分配转换增加而进行的。我们称我们的方法OPG(基于伪看不见的数据生成开放式识别)。实验评估表明,基于相似性的功能可以成功区分基准数据集中的未见特征,以进行开放式识别。
translated by 谷歌翻译
最近证明,接受SGD训练的神经网络优先依赖线性预测的特征,并且可以忽略复杂的,同样可预测的功能。这种简单性偏见可以解释他们缺乏分布(OOD)的鲁棒性。学习任务越复杂,统计工件(即选择偏见,虚假相关性)的可能性就越大比学习的机制更简单。我们证明可以减轻简单性偏差并改善了OOD的概括。我们使用对其输入梯度对齐的惩罚来训练一组类似的模型以不同的方式拟合数据。我们从理论和经验上展示了这会导致学习更复杂的预测模式的学习。 OOD的概括从根本上需要超出I.I.D.示例,例如多个培训环境,反事实示例或其他侧面信息。我们的方法表明,我们可以将此要求推迟到独立的模型选择阶段。我们获得了SOTA的结果,可以在视觉域偏置数据和概括方面进行视觉识别。该方法 - 第一个逃避简单性偏见的方法 - 突出了需要更好地理解和控制深度学习中的归纳偏见。
translated by 谷歌翻译
A recent popular approach to out-of-distribution (OOD) detection is based on a self-supervised learning technique referred to as contrastive learning. There are two main variants of contrastive learning, namely instance and class discrimination, targeting features that can discriminate between different instances for the former, and different classes for the latter. In this paper, we aim to understand the effectiveness and limitation of existing contrastive learning methods for OOD detection. We approach this in 3 ways. First, we systematically study the performance difference between the instance discrimination and supervised contrastive learning variants in different OOD detection settings. Second, we study which in-distribution (ID) classes OOD data tend to be classified into. Finally, we study the spectral decay property of the different contrastive learning approaches and examine how it correlates with OOD detection performance. In scenarios where the ID and OOD datasets are sufficiently different from one another, we see that instance discrimination, in the absence of fine-tuning, is competitive with supervised approaches in OOD detection. We see that OOD samples tend to be classified into classes that have a distribution similar to the distribution of the entire dataset. Furthermore, we show that contrastive learning learns a feature space that contains singular vectors containing several directions with a high variance which can be detrimental or beneficial to OOD detection depending on the inference approach used.
translated by 谷歌翻译
异常检测旨在识别来自正常数据分布的异常情况。该领域已经取得了许多进展,包括创新使用无监督的对比学习。然而,现有方法通常假设清洁训练数据,并且当数据包含未知异常时受限。本文介绍了一种新型半监督异常检测方法,统一了与无监督的对比学习的能源的模型的概念。 ELSA通过基于新能量函数的精心设计的微调步骤灌输对任何数据污染的鲁棒性,这些步骤迫使正常数据分为原型的类别。多种污染方案的实验表明,所提出的模型实现了SOTA性能。广泛的分析还验证了每个组件在所提出的模型中的贡献。除了实验之外,我们还提供了一种理论解释,对何对象学习独自无法检测到数据污染下的异常。
translated by 谷歌翻译
神经网络通常使预测依赖于数据集的虚假相关性,而不是感兴趣的任务的内在特性,面对分布外(OOD)测试数据的急剧下降。现有的De-Bias学习框架尝试通过偏置注释捕获特定的DataSet偏差,它们无法处理复杂的“ood方案”。其他人在低能力偏置模型或损失上隐含地识别数据集偏置,但在训练和测试数据来自相同分布时,它们会降低。在本文中,我们提出了一般的贪婪去偏见学习框架(GGD),它贪婪地训练偏置模型和基础模型,如功能空间中的梯度下降。它鼓励基础模型专注于用偏置模型难以解决的示例,从而仍然在测试阶段中的杂散相关性稳健。 GGD在很大程度上提高了各种任务的模型的泛化能力,但有时会过度估计偏置水平并降低在分配测试。我们进一步重新分析了GGD的集合过程,并将课程正规化为由课程学习启发的GGD,这取得了良好的分配和分发性能之间的权衡。对图像分类的广泛实验,对抗问题应答和视觉问题应答展示了我们方法的有效性。 GGD可以在特定于特定于任务的偏置模型的设置下学习更强大的基础模型,其中具有现有知识和自组合偏置模型而无需先验知识。
translated by 谷歌翻译
Neural networks are often utilised in critical domain applications (e.g. self-driving cars, financial markets, and aerospace engineering), even though they exhibit overconfident predictions for ambiguous inputs. This deficiency demonstrates a fundamental flaw indicating that neural networks often overfit on spurious correlations. To address this problem in this work we present two novel objectives that improve the ability of a network to detect out-of-distribution samples and therefore avoid overconfident predictions for ambiguous inputs. We empirically demonstrate that our methods outperform the baseline and perform better than the majority of existing approaches while still maintaining a competitive performance against the rest. Additionally, we empirically demonstrate the robustness of our approach against common corruptions and demonstrate the importance of regularisation and auxiliary information in out-of-distribution detection.
translated by 谷歌翻译
机器学习已经急剧提高,在多模式任务中缩小了人类的准确性差距,例如视觉问题答案(VQA)。但是,尽管人类在不确定的时候可以说“我不知道”(即避免回答问题),但这种能力在多模式研究中被大大忽略了,尽管此问题对VQA的使用很重要,而VQA实际上使用了VQA。设置。在这项工作中,我们为可靠的VQA提出了一个问题制定,我们更喜欢弃权,而不是提供错误的答案。我们首先为多种VQA模型提供了弃戒功能,并分析了它们的覆盖范围,回答的问题的一部分和风险,该部分的错误。为此,我们探索了几种弃权方法。我们发现,尽管最佳性能模型在VQA V2数据集上实现了超过71%的准确性,但通过直接使用模型的SoftMax得分介绍了弃权的选项,限制了它们的少于8%的问题,以达到错误的错误风险(即1%)。这促使我们利用多模式选择功能直接估计预测答案的正确性,我们显示的可以将覆盖率增加,例如,在1%风险下,2.4倍从6.8%到16.3%。尽管分析覆盖范围和风险很重要,但这些指标具有权衡,这使得比较VQA模型具有挑战性。为了解决这个问题,我们还建议对VQA的有效可靠性指标,与弃权相比,将不正确的答案的成本更大。 VQA的这种新问题制定,度量和分析为构建有效和可靠的VQA模型提供了基础,这些模型具有自我意识,并且只有当他们不知道答案时才戒除。
translated by 谷歌翻译
我们可以在单个网络中训练混合歧视生成模型吗?最近在肯定中回答了这个问题,引入了基于联合能量的模型(JEM)的领域,该模型(JEM)同时达到了高分类的精度和图像生成质量。尽管有最近的进步,但仍存在两个性能差距:标准软磁性分类器的准确性差距,以及最先进的生成模型的发电质量差距。在本文中,我们引入了各种培训技术,以弥合JEM的准确性差距和一代质量差距。 1)我们结合了最近提出的清晰度最小化(SAM)框架来训练JEM,从而促进了能量景观的平滑度和JEM的普遍性。 2)我们将数据扩展排除在JEM的最大似然估计管道中,并减轻数据增强对图像生成质量的负面影响。在多个数据集上进行的广泛实验表明,我们的Sada-Jem在图像分类,图像产生,校准,分布外检测和对抗性鲁棒性方面实现了最先进的表现,并优于JEM JEM。
translated by 谷歌翻译
Determining whether inputs are out-of-distribution (OOD) is an essential building block for safely deploying machine learning models in the open world. However, previous methods relying on the softmax confidence score suffer from overconfident posterior distributions for OOD data. We propose a unified framework for OOD detection that uses an energy score. We show that energy scores better distinguish in-and out-of-distribution samples than the traditional approach using the softmax scores. Unlike softmax confidence scores, energy scores are theoretically aligned with the probability density of the inputs and are less susceptible to the overconfidence issue. Within this framework, energy can be flexibly used as a scoring function for any pre-trained neural classifier as well as a trainable cost function to shape the energy surface explicitly for OOD detection. On a CIFAR-10 pre-trained WideResNet, using the energy score reduces the average FPR (at TPR 95%) by 18.03% compared to the softmax confidence score. With energy-based training, our method outperforms the state-of-the-art on common benchmarks.
translated by 谷歌翻译
检测到分布输入对于在现实世界中安全部署机器学习模型至关重要。然而,已知神经网络遭受过度自信的问题,在该问题中,它们对分布和分布的输入的信心异常高。在这项工作中,我们表明,可以通过在训练中实施恒定的向量规范来通过logit归一化(logitnorm)(logitnorm)来缓解此问题。我们的方法是通过分析的激励,即logit的规范在训练过程中不断增加,从而导致过度自信的产出。因此,LogitNorm背后的关键思想是将网络优化期间输出规范的影响解散。通过LogitNorm培训,神经网络在分布数据和分布数据之间产生高度可区分的置信度得分。广泛的实验证明了LogitNorm的优势,在公共基准上,平均FPR95最高为42.30%。
translated by 谷歌翻译
Prompt tuning is a new few-shot transfer learning technique that only tunes the learnable prompt for pre-trained vision and language models such as CLIP. However, existing prompt tuning methods tend to learn spurious or entangled representations, which leads to poor generalization to unseen concepts. Towards non-spurious and efficient prompt learning from limited examples, this paper presents a novel \underline{\textbf{C}}ounterfactual \underline{\textbf{P}}rompt \underline{\textbf{L}}earning (CPL) method for vision and language models, which simultaneously employs counterfactual generation and contrastive learning in a joint optimization framework. Particularly, CPL constructs counterfactual by identifying minimal non-spurious feature change between semantically-similar positive and negative samples that causes concept change, and learns more generalizable prompt representation from both factual and counterfactual examples via contrastive learning. Extensive experiments demonstrate that CPL can obtain superior few-shot performance on different vision and language tasks than previous prompt tuning methods on CLIP. On image classification, we achieve 3.55\% average relative improvement on unseen classes across seven datasets; on image-text retrieval and visual question answering, we gain up to 4.09\% and 25.08\% relative improvements across three few-shot scenarios on unseen test sets respectively.
translated by 谷歌翻译
在对比学习中,最近的进步表现出了出色的表现。但是,绝大多数方法仅限于封闭世界的环境。在本文中,我们通过挖掘开放世界的环境来丰富表示学习的景观,其中新颖阶级的未标记样本自然可以在野外出现。为了弥合差距,我们引入了一个新的学习框架,开放世界的对比学习(Opencon)。Opencon应对已知和新颖阶级学习紧凑的表现的挑战,并促进了一路上的新颖性发现。我们证明了Opencon在挑战基准数据集中的有效性并建立竞争性能。在Imagenet数据集上,Opencon在新颖和总体分类精度上分别胜过当前最佳方法的最佳方法,分别胜过11.9%和7.4%。我们希望我们的工作能为未来的工作打开新的大门,以解决这一重要问题。
translated by 谷歌翻译
Continual learning (CL) learns a sequence of tasks incrementally. There are two popular CL settings, class incremental learning (CIL) and task incremental learning (TIL). A major challenge of CL is catastrophic forgetting (CF). While a number of techniques are already available to effectively overcome CF for TIL, CIL remains to be highly challenging. So far, little theoretical study has been done to provide a principled guidance on how to solve the CIL problem. This paper performs such a study. It first shows that probabilistically, the CIL problem can be decomposed into two sub-problems: Within-task Prediction (WP) and Task-id Prediction (TP). It further proves that TP is correlated with out-of-distribution (OOD) detection, which connects CIL and OOD detection. The key conclusion of this study is that regardless of whether WP and TP or OOD detection are defined explicitly or implicitly by a CIL algorithm, good WP and good TP or OOD detection are necessary and sufficient for good CIL performances. Additionally, TIL is simply WP. Based on the theoretical result, new CIL methods are also designed, which outperform strong baselines in both CIL and TIL settings by a large margin.
translated by 谷歌翻译
在过去的几年中,关于分类,检测和分割问题的3D学习领域取得了重大进展。现有的绝大多数研究都集中在规范的封闭式条件上,忽略了现实世界的内在开放性。这限制了需要管理新颖和未知信号的自主系统的能力。在这种情况下,利用3D数据可以是有价值的资产,因为它传达了有关感应物体和场景几何形状的丰富信息。本文提供了关于开放式3D学习的首次广泛研究。我们介绍了一种新颖的测试床,其设置在类别语义转移方面的难度增加,并且涵盖了内域(合成之间)和跨域(合成对真实)场景。此外,我们研究了相关的分布情况,并开放了2D文献,以了解其最新方法是否以及如何在3D数据上有效。我们广泛的基准测试在同一连贯的图片中定位了几种算法,从而揭示了它们的优势和局限性。我们的分析结果可能是未来量身定制的开放式3D模型的可靠立足点。
translated by 谷歌翻译
预训练的视觉模型(例如,剪辑)在许多下游任务中显示出有希望的零弹性概括,并具有正确设计的文本提示。最近的作品不依赖手工设计的提示,而是使用下游任务的培训数据来学习提示。虽然有效,但针对领域数据的培训却降低了模型的概括能力,使其无法看到新领域。在这项工作中,我们提出了测试时间提示调整(TPT),该方法可以通过单个测试样本即时学习自适应提示。对于图像分类,TPT通过使用置信度选择最小化熵来优化提示,以便模型在每个测试样本的不同增强视图上都具有一致的预测。在评估对自然分布变化的概括时,TPT平均将零击的TOP-1精度提高了3.6%,超过了先前需要其他特定于任务的训练数据的迅速调整方法。在评估看不见类别的跨数据集泛化时,TPT与使用其他培训数据的最先进方法相当。项目页面:https://azshue.github.io/tpt。
translated by 谷歌翻译
分布(OOD)检测对于确保机器学习系统的可靠性和安全性至关重要。例如,在自动驾驶中,我们希望驾驶系统在发现在训练时间中从未见过的异常​​场景或对象时,发出警报并将控件移交给人类,并且无法做出安全的决定。该术语《 OOD检测》于2017年首次出现,此后引起了研究界的越来越多的关注,从而导致了大量开发的方法,从基于分类到基于密度到基于距离的方法。同时,其他几个问题,包括异常检测(AD),新颖性检测(ND),开放式识别(OSR)和离群检测(OD)(OD),在动机和方法方面与OOD检测密切相关。尽管有共同的目标,但这些主题是孤立发展的,它们在定义和问题设定方面的细微差异通常会使读者和从业者感到困惑。在这项调查中,我们首先提出一个称为广义OOD检测的统一框架,该框架涵盖了上述五个问题,即AD,ND,OSR,OOD检测和OD。在我们的框架下,这五个问题可以看作是特殊情况或子任务,并且更容易区分。然后,我们通过总结了他们最近的技术发展来审查这五个领域中的每一个,特别关注OOD检测方法。我们以公开挑战和潜在的研究方向结束了这项调查。
translated by 谷歌翻译
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small-and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
translated by 谷歌翻译