Partial-label learning is a weakly supervised learning problem in which each training instance is associated with a set of candidate labels, only one of which is correct. In this paper, we introduce a novel probabilistic approach to this problem that has at least three advantages over existing methods: it simplifies the training process, improves performance, and can be applied to any deep architecture. Experiments on artificial and real-world datasets show that the proposed method outperforms existing approaches.
Partial-label learning (PLL) is a typical weakly supervised learning framework in which each training instance is associated with a candidate label set, among which only one label is valid. To tackle the PLL problem, conventional methods attempt to disambiguate the candidate set either by exploiting prior knowledge, such as structural information of the training data, or by refining the model outputs in a self-training manner. Unfortunately, these methods often fail to obtain favorable performance due to the lack of prior information or unreliable predictions in the early stages of model training. In this paper, we propose a new framework for partial-label learning with Meta-Objective-Guided Disambiguation (MOGD), which aims to recover the ground-truth label from the candidate label set by solving a meta objective on a small validation set. Specifically, to alleviate the negative influence of false-positive labels, we reweight each candidate label based on the meta loss on the validation set. The classifier is then trained by minimizing the weighted cross-entropy loss. The proposed method can easily be implemented with various deep networks using an ordinary SGD optimizer. Theoretically, we prove the convergence property of the meta objective and derive an estimation error bound for the proposed method. Extensive experiments on a variety of benchmark datasets and real-world PLL datasets demonstrate that the proposed method achieves competitive performance compared with state-of-the-art methods.
Partial-label learning is a weakly supervised learning problem with inexact labels, where for each training example we are given a set of candidate labels instead of the one true label. Recently, various approaches to partial-label learning have been proposed under different generation models of the candidate label sets. However, these methods require relatively strong distributional assumptions on the generation model; when the assumptions do not hold, the performance of the methods is not guaranteed theoretically. In this paper, we propose the notion of properness for partial labels. We show that this proper partial-label learning framework includes many previous partial-label learning settings as special cases. We then derive a unified unbiased estimator of the classification risk. We prove that our estimator is risk-consistent by obtaining its estimation error bound. Finally, we validate the effectiveness of our algorithm through experiments.
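As a point of reference for the unified risk estimator above, the simplest partial-label surrogate just averages the loss over the candidate set. The sketch below is a naive illustration of the PLL setup only (the function name and numbers are ours), not the paper's risk-rewritten estimator:

```python
import numpy as np

def candidate_average_loss(probs, candidates):
    """Naive PLL surrogate: average cross-entropy over the candidate set.

    probs: (K,) predicted class probabilities for one instance.
    candidates: candidate label indices (exactly one is the true label).
    """
    eps = 1e-12
    return -np.mean([np.log(probs[c] + eps) for c in candidates])

probs = np.array([0.7, 0.2, 0.05, 0.05])
# Candidate set {0, 1}: the loss averages over both candidates.
loss = candidate_average_loss(probs, [0, 1])
```

A model that concentrates mass on any candidate lowers this surrogate, which is why such naive averaging can over-reward false-positive candidates, the failure mode that proper, unbiased estimators are designed to avoid.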
Complementary-label learning (CLL) is a common setting in weakly supervised learning. In real-world datasets, however, CLL faces class-imbalanced training samples, where the number of samples in one class is significantly lower than in other classes. Unfortunately, existing CLL approaches have not explored the problem of class-imbalanced samples, which degrades prediction accuracy, especially on the imbalanced classes. In this paper, we propose a novel problem setting that allows learning from class-imbalanced complementarily labeled samples for multi-class classification. To solve this new problem, we propose a new CLL approach called Weighted Complementary-Label Learning (WCLL). The proposed method models a weighted empirical risk loss by exploiting the class-imbalanced complementarily labeled information, which is also applicable to multi-class imbalanced training samples. In addition, an estimation error bound of the proposed method is derived to provide theoretical guarantees. Finally, we conduct extensive experiments on widely used benchmark datasets to validate the superiority of our method by comparing it with existing state-of-the-art methods.
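For intuition, here is a minimal sketch of weighting a complementary-label loss by class, assuming a complementary label simply means "not this class" and using hypothetical inverse-frequency weights. It illustrates the weighting idea only, not necessarily WCLL's exact empirical risk:

```python
import numpy as np

def weighted_complementary_loss(probs, comp_label, class_weights):
    """Penalize probability mass on the complementary (known-wrong) class,
    scaled by a per-class weight, e.g., an inverse class frequency.

    probs: (K,) predicted probabilities; comp_label: index known to be wrong.
    """
    eps = 1e-12
    # -log of the probability NOT assigned to the complementary class.
    return -class_weights[comp_label] * np.log(1.0 - probs[comp_label] + eps)

probs = np.array([0.6, 0.3, 0.1])
weights = np.array([1.0, 2.0, 4.0])  # hypothetical inverse-frequency weights
loss_common = weighted_complementary_loss(probs, 1, weights)
loss_rare = weighted_complementary_loss(probs, 2, weights)
```

The per-class weights compensate for the fact that, under imbalance, complementary labels of rare classes carry disproportionately little gradient signal.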
Recent studies on learning with noisy labels have shown remarkable performance by exploiting a small clean dataset. In particular, label-correction methods based on model-agnostic meta-learning further improve performance by correcting noisy labels on the fly. However, there is no safeguard against label miscorrection, which leads to unavoidable performance degradation. Moreover, every training step requires at least three back-propagations, which significantly slows down training. To mitigate these issues, we propose a robust and efficient method that learns a label transition matrix on the fly. Employing the transition matrix makes the classifier skeptical about all the corrected samples, which alleviates the miscorrection issue. We also introduce a two-head architecture to efficiently estimate the label transition matrix within a single back-propagation, so that the estimated matrix closely follows the shifting noise distribution induced by label correction. Extensive experiments demonstrate that our method achieves accuracy comparable to or better than existing methods while training far more efficiently.
Deep learning has achieved remarkable success in numerous domains with the help of massive amounts of big data. However, the quality of data labels is a concern because high-quality labels are lacking in many real-world scenarios. As noisy labels severely degrade the generalization performance of deep neural networks, learning from noisy labels (robust training) has become an important task in modern deep learning applications. In this survey, we first describe the problem of learning with label noise from a supervised learning perspective. Next, we provide a comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological differences, followed by a systematic comparison of six properties used to evaluate their superiority. Subsequently, we perform an in-depth analysis of noise-rate estimation and summarize the typically used evaluation methodology, including public noisy datasets and evaluation metrics. Finally, we present several promising research directions that can serve as a guideline for future studies. All contents will be available at https://github.com/songhwanjun/awesome-noisy-labels.
A complementary label (CL) simply indicates an incorrect class of an example, yet learning with CLs yields multi-class classifiers that can predict the correct class. Unfortunately, the problem setting allows only one CL per example, which notably limits its potential, since our labelers may easily identify multiple CLs (MCLs) for one example. In this paper, we propose a novel problem setting that allows MCLs for each example, together with two ways of learning with MCLs. In the first way, we design two wrappers that decompose MCLs into many single CLs, so that any method for learning with CLs can be used. However, the supervision information that MCLs hold is conceptually diluted after decomposition. Thus, in the second way, we derive an unbiased risk estimator; minimizing it processes each set of MCLs as a whole, and it comes with an estimation error bound. We further improve the second way by minimizing a properly chosen upper bound. Experiments show that the former way works well for learning with MCLs, but the latter is even better.
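The first of the two ways above, the decomposition wrapper, can be sketched as follows (a minimal illustration; the function name is ours):

```python
def decompose_mcl(dataset):
    """Decompose multiple complementary labels (MCLs) into single CLs.

    dataset: list of (instance, set_of_complementary_labels) pairs.
    Returns a list of (instance, single_complementary_label) pairs, so any
    ordinary complementary-label method can consume the result.
    """
    return [(x, cl) for x, mcl in dataset for cl in mcl]

data = [("x1", {0, 2}), ("x2", {1})]
singles = decompose_mcl(data)
```

After decomposition, the fact that the CLs in one set jointly exclude several classes of the *same* instance is lost, which is exactly the dilution the second (risk-estimator) approach avoids by treating each MCL set as a whole.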
Many datasets are biased, namely they contain easy-to-learn features that are highly correlated with the target class only in the dataset but not in the true underlying distribution of the data. For this reason, learning unbiased models from biased data has become a very relevant research topic in recent years. In this work, we tackle the problem of learning representations that are robust to biases. We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when dealing with biased data. Based on that, we derive a novel formulation of the supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples. Furthermore, thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss, that works well even with extremely biased data. We validate the proposed losses on standard vision datasets including CIFAR10, CIFAR100, and ImageNet, and we assess the debiasing capability of FairKL with epsilon-SupInfoNCE, reaching state-of-the-art performance on a number of biased datasets, including real instances of biases in the wild.
In recent years, supervised deep learning has achieved great success, with predictive models trained from massive amounts of fully labeled data. In practice, however, labeling such big data can be very costly and may even be impossible for privacy reasons. Therefore, in this paper, we aim to learn an accurate classifier without any class labels. More specifically, we consider the case where multiple sets of unlabeled data and only their class priors, i.e., the proportions of each class, are available. Under this problem setup, we first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets, and theoretically analyze the generalization error of the learned classifier. We then find that the obtained classifier tends to suffer from overfitting, as its empirical risk goes negative during training. To prevent overfitting, we further propose a partial risk regularization that keeps the partial risks with respect to the unlabeled datasets and the classes above certain levels. Experiments show that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
Partial Label (PL) learning refers to the task of learning from partially labeled data, where each training instance is ambiguously equipped with a set of candidate labels but only one is valid. Advances in the recent deep PL learning literature have shown that deep learning paradigms, e.g., self-training, contrastive learning, or class activation values, can achieve promising performance. Inspired by the impressive success of deep Semi-Supervised (SS) learning, we transform the PL learning problem into the SS learning problem, and propose a novel PL learning method, namely Partial Label learning with Semi-supervised Perspective (PLSP). Specifically, we first form the pseudo-labeled dataset by selecting a small number of reliable pseudo-labeled instances with high-confidence prediction scores and treating the remaining instances as pseudo-unlabeled ones. Then we design a SS learning objective, consisting of a supervised loss for pseudo-labeled instances and a semantic consistency regularization for pseudo-unlabeled instances. We further introduce a complementary regularization for those non-candidate labels to constrain the model predictions on them to be as small as possible. Empirical results demonstrate that PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels. Code available: https://github.com/changchunli/PLSP.
To alleviate the data requirement for training effective binary classifiers, many weakly supervised learning settings have been proposed. Among them, some consider using pairwise rather than pointwise labels, when pointwise labels are not accessible due to privacy, confidentiality, or security reasons. However, as a pairwise label denotes whether or not two data points share the same pointwise label, it cannot be easily collected if either point is equally likely to be positive or negative. Thus, in this paper, we propose a novel setting called pairwise comparison (Pcomp) classification, where we are given only pairs of unlabeled data, for which we know that one point is more likely to be positive than the other. First, we give the Pcomp data generation process, derive an unbiased risk estimator (URE) with theoretical guarantees, and further improve the URE using correction functions. Second, we link Pcomp classification to noisy-label learning to develop a progressive URE, and improve it by imposing consistency regularization. Finally, we demonstrate the effectiveness of our methods through experiments, which suggests that Pcomp is a valuable and practically useful type of pairwise supervision besides the pairwise label.
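For intuition about what a Pcomp pair supervises, a plain logistic ranking loss on one pair (a simple baseline, not the paper's unbiased risk estimator) looks like this:

```python
import numpy as np

def pairwise_logistic_loss(score_more_positive, score_other):
    """Logistic ranking loss for a Pcomp pair (x, x'): x is known to be more
    likely positive than x', so the classifier score of x should be larger.
    This is a generic ranking baseline, not the URE derived in the paper.
    """
    margin = score_more_positive - score_other
    return np.log1p(np.exp(-margin))

good = pairwise_logistic_loss(2.0, -1.0)  # correctly ordered pair: small loss
bad = pairwise_logistic_loss(-1.0, 2.0)   # reversed pair: large loss
```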
Recent deep networks are capable of memorizing the entire data even when the labels are completely random. To overcome the overfitting on corrupted labels, we propose a novel technique of learning another neural network, called MentorNet, to supervise the training of the base deep network, namely, StudentNet. During training, MentorNet provides a curriculum (sample weighting scheme) for StudentNet to focus on samples whose labels are probably correct. Unlike existing curricula that are usually predefined by human experts, MentorNet learns a data-driven curriculum dynamically with StudentNet. Experimental results demonstrate that our approach can significantly improve the generalization performance of deep networks trained on corrupted training data. Notably, to the best of our knowledge, we achieve the best-published result on WebVision, a large benchmark containing 2.2 million images with real-world noisy labels. The code is at https://github.com/google/mentornet.
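MentorNet learns its curriculum with a network; the hand-crafted self-paced rule it generalizes, keep only the current small-loss samples, can be sketched as:

```python
import numpy as np

def self_paced_weights(losses, threshold):
    """Hard self-paced curriculum: weight 1 for samples whose current loss is
    below a threshold (labels likely correct), 0 for the rest. MentorNet
    replaces this fixed rule with a learned, data-driven weighting network
    conditioned on the StudentNet training dynamics.
    """
    return (losses < threshold).astype(float)

losses = np.array([0.1, 2.5, 0.4, 3.0])
weights = self_paced_weights(losses, threshold=1.0)
```

The threshold here plays the role of the pacing parameter that a predefined curriculum would anneal by hand.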
We present a theoretically grounded approach to train deep neural networks, including recurrent networks, subject to class-dependent label noise. We propose two procedures for loss correction that are agnostic to both application domain and network architecture. They simply amount to at most a matrix inversion and a multiplication, provided that we know the probability of each class being corrupted into another. We further show how one can estimate these probabilities, adapting a recent technique for noise estimation to the multi-class setting, and thus providing an end-to-end framework. Extensive experiments on MNIST, IMDB, CIFAR-10, CIFAR-100 and a large-scale dataset of clothing images, employing a diversity of architectures (stacking dense, convolutional, pooling, dropout, batch normalization, word embedding, LSTM and residual layers), demonstrate the noise robustness of our proposals. Incidentally, we also prove that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise.
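The matrix inversion and multiplication mentioned above can be sketched directly. Assuming a known two-class noise matrix T with T[i, j] = P(noisy label j | clean label i), forward correction composes the predictions with T before the cross-entropy, while backward correction left-multiplies the vector of per-class losses by T's inverse:

```python
import numpy as np

def forward_corrected_ce(probs, noisy_label, T):
    """Forward correction: compose predictions with the noise matrix T,
    then apply cross-entropy against the observed noisy label."""
    noisy_probs = probs @ T  # predicted distribution over *noisy* labels
    return -np.log(noisy_probs[noisy_label] + 1e-12)

def backward_corrected_ce(probs, noisy_label, T):
    """Backward correction: left-multiply the per-class CE losses by T^{-1};
    unbiased in expectation, though individual values can go negative."""
    per_class_loss = -np.log(probs + 1e-12)  # CE if the clean label were each class
    corrected = np.linalg.inv(T) @ per_class_loss
    return corrected[noisy_label]

# 20% symmetric label flipping between two classes.
T = np.array([[0.8, 0.2],
              [0.2, 0.8]])
probs = np.array([0.9, 0.1])
fwd = forward_corrected_ce(probs, 0, T)
```

A quick sanity check of the unbiasedness: averaging the backward-corrected loss over the noisy-label distribution of a clean class-0 example recovers the clean cross-entropy -log(0.9).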
Deep learning with noisy labels is a practically challenging problem in weakly supervised learning. The state-of-the-art approaches "Decoupling" and "Co-teaching+" claim that the "disagreement" strategy is crucial for alleviating the problem of learning with noisy labels. In this paper, we start from a different perspective and propose a robust learning paradigm called JoCoR, which aims to reduce the diversity of two networks during training. Specifically, we first use two networks to make predictions on the same mini-batch data and calculate a joint loss with Co-Regularization for each training example. Then we select small-loss examples to update the parameters of both networks simultaneously. Trained by the joint loss, these two networks become more and more similar due to the effect of Co-Regularization. Extensive experimental results on corrupted data from benchmark datasets including MNIST, CIFAR-10, CIFAR-100 and Clothing1M demonstrate that JoCoR is superior to many state-of-the-art approaches for learning with noisy labels.
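A per-example version of the joint loss described above, supervised cross-entropy for both networks plus a symmetric-KL co-regularizer, can be sketched as follows (the trade-off weight `lam` is an illustrative hyperparameter):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)))

def jocor_style_loss(p1, p2, label, lam=0.5):
    """Per-example joint loss: CE of both networks on the given label, plus a
    symmetric-KL co-regularizer pulling the two predictions together.
    Small-loss examples under this joint loss update both networks."""
    eps = 1e-12
    ce = -np.log(p1[label] + eps) - np.log(p2[label] + eps)
    co_reg = kl(p1, p2) + kl(p2, p1)
    return (1.0 - lam) * ce + lam * co_reg

p1 = np.array([0.8, 0.2])
p2 = np.array([0.7, 0.3])
loss = jocor_style_loss(p1, p2, label=0)
```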
Learning with noisy labels is one of the hottest problems in weakly-supervised learning. Based on the memorization effects of deep neural networks, training on small-loss instances becomes very promising for handling noisy labels. This fosters the state-of-the-art approach "Co-teaching", which cross-trains two deep neural networks using the small-loss trick. However, as epochs increase, the two networks converge to a consensus and Co-teaching reduces to the self-training MentorNet. To tackle this issue, we propose a robust learning paradigm called Co-teaching+, which bridges the "Update by Disagreement" strategy with the original Co-teaching. First, both networks feed forward and predict all data, but keep only the data on which their predictions disagree. Then, among such disagreement data, each network selects its small-loss data, but back-propagates the small-loss data from its peer network and updates its own parameters. Empirical results on benchmark datasets demonstrate that Co-teaching+ is much superior to many state-of-the-art methods in the robustness of trained models.
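The selection step described above, disagreement first, then small-loss picks that are exchanged between peers, can be sketched as:

```python
import numpy as np

def coteaching_plus_select(preds1, preds2, losses1, losses2, keep_ratio):
    """Co-teaching+ selection step: keep only examples where the two networks
    disagree, then each network picks its small-loss subset there; each
    subset is used to update the *peer* network, not the selecting one."""
    disagree = np.flatnonzero(preds1 != preds2)
    k = max(1, int(len(disagree) * keep_ratio))
    for_net2 = disagree[np.argsort(losses1[disagree])[:k]]  # net1's picks train net2
    for_net1 = disagree[np.argsort(losses2[disagree])[:k]]  # net2's picks train net1
    return for_net1, for_net2

preds1 = np.array([0, 1, 1, 2, 0])
preds2 = np.array([0, 2, 1, 0, 1])  # the networks disagree on indices 1, 3, 4
losses1 = np.array([0.1, 0.9, 0.2, 0.3, 2.0])
losses2 = np.array([0.2, 0.4, 0.1, 1.5, 0.6])
sel1, sel2 = coteaching_plus_select(preds1, preds2, losses1, losses2, keep_ratio=0.67)
```

Restricting the pool to disagreement data is what keeps the two networks from collapsing into a consensus, the failure mode of plain Co-teaching noted above.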
Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer binary labels for each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018), which considers synthetically created, labeled anomalies, we thereby use a combination of two losses that share parameters: one for the normal data and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular datasets, and a video anomaly-detection benchmark show consistent and significant improvements over the baselines.
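The label-inference half of the block coordinate update can be sketched as follows, assuming a known (or assumed) contamination ratio of the unlabeled data:

```python
import numpy as np

def infer_latent_labels(anomaly_scores, anomaly_ratio):
    """Label-inference step of the block coordinate update: given the current
    model's anomaly scores and an assumed contamination ratio, mark the
    top-scoring fraction as anomalous (1) and the rest as normal (0).
    The parameter update would then minimize the normal loss on label-0 data
    and the anomalous loss on label-1 data, and the two steps alternate."""
    n = len(anomaly_scores)
    n_anom = int(round(n * anomaly_ratio))
    labels = np.zeros(n, dtype=int)
    if n_anom > 0:
        labels[np.argsort(anomaly_scores)[-n_anom:]] = 1
    return labels

scores = np.array([0.2, 3.1, 0.5, 2.8, 0.1])
labels = infer_latent_labels(scores, anomaly_ratio=0.4)
```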
The performance of Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from degraded performance. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluated the proposed method on a curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach on the segmentation problem and showcase its effectiveness with experiments.
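A minimal sketch of attaching an entropy-based regularizer to the classifier loss; the additive form and the weight `beta` are illustrative assumptions, not necessarily the paper's exact regularizer:

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a discrete distribution."""
    return -np.sum(p * np.log(p + eps))

def entropy_regularized_loss(probs, noisy_label, beta=0.1):
    """Cross-entropy on the (possibly noisy) label plus an entropy penalty on
    the prediction, a simple instance of an information-based regularizer;
    the sign and weight of the penalty are modeling choices."""
    eps = 1e-12
    ce = -np.log(probs[noisy_label] + eps)
    return ce + beta * entropy(probs)

confident = np.array([0.9, 0.05, 0.05])
uncertain = np.array([0.4, 0.3, 0.3])
```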
In label-noise learning, estimating the transition matrix is a hot topic, as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean labels to noisy labels (i.e., the clean-label transition matrix, CLTM) has been widely exploited to learn a clean-label classifier from noisy data. Motivated by the fact that such a classifier mainly outputs the Bayes-optimal prediction, in this paper we study directly modeling the transition from Bayes-optimal labels to noisy labels (i.e., the Bayes-label transition matrix, BLTM) and learn a classifier to predict Bayes-optimal labels. Note that given only noisy data, it is ill-posed to estimate either the CLTM or the BLTM. However, favorably, Bayes-optimal labels have less uncertainty than clean labels, i.e., the class posteriors of Bayes-optimal labels are one-hot vectors, while those of clean labels are not. This enables two advantages for estimating the BLTM: (a) a set of examples with theoretically guaranteed Bayes-optimal labels can be collected from the noisy data; (b) the feasible solution space is much smaller. By exploiting these advantages, we estimate the BLTM parametrically by employing a deep neural network, leading to better generalization and superior classification performance.
Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes with the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly-used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. Proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results from experiments conducted with CIFAR-10, CIFAR-100 and FASHION-MNIST datasets and synthetically generated noisy labels.
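The proposed generalization can be written as the negative Box-Cox transform L_q(p) = (1 - p^q) / q, where p is the probability the model assigns to the labeled class; it interpolates between CCE as q approaches 0 and an MAE-like loss at q = 1:

```python
import numpy as np

def generalized_ce(p_true, q=0.7, eps=1e-12):
    """Generalized cross entropy L_q = (1 - p^q) / q for the probability
    p_true assigned to the labeled class. As q -> 0 this recovers the
    cross-entropy -log(p); at q = 1 it reduces to the MAE-like loss 1 - p."""
    return (1.0 - (p_true + eps) ** q) / q

p = 0.6
mae_like = generalized_ce(p, q=1.0)   # MAE-like endpoint: 1 - p
near_ce = generalized_ce(p, q=1e-6)   # approaches -log(p) as q -> 0
```

Intermediate q values trade the noise robustness of MAE against the faster convergence of CCE, which is the balance the paper tunes per noise level.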
Training accurate deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task. Though a number of approaches have been proposed for learning with noisy labels, many open issues remain. In this paper, we show that DNN learning with Cross Entropy (CE) exhibits overfitting to noisy labels on some classes ("easy" classes), but more surprisingly, it also suffers from significant under learning on some other classes ("hard" classes). Intuitively, CE requires an extra term to facilitate learning of hard classes, and more importantly, this term should be noise tolerant, so as to avoid overfitting to noisy labels. Inspired by the symmetric KL-divergence, we propose the approach of Symmetric cross entropy Learning (SL), boosting CE symmetrically with a noise robust counterpart Reverse Cross Entropy (RCE). Our proposed SL approach simultaneously addresses both the under learning and overfitting problem of CE in the presence of noisy labels. We provide a theoretical analysis of SL and also empirically show, on a range of benchmark and real-world datasets, that SL outperforms state-of-the-art methods. We also show that SL can be easily incorporated into existing methods in order to further enhance their performance.
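The SL loss combines CE with Reverse Cross Entropy, where RCE swaps the roles of the prediction and the one-hot label distribution and replaces log 0 with a finite constant A. A sketch with illustrative alpha, beta, and A values:

```python
import numpy as np

def symmetric_ce(probs, label, alpha=0.1, beta=1.0, log_zero=-4.0):
    """Symmetric cross entropy: alpha * CE + beta * RCE. In RCE the one-hot
    label plays the role of the 'prediction', so every non-label class
    contributes its predicted probability times log(0), clipped to the
    constant log_zero (the values of alpha, beta, and log_zero here are
    illustrative hyperparameters)."""
    eps = 1e-12
    ce = -np.log(probs[label] + eps)
    # RCE = -sum_k q(k|x) * log p(k|x) with p the one-hot label distribution:
    # only the non-label classes contribute, each with log(0) := log_zero.
    rce = -np.sum(np.delete(probs, label)) * log_zero
    return alpha * ce + beta * rce

probs = np.array([0.7, 0.2, 0.1])
loss = symmetric_ce(probs, label=0)
```

The RCE term is bounded and symmetric in spirit to the KL direction CE ignores, which is what gives SL its noise tolerance while the CE term keeps hard classes learnable.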