Partial label learning (PLL) is a typical weakly supervised learning, where each sample is associated with a set of candidate labels. The basic assumption of PLL is that the ground-truth label must reside in the candidate set. However, this assumption may not be satisfied due to the unprofessional judgment of the annotators, thus limiting the practical application of PLL. In this paper, we relax this assumption and focus on a more general problem, noisy PLL, where the ground-truth label may not exist in the candidate set. To address this challenging problem, we further propose a novel framework called "Automatic Refinement Network (ARNet)". Our method consists of multiple rounds. In each round, we purify the noisy samples through two key modules, i.e., noisy sample detection and label correction. To guarantee the performance of these modules, we start with warm-up training and automatically select the appropriate correction epoch. Meanwhile, we exploit data augmentation to further reduce prediction errors in ARNet. Through theoretical analysis, we prove that our method is able to reduce the noise level of the dataset and eventually approximate the Bayes optimal classifier. To verify the effectiveness of ARNet, we conduct experiments on multiple benchmark datasets. Experimental results demonstrate that our ARNet is superior to existing state-of-the-art approaches in noisy PLL. Our code will be made public soon.
translated by 谷歌翻译
Partial label learning (PLL) is an important problem that allows each training example to be labeled with a coarse candidate set, which well suits many real-world data annotation scenarios with label ambiguity. Despite the promise, the performance of PLL often lags behind the supervised counterpart. In this work, we bridge the gap by addressing two key research challenges in PLL -- representation learning and label disambiguation -- in one coherent framework. Specifically, our proposed framework PiCO consists of a contrastive learning module along with a novel class prototype-based label disambiguation algorithm. PiCO produces closely aligned representations for examples from the same classes and facilitates label disambiguation. Theoretically, we show that these two components are mutually beneficial, and can be rigorously justified from an expectation-maximization (EM) algorithm perspective. Moreover, we study a challenging yet practical noisy partial label learning setup, where the ground-truth may not be included in the candidate set. To remedy this problem, we present an extension PiCO+ that performs distance-based clean sample selection and learns robust classifiers by a semi-supervised contrastive learning algorithm. Extensive experiments demonstrate that our proposed methods significantly outperform the current state-of-the-art approaches in standard and noisy PLL tasks and even achieve comparable results to fully supervised learning.
translated by 谷歌翻译
部分标签学习(PLL)是一个典型的弱监督学习框架,每个培训实例都与候选标签集相关联,其中只有一个标签是有效的。为了解决PLL问题,通常方法试图通过使用先验知识(例如培训数据的结构信息)或以自训练方式提炼模型输出来对候选人集进行歧义。不幸的是,由于在模型训练的早期阶段缺乏先前的信息或不可靠的预测,这些方法通常无法获得有利的性能。在本文中,我们提出了一个新的针对部分标签学习的框架,该框架具有元客观指导性的歧义(MOGD),该框架旨在通过在小验证集中求解元目标来从设置的候选标签中恢复地面真相标签。具体而言,为了减轻假阳性标签的负面影响,我们根据验证集的元损失重新权重。然后,分类器通过最大程度地减少加权交叉熵损失来训练。通过使用普通SGD优化器的各种深网络可以轻松实现所提出的方法。从理论上讲,我们证明了元目标的收敛属性,并得出了所提出方法的估计误差界限。在各种基准数据集和实际PLL数据集上进行的广泛实验表明,与最先进的方法相比,所提出的方法可以实现合理的性能。
translated by 谷歌翻译
使用嘈杂标签(LNL)学习旨在设计策略来通过减轻模型过度适应嘈杂标签的影响来提高模型性能和概括。 LNL的主要成功在于从大量嘈杂数据中识别尽可能多的干净样品,同时纠正错误分配的嘈杂标签。最近的进步采用了单个样品的预测标签分布来执行噪声验证和嘈杂的标签校正,很容易产生确认偏差。为了减轻此问题,我们提出了邻里集体估计,其中通过将其与其功能空间最近的邻居进行对比,重新估计了候选样本的预测性可靠性。具体而言,我们的方法分为两个步骤:1)邻域集体噪声验证,将所有训练样品分为干净或嘈杂的子集,2)邻里集体标签校正到Relabel嘈杂样品,然后使用辅助技术来帮助进一步的模型优化。 。在四个常用基准数据集(即CIFAR-10,CIFAR-100,Clothing-1M和WebVision-1.0)上进行了广泛的实验,这表明我们提出的方法非常优于最先进的方法。
translated by 谷歌翻译
最近关于使用嘈杂标签的学习的研究通过利用小型干净数据集来显示出色的性能。特别是,基于模型不可知的元学习的标签校正方法进一步提高了性能,通过纠正了嘈杂的标签。但是,标签错误矫予没有保障措施,导致不可避免的性能下降。此外,每个训练步骤都需要至少三个背部传播,显着减慢训练速度。为了缓解这些问题,我们提出了一种强大而有效的方法,可以在飞行中学习标签转换矩阵。采用转换矩阵使分类器对所有校正样本持怀疑态度,这减轻了错误的错误问题。我们还介绍了一个双头架构,以便在单个反向传播中有效地估计标签转换矩阵,使得估计的矩阵紧密地遵循由标签校正引起的移位噪声分布。广泛的实验表明,我们的方法在训练效率方面表现出比现有方法相当或更好的准确性。
translated by 谷歌翻译
在标签 - 噪声学习中,估计过渡矩阵是一个热门话题,因为矩阵在构建统计上一致的分类器中起着重要作用。传统上,从干净的标签到嘈杂的标签(即,清洁标签过渡矩阵(CLTM))已被广泛利用,以通过使用嘈杂的数据来学习干净的标签分类器。该分类器的动机主要是输出贝叶斯的最佳预测标签,在本文中,我们研究以直接建模从贝叶斯最佳标签过渡到嘈杂标签(即贝叶斯标签,贝叶斯标签,是BLTM)),并学习分类器以预测贝叶斯最佳的分类器标签。请注意,只有嘈杂的数据,它不足以估计CLTM或BLTM。但是,贝叶斯最佳标签与干净标签相比,贝叶斯最佳标签的不确定性较小,即,贝叶斯最佳标签的类后代是一热矢量,而干净标签的载体则不是。这使两个优点能够估算BLTM,即(a)一组具有理论上保证的贝叶斯最佳标签的示例可以从嘈杂的数据中收集; (b)可行的解决方案空间要小得多。通过利用优势,我们通过采用深层神经网络来估计BLTM参数,从而更好地概括和出色的分类性能。
translated by 谷歌翻译
Semi-supervised learning based methods are current SOTA solutions to the noisy-label learning problem, which rely on learning an unsupervised label cleaner first to divide the training samples into a labeled set for clean data and an unlabeled set for noise data. Typically, the cleaner is obtained via fitting a mixture model to the distribution of per-sample training losses. However, the modeling procedure is \emph{class agnostic} and assumes the loss distributions of clean and noise samples are the same across different classes. Unfortunately, in practice, such an assumption does not always hold due to the varying learning difficulty of different classes, thus leading to sub-optimal label noise partition criteria. In this work, we reveal this long-ignored problem and propose a simple yet effective solution, named \textbf{C}lass \textbf{P}rototype-based label noise \textbf{C}leaner (\textbf{CPC}). Unlike previous works treating all the classes equally, CPC fully considers loss distribution heterogeneity and applies class-aware modulation to partition the clean and noise data. CPC takes advantage of loss distribution modeling and intra-class consistency regularization in feature space simultaneously and thus can better distinguish clean and noise labels. We theoretically justify the effectiveness of our method by explaining it from the Expectation-Maximization (EM) framework. Extensive experiments are conducted on the noisy-label benchmarks CIFAR-10, CIFAR-100, Clothing1M and WebVision. The results show that CPC consistently brings about performance improvement across all benchmarks. Codes and pre-trained models will be released at \url{https://github.com/hjjpku/CPC.git}.
translated by 谷歌翻译
带有嘈杂标签的训练深神经网络(DNN)实际上是具有挑战性的,因为不准确的标签严重降低了DNN的概括能力。以前的努力倾向于通过识别带有粗糙的小损失标准来减轻嘈杂标签的干扰的嘈杂数据来处理统一的denoising流中的零件或完整数据,而忽略了嘈杂样本的困难是不同的,因此是刚性和统一的。数据选择管道无法很好地解决此问题。在本文中,我们首先提出了一种称为CREMA的粗到精细的稳健学习方法,以分裂和串扰的方式处理嘈杂的数据。在粗糙水平中,干净和嘈杂的集合首先从统计意义上就可信度分开。由于实际上不可能正确对所有嘈杂样本进行分类,因此我们通过对每个样本的可信度进行建模来进一步处理它们。具体而言,对于清洁集,我们故意设计了一种基于内存的调制方案,以动态调整每个样本在训练过程中的历史可信度顺序方面的贡献,从而减轻了错误地分组为清洁集中的嘈杂样本的效果。同时,对于分类为嘈杂集的样品,提出了选择性标签更新策略,以纠正嘈杂的标签,同时减轻校正错误的问题。广泛的实验是基于不同方式的基准,包括图像分类(CIFAR,Clothing1M等)和文本识别(IMDB),具有合成或自然语义噪声,表明CREMA的优势和普遍性。
translated by 谷歌翻译
深度学习在大量大数据的帮助下取得了众多域中的显着成功。然而,由于许多真实情景中缺乏高质量标签,数据标签的质量是一个问题。由于嘈杂的标签严重降低了深度神经网络的泛化表现,从嘈杂的标签(强大的培训)学习是在现代深度学习应用中成为一项重要任务。在本调查中,我们首先从监督的学习角度描述了与标签噪声学习的问题。接下来,我们提供62项最先进的培训方法的全面审查,所有这些培训方法都按照其方法论差异分为五个群体,其次是用于评估其优越性的六种性质的系统比较。随后,我们对噪声速率估计进行深入分析,并总结了通常使用的评估方法,包括公共噪声数据集和评估度量。最后,我们提出了几个有前途的研究方向,可以作为未来研究的指导。所有内容将在https://github.com/songhwanjun/awesome-noisy-labels提供。
translated by 谷歌翻译
对标签噪声的学习是一个至关重要的话题,可以保证深度神经网络的可靠表现。最近的研究通常是指具有模型输出概率和损失值的动态噪声建模,然后分离清洁和嘈杂的样本。这些方法取得了显着的成功。但是,与樱桃挑选的数据不同,现有方法在面对不平衡数据集时通常无法表现良好,这是现实世界中常见的情况。我们彻底研究了这一现象,并指出了两个主要问题,这些问题阻碍了性能,即\ emph {类间损耗分布差异}和\ emph {由于不确定性而引起的误导性预测}。第一个问题是现有方法通常执行类不足的噪声建模。然而,损失分布显示在类失衡下的类别之间存在显着差异,并且类不足的噪声建模很容易与少数族裔类别中的嘈杂样本和样本混淆。第二个问题是指该模型可能会因认知不确定性和不确定性而导致的误导性预测,因此仅依靠输出概率的现有方法可能无法区分自信的样本。受我们的观察启发,我们提出了一个不确定性的标签校正框架〜(ULC)来处理不平衡数据集上的标签噪声。首先,我们执行认识不确定性的班级特异性噪声建模,以识别可信赖的干净样本并精炼/丢弃高度自信的真实/损坏的标签。然后,我们在随后的学习过程中介绍了不确定性,以防止标签噪声建模过程中的噪声积累。我们对几个合成和现实世界数据集进行实验。结果证明了提出的方法的有效性,尤其是在数据集中。
translated by 谷歌翻译
在标签噪声下训练深神网络的能力很有吸引力,因为不完美的注释数据相对便宜。最先进的方法基于半监督学习(SSL),该学习选择小损失示例为清洁,然后应用SSL技术来提高性能。但是,选择步骤主要提供一个中等大小的清洁子集,该子集可俯瞰丰富的干净样品。在这项工作中,我们提出了一个新颖的嘈杂标签学习框架Promix,试图最大程度地提高清洁样品的实用性以提高性能。我们方法的关键是,我们提出了一种匹配的高信心选择技术,该技术选择了那些具有很高置信的示例,并与给定标签进行了匹配的预测。结合小损失选择,我们的方法能够达到99.27的精度,并在检测CIFAR-10N数据集上的干净样品时召回98.22。基于如此大的清洁数据,Promix将最佳基线方法提高了CIFAR-10N的 +2.67%,而CIFAR-100N数据集则提高了 +1.61%。代码和数据可从https://github.com/justherozen/promix获得
translated by 谷歌翻译
深神经网络(DNN)的记忆效果在许多最先进的标签噪声学习方法中起着枢轴作用。为了利用这一财产,通常采用早期停止训练早期优化的伎俩。目前的方法通常通过考虑整个DNN来决定早期停止点。然而,DNN可以被认为是一系列层的组成,并且发现DNN中的后一个层对标签噪声更敏感,而其前同行是非常稳健的。因此,选择整个网络的停止点可以使不同的DNN层对抗彼此影响,从而降低最终性能。在本文中,我们建议将DNN分离为不同的部位,逐步培训它们以解决这个问题。而不是早期停止,它一次列举一个整体DNN,我们最初通过用相对大量的时期优化DNN来训练前DNN层。在培训期间,我们通过使用较少数量的时期使用较少的地层来逐步培训后者DNN层,以抵消嘈杂标签的影响。我们将所提出的方法术语作为渐进式早期停止(PES)。尽管其简单性,与早期停止相比,PES可以帮助获得更有前景和稳定的结果。此外,通过将PE与现有的嘈杂标签培训相结合,我们在图像分类基准上实现了最先进的性能。
translated by 谷歌翻译
Deep Learning with noisy labels is a practically challenging problem in weakly supervised learning. The stateof-the-art approaches "Decoupling" and "Co-teaching+" claim that the "disagreement" strategy is crucial for alleviating the problem of learning with noisy labels. In this paper, we start from a different perspective and propose a robust learning paradigm called JoCoR, which aims to reduce the diversity of two networks during training. Specifically, we first use two networks to make predictions on the same mini-batch data and calculate a joint loss with Co-Regularization for each training example. Then we select small-loss examples to update the parameters of both two networks simultaneously. Trained by the joint loss, these two networks would be more and more similar due to the effect of Co-Regularization. Extensive experimental results on corrupted data from benchmark datasets including MNIST, CIFAR-10, CIFAR-100 and Clothing1M demonstrate that JoCoR is superior to many state-of-the-art approaches for learning with noisy labels.
translated by 谷歌翻译
Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there are a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training to allow online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms recent state-of-the-art. Source code is available at https://git.io/fjsvE.
translated by 谷歌翻译
自数据注释(尤其是对于大型数据集)以来,使用嘈杂的标签学习引起了很大的研究兴趣,这可能不可避免地不可避免。最近的方法通过将培训样本分为清洁和嘈杂的集合来求助于半监督的学习问题。然而,这种范式在重标签噪声下容易出现重大变性,因为干净样品的数量太小,无法进行常规方法。在本文中,我们介绍了一个新颖的框架,称为LC-Booster,以在极端噪音下明确处理学习。 LC-Booster的核心思想是将标签校正纳入样品选择中,以便可以通过可靠的标签校正来培训更纯化的样品,从而减轻确认偏差。实验表明,LC-Booster在几个嘈杂标签的基准测试中提高了最先进的结果,包括CIFAR-10,CIFAR-100,CLASTINGING 1M和WEBVISION。值得注意的是,在极端的90 \%噪声比下,LC-Booster在CIFAR-10和CIFAR-100上获得了92.9 \%和48.4 \%的精度,超过了最终方法,较大的边距就超过了最终方法。
translated by 谷歌翻译
样品选择是减轻标签噪声在鲁棒学习中的影响的有效策略。典型的策略通常应用小损失标准来识别干净的样品。但是,这些样本位于决策边界周围,通常会与嘈杂的例子纠缠在一起,这将被此标准丢弃,从而导致概括性能的严重退化。在本文中,我们提出了一种新颖的选择策略,\ textbf {s} elf- \ textbf {f} il \ textbf {t} ering(sft),它利用历史预测中嘈杂的示例的波动来过滤它们,可以过滤它们,这可以是可以过滤的。避免在边界示例中的小损失标准的选择偏置。具体来说,我们介绍了一个存储库模块,该模块存储了每个示例的历史预测,并动态更新以支持随后的学习迭代的选择。此外,为了减少SFT样本选择偏置的累积误差,我们设计了一个正规化术语来惩罚自信的输出分布。通过通过此术语增加错误分类类别的重量,损失函数在轻度条件下标记噪声是可靠的。我们对具有变化噪声类型的三个基准测试并实现了新的最先进的实验。消融研究和进一步分析验证了SFT在健壮学习中选择样本的优点。
translated by 谷歌翻译
Point cloud segmentation is a fundamental task in 3D. Despite recent progress on point cloud segmentation with the power of deep networks, current learning methods based on the clean label assumptions may fail with noisy labels. Yet, class labels are often mislabeled at both instance-level and boundary-level in real-world datasets. In this work, we take the lead in solving the instance-level label noise by proposing a Point Noise-Adaptive Learning (PNAL) framework. Compared to noise-robust methods on image tasks, our framework is noise-rate blind, to cope with the spatially variant noise rate specific to point clouds. Specifically, we propose a point-wise confidence selection to obtain reliable labels from the historical predictions of each point. A cluster-wise label correction is proposed with a voting strategy to generate the best possible label by considering the neighbor correlations. To handle boundary-level label noise, we also propose a variant ``PNAL-boundary " with a progressive boundary label cleaning strategy. Extensive experiments demonstrate its effectiveness on both synthetic and real-world noisy datasets. Even with $60\%$ symmetric noise and high-level boundary noise, our framework significantly outperforms its baselines, and is comparable to the upper bound trained on completely clean data. Moreover, we cleaned the popular real-world dataset ScanNetV2 for rigorous experiment. Our code and data is available at https://github.com/pleaseconnectwifi/PNAL.
translated by 谷歌翻译
Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes with the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly-used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. Proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results from experiments conducted with CIFAR-10, CIFAR-100 and FASHION-MNIST datasets and synthetically generated noisy labels.
translated by 谷歌翻译
标签噪声显着降低了应用中深度模型的泛化能力。有效的策略和方法,\ Texit {例如}重新加权或损失校正,旨在在训练神经网络时缓解标签噪声的负面影响。这些现有的工作通常依赖于预指定的架构并手动调整附加的超参数。在本文中,我们提出了翘曲的概率推断(WARPI),以便在元学习情景中自适应地整理分类网络的培训程序。与确定性模型相比,WARPI通过学习摊销元网络来制定为分层概率模型,这可以解决样本模糊性,因此对严格的标签噪声更加坚固。与直接生成损耗的重量值的现有近似加权功能不同,我们的元网络被学习以估计从登录和标签的输入来估计整流向量,这具有利用躺在它们中的足够信息的能力。这提供了纠正分类网络的学习过程的有效方法,证明了泛化能力的显着提高。此外,可以将整流载体建模为潜在变量并学习元网络,可以无缝地集成到分类网络的SGD优化中。我们在嘈杂的标签上评估了四个强大学习基准的Warpi,并在变体噪声类型下实现了新的最先进的。广泛的研究和分析还展示了我们模型的有效性。
translated by 谷歌翻译
深神经网络(DNN)的记忆效应在最近的标签噪声学习方法中起关键作用。为了利用这种效果,已经广泛采用了基于模型预测的方法,该方法旨在利用DNN在学习的早期阶段以纠正嘈杂标签的效果。但是,我们观察到该模型在标签预测期间会犯错误,从而导致性能不令人满意。相比之下,在学习早期阶段产生的特征表现出更好的鲁棒性。受到这一观察的启发,在本文中,我们提出了一种基于特征嵌入的新方法,用于用标签噪声,称为标签NoissiLution(Lend)。要具体而言,我们首先根据当前的嵌入式特征计算一个相似性矩阵,以捕获训练数据的局部结构。然后,附近标记的数据(\ textIt {i.e。},标签噪声稀释)使错误标记的数据携带的嘈杂的监督信号淹没了,其有效性是由特征嵌入的固有鲁棒性保证的。最后,带有稀释标签的培训数据进一步用于培训强大的分类器。从经验上讲,我们通过将我们的贷款与几种代表性的强大学习方法进行比较,对合成和现实世界嘈杂数据集进行了广泛的实验。结果验证了我们贷款的有效性。
translated by 谷歌翻译