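Deep neural networks trained with the standard cross-entropy loss tend to memorize noisy labels, which degrades their performance. Negative learning with complementary labels is more robust when noisy labels intervene, but the model converges extremely slowly. In this paper, we first introduce a bidirectional learning scheme, in which positive learning ensures convergence speed while negative learning copes robustly with label noise. In addition, a dynamic sample reweighting strategy is proposed to weaken the influence of noisily labeled samples by exploiting the excellent ability of negative learning to discriminate the sample probability distribution. Furthermore, we incorporate self-distillation to further improve model performance. The code is available at https://github.com/chenchenzong/bldr.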
translated by 谷歌翻译
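The contrast between positive and negative learning in the first abstract above can be illustrated with a small sketch. It assumes the usual formulation in which positive learning minimizes `-log p[y]` for the given label and negative learning minimizes `-log(1 - p[c])` for a complementary label `c`; the function names and the uniform sampling of complementary labels are illustrative assumptions, not taken from the paper's code.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def positive_loss(probs, label, eps=1e-12):
    """Standard cross-entropy: push probability mass towards `label`."""
    return -math.log(probs[label] + eps)


def negative_loss(probs, complementary_label, eps=1e-12):
    """Negative learning: push probability mass away from a class
    the sample is assumed NOT to belong to."""
    return -math.log(1.0 - probs[complementary_label] + eps)


def sample_complementary_label(label, num_classes, rng):
    """Assumption: draw a complementary label uniformly from the other classes."""
    return rng.choice([c for c in range(num_classes) if c != label])


rng = random.Random(0)
probs = softmax([2.0, 0.5, -1.0])
comp = sample_complementary_label(label=1, num_classes=3, rng=rng)
print(positive_loss(probs, 1), negative_loss(probs, comp))
```

If the given label is wrong, a uniformly drawn complementary label is still correct with high probability, which is one intuition for why the negative term is less sensitive to noise.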
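The ability to train deep neural networks under label noise is appealing, as imperfectly annotated data are relatively cheap to obtain. State-of-the-art approaches are based on semi-supervised learning (SSL), which selects small-loss examples as clean and then applies SSL techniques to boost performance. However, the selection step mostly provides a medium-sized clean subset, which overlooks a rich set of clean samples. In this work, we propose a novel noisy-label learning framework, ProMix, that attempts to maximize the utility of clean samples for boosted performance. Key to our method, we propose a matched high-confidence selection technique that selects those examples that have high confidence and whose predictions match the given labels. Combined with small-loss selection, our method achieves a precision of 99.27 and a recall of 98.22 in detecting clean samples on the CIFAR-10N dataset. Based on such a large set of clean data, ProMix improves the best baseline method by +2.67% on CIFAR-10N and by +1.61% on CIFAR-100N. Code and data are available at https://github.com/justherozen/promix.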
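A minimal sketch of the selection rule described in this abstract: a sample counts as clean if the prediction matches the given label with high confidence, or if its loss is small. The threshold values, the function name, and the plain union of the two criteria are assumptions made for illustration.

```python
def select_clean(probs, labels, losses, conf_threshold=0.99, loss_threshold=0.5):
    """Return indices of samples treated as clean.

    probs  : list of per-sample class-probability lists
    labels : list of given (possibly noisy) integer labels
    losses : list of per-sample training losses
    Both thresholds are hypothetical values chosen for this sketch.
    """
    selected = []
    for i, (p, y, loss) in enumerate(zip(probs, labels, losses)):
        pred = max(range(len(p)), key=p.__getitem__)
        # Matched high-confidence: prediction agrees with the given label
        # and the model is very sure about it.
        matched_high_conf = pred == y and p[pred] >= conf_threshold
        # Small-loss criterion, as used by earlier selection methods.
        small_loss = loss <= loss_threshold
        if matched_high_conf or small_loss:
            selected.append(i)
    return selected


probs = [[0.995, 0.005], [0.6, 0.4], [0.01, 0.99], [0.995, 0.005]]
labels = [0, 0, 1, 1]
losses = [0.01, 0.3, 0.01, 5.3]
print(select_clean(probs, labels, losses))
```

The last sample is confidently predicted as class 0 but labeled 1, so neither criterion accepts it.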
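Imperfect labels are ubiquitous in real-world datasets and seriously harm model performance. Several recent effective methods for handling noisy labels have two key steps: 1) dividing samples by training loss, and 2) using semi-supervised methods to generate pseudo-labels for the samples in the wrongly labeled set. However, because hard samples and noisy samples have similar loss distributions, current methods always hurt the informative hard samples. In this paper, we propose PGDF (Prior Guided Denoising Framework), a novel framework that learns a deep model to suppress noise by generating prior knowledge about the samples, which is integrated into both the sample-dividing step and the semi-supervised step. Our framework can save more informative hard clean samples into the cleanly labeled set. In addition, our framework improves the quality of pseudo-labels during the semi-supervised step by suppressing the noise in the current pseudo-label generation scheme. To further enhance the hard samples, we reweight the samples in the cleanly labeled set during training. We evaluated our method using synthetic datasets based on CIFAR-10 and CIFAR-100, as well as the real-world datasets WebVision and Clothing1M. The results demonstrate substantial improvements over state-of-the-art methods.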
Annotating a dataset with high-quality labels is crucial for the performance of deep networks, but in real-world scenarios the labels are often contaminated by noise. To address this, some methods have been proposed to automatically split clean and noisy labels and to train a semi-supervised learner in a Learning with Noisy Labels (LNL) framework. However, they rely on a handcrafted module for clean-noisy label splitting, which induces a confirmation bias in the semi-supervised learning phase and limits performance. In this paper, we present, for the first time, a learnable module for clean-noisy label splitting, dubbed SplitNet, and a novel LNL framework that trains SplitNet and the main network complementarily for the LNL task. We propose to use a dynamic threshold based on the split confidence produced by SplitNet to better optimize the semi-supervised learner. To enhance SplitNet training, we also present a risk hedging method. Our proposed method performs at a state-of-the-art level, especially in high-noise-ratio settings, on various LNL benchmarks.
Deep neural networks (DNNs) trained on large-scale datasets have exhibited significant performance in image classification. Many large-scale datasets are collected from websites; however, they tend to contain inaccurate labels, termed noisy labels. Training on such noisily labeled datasets causes performance degradation because DNNs easily overfit to noisy labels. To overcome this problem, we propose a joint optimization framework for learning DNN parameters and estimating true labels. Our framework can correct labels during training by alternately updating network parameters and labels. We conduct experiments on the noisy CIFAR-10 datasets and the Clothing1M dataset. The results indicate that our approach significantly outperforms other state-of-the-art methods.
Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there is a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training to allow online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms the recent state of the art. Source code is available at https://git.io/fjsvE.
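The loss correction mentioned in this abstract can be sketched as a bootstrapping loss whose target is a convex combination of the given label and the network's own prediction. In the sketch below the per-sample mislabelling probability `p_mislabelled` is simply passed in; in the abstract it comes from a beta mixture fitted to the loss values, which is not reproduced here. The function name and signature are assumptions for illustration.

```python
import math


def dynamic_bootstrap_loss(probs, label, p_mislabelled, hard=False, eps=1e-12):
    """Cross-entropy against a target that mixes the given label with the
    network prediction, weighted by the estimated mislabelling probability.

    probs         : predicted class probabilities for one sample
    label         : given (possibly noisy) integer label
    p_mislabelled : assumed to be supplied by a mixture model on the losses
    hard          : use the argmax prediction (one-hot) instead of probs
    """
    num_classes = len(probs)
    predicted = max(range(num_classes), key=probs.__getitem__)
    loss = 0.0
    for k in range(num_classes):
        given = 1.0 if k == label else 0.0
        guess = (1.0 if k == predicted else 0.0) if hard else probs[k]
        target = (1.0 - p_mislabelled) * given + p_mislabelled * guess
        loss -= target * math.log(probs[k] + eps)
    return loss


probs = [0.7, 0.2, 0.1]
print(dynamic_bootstrap_loss(probs, label=1, p_mislabelled=0.0))
print(dynamic_bootstrap_loss(probs, label=1, p_mislabelled=1.0, hard=True))
```

With `p_mislabelled = 0` the loss reduces to ordinary cross-entropy on the given label; with `p_mislabelled = 1` the given label is ignored entirely.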
Deep neural networks are known to be annotation-hungry. Numerous efforts have been devoted to reducing the annotation cost when learning with deep networks. Two prominent directions include learning with noisy labels and semi-supervised learning by exploiting unlabeled data. In this work, we propose DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques. In particular, DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples, and trains the model on both the labeled and unlabeled data in a semi-supervised manner. To avoid confirmation bias, we simultaneously train two diverged networks where each network uses the dataset division from the other network. During the semi-supervised training phase, we improve the MixMatch strategy by performing label co-refinement and label co-guessing on labeled and unlabeled samples, respectively. Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods. Code is available at https://github.com/LiJunnan1992/DivideMix.
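The data division step described in this abstract (fit a mixture model to per-sample losses, then treat the low-loss component as clean) can be sketched as follows. This is a dependency-free, one-dimensional, two-component Gaussian mixture fitted with EM; the choice of Gaussian components, the initialization, the iteration count, and the 0.5 cutoff are assumptions for illustration rather than the authors' implementation.

```python
import math


def clean_probabilities(losses, iters=50, var_floor=1e-4):
    """Fit a two-component 1-D Gaussian mixture to per-sample losses with EM
    and return, for each sample, the posterior of the low-mean component."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]                       # initialize at the extremes
    var0 = max((hi - lo) ** 2 / 4.0, var_floor)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each loss value.
        resp = []
        for x in losses:
            dens = [
                weights[k]
                * math.exp(-((x - means[k]) ** 2) / (2.0 * variances[k]))
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(var, var_floor)
            weights[k] = nk / len(losses)
    clean = 0 if means[0] <= means[1] else 1
    return [r[clean] for r in resp]


losses = [0.05, 0.10, 0.08, 0.12, 2.0, 2.3, 1.9, 2.5]
p_clean = clean_probabilities(losses)
labeled_set = [i for i, p in enumerate(p_clean) if p > 0.5]
print(labeled_set)
```

Samples outside `labeled_set` would be treated as unlabeled in the semi-supervised phase; in the two-network setup described above, each network would use the division produced by the other.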
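Robust learning on noisily labeled data is an important task in real applications, because label noise directly leads to poor generalization of deep learning models. Existing label-noise learning methods usually assume that the ground-truth classes of the training data are balanced. However, real-world data are often imbalanced, which leads to an inconsistency between the observed and the intrinsic class distributions caused by label noise. This distribution inconsistency makes label-noise learning more challenging, because it is hard to distinguish clean samples from noisy samples in the intrinsic tail classes. In this paper, we propose a learning framework for label-noise learning with intrinsically long-tailed data. Specifically, we propose a robust sample selection method called two-stage bi-dimensional sample selection (TBSS) to better separate clean samples from noisy samples, especially for the tail classes. TBSS consists of two new separation metrics that jointly separate the samples in each class. Extensive experiments on multiple noisily labeled datasets with intrinsically long-tailed class distributions demonstrate the effectiveness of our method.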
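Learning with noisy labels has attracted great research interest, since data annotation, especially for large-scale datasets, may be inevitably imperfect. Recent approaches recast the task as a semi-supervised learning problem by dividing the training samples into clean and noisy sets. This paradigm, however, is prone to significant degeneration under heavy label noise, because the number of clean samples is too small for conventional methods to work well. In this paper, we introduce a novel framework, termed LC-Booster, to explicitly tackle learning under extreme noise. The core idea of LC-Booster is to incorporate label correction into sample selection, so that more purified samples, obtained through reliable label correction, can be used for training, thereby alleviating the confirmation bias. Experiments show that LC-Booster advances state-of-the-art results on several noisy-label benchmarks, including CIFAR-10, CIFAR-100, Clothing1M and WebVision. Remarkably, under the extreme 90% noise ratio, LC-Booster achieves 92.9% and 48.4% accuracy on CIFAR-10 and CIFAR-100, surpassing state-of-the-art methods by a large margin.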
Training accurate deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task. Though a number of approaches have been proposed for learning with noisy labels, many open issues remain. In this paper, we show that DNN learning with Cross Entropy (CE) exhibits overfitting to noisy labels on some classes ("easy" classes), but more surprisingly, it also suffers from significant under learning on some other classes ("hard" classes). Intuitively, CE requires an extra term to facilitate learning of hard classes, and more importantly, this term should be noise tolerant, so as to avoid overfitting to noisy labels. Inspired by the symmetric KL-divergence, we propose the approach of Symmetric cross entropy Learning (SL), boosting CE symmetrically with a noise robust counterpart Reverse Cross Entropy (RCE). Our proposed SL approach simultaneously addresses both the under learning and overfitting problem of CE in the presence of noisy labels. We provide a theoretical analysis of SL and also empirically show, on a range of benchmark and real-world datasets, that SL outperforms state-of-the-art methods. We also show that SL can be easily incorporated into existing methods in order to further enhance their performance.
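A minimal sketch of the symmetric loss described in this abstract, combining cross entropy (CE) with reverse cross entropy (RCE) for a single sample with a one-hot label. The weights `alpha` and `beta` and the small floor used in place of the zero entries of the one-hot label (so that its logarithm is finite) are assumptions of this sketch; the abstract itself does not specify them.

```python
import math


def cross_entropy(probs, label, eps=1e-7):
    """CE: -sum_k q_k log p_k, which is -log p[label] for a one-hot label q."""
    return -math.log(max(probs[label], eps))


def reverse_cross_entropy(probs, label, label_floor=1e-4):
    """RCE: -sum_k p_k log q_k, with the zeros of the one-hot label q
    replaced by `label_floor` (an assumed constant) to keep log q finite."""
    total = 0.0
    for k, p in enumerate(probs):
        q = 1.0 if k == label else label_floor
        total -= p * math.log(q)
    return total


def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0):
    """Weighted sum of CE and RCE; alpha and beta are hypothetical defaults."""
    return alpha * cross_entropy(probs, label) + beta * reverse_cross_entropy(probs, label)


probs = [0.7, 0.2, 0.1]
print(symmetric_cross_entropy(probs, label=0))
```

For a one-hot label the RCE term reduces to a constant times `1 - p[label]`, so it is bounded, unlike CE, which grows without limit as `p[label]` approaches zero.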
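Training deep neural networks (DNNs) with noisy labels is challenging in practice, since inaccurate labels severely degrade the generalization ability of DNNs. Previous efforts tend to handle part or all of the data in a unified denoising flow, identifying noisy data with a coarse small-loss criterion to mitigate the interference of noisy labels, while ignoring the fact that noisy samples differ in difficulty, so a rigid and unified data selection pipeline cannot tackle this problem well. In this paper, we first propose a coarse-to-fine robust learning method called CREMA, which handles noisy data in a divide-and-conquer manner. At the coarse level, clean and noisy sets are first separated in terms of credibility in a statistical sense. Since it is practically impossible to categorize all noisy samples correctly, we further process them by modeling the credibility of each sample. Specifically, for the clean set, we deliberately design a memory-based modulation scheme that dynamically adjusts the contribution of each sample according to its historical credibility sequence during training, thus alleviating the effect of noisy samples incorrectly grouped into the clean set. Meanwhile, for samples categorized into the noisy set, a selective label update strategy is proposed to correct noisy labels while mitigating the problem of correction error. Extensive experiments are conducted on benchmarks of different modalities, including image classification (CIFAR, Clothing1M, etc.) and text recognition (IMDB), with either synthetic or natural semantic noise, demonstrating the superiority and generality of CREMA.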
Semi-supervised learning based methods are the current SOTA solutions to the noisy-label learning problem. They rely on first learning an unsupervised label cleaner to divide the training samples into a labeled set for clean data and an unlabeled set for noisy data. Typically, the cleaner is obtained by fitting a mixture model to the distribution of per-sample training losses. However, the modeling procedure is class agnostic and assumes that the loss distributions of clean and noisy samples are the same across different classes. Unfortunately, in practice, such an assumption does not always hold due to the varying learning difficulty of different classes, thus leading to sub-optimal label noise partition criteria. In this work, we reveal this long-ignored problem and propose a simple yet effective solution, named Class Prototype-based label noise Cleaner (CPC). Unlike previous works that treat all classes equally, CPC fully considers loss distribution heterogeneity and applies class-aware modulation to partition the clean and noisy data. CPC takes advantage of loss distribution modeling and intra-class consistency regularization in feature space simultaneously and thus can better distinguish clean and noisy labels. We theoretically justify the effectiveness of our method by explaining it from the Expectation-Maximization (EM) framework. Extensive experiments are conducted on the noisy-label benchmarks CIFAR-10, CIFAR-100, Clothing1M and WebVision. The results show that CPC consistently brings about performance improvement across all benchmarks. Codes and pre-trained models will be released at https://github.com/hjjpku/CPC.git.
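Sample selection is an effective strategy for mitigating the effect of label noise in robust learning. Typical strategies commonly apply the small-loss criterion to identify clean samples. However, samples lying around the decision boundary, which are usually entangled with noisy examples, would be discarded by this criterion, leading to a heavy degradation of generalization performance. In this paper, we propose a novel selection strategy, Self-Filtering (SFT), which uses the fluctuation of noisy examples in historical predictions to filter them out and thereby avoids the selection bias of the small-loss criterion for boundary examples. Specifically, we introduce a memory bank module that stores the historical predictions of each example and is dynamically updated to support selection in subsequent learning iterations. In addition, to reduce the accumulated error caused by the sample selection bias of SFT, we devise a regularization term that penalizes confident output distributions. By increasing the weight of the misclassified categories with this term, the loss function is robust to label noise under mild conditions. We conduct extensive experiments on three benchmarks with various noise types and achieve new state-of-the-art results. Ablation studies and further analysis verify the merits of SFT for sample selection in robust learning.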
The core issue in semi-supervised learning (SSL) lies in how to effectively leverage unlabeled data, whereas most existing methods tend to put a great emphasis on the utilization of high-confidence samples yet seldom fully explore the usage of low-confidence samples. In this paper, we aim to utilize low-confidence samples in a novel way with our proposed mutex-based consistency regularization, namely MutexMatch. Specifically, the high-confidence samples are required to exactly predict "what it is" by conventional True-Positive Classifier, while the low-confidence samples are employed to achieve a simpler goal -- to predict with ease "what it is not" by True-Negative Classifier. In this sense, we not only mitigate the pseudo-labeling errors but also make full use of the low-confidence unlabeled data by consistency of dissimilarity degree. MutexMatch achieves superior performance on multiple benchmark datasets, i.e., CIFAR-10, CIFAR-100, SVHN, STL-10, mini-ImageNet and Tiny-ImageNet. More importantly, our method further shows superiority when the amount of labeled data is scarce, e.g., 92.23% accuracy with only 20 labeled data on CIFAR-10. Our code and model weights have been released at https://github.com/NJUyued/MutexMatch4SSL.
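Deep learning-based histopathology image classification is a key technique for helping physicians improve the accuracy and promptness of cancer diagnosis. However, noisy labels are often inevitable in the complex manual annotation process and thus mislead the training of the classification model. In this work, we introduce a novel hard sample aware noise robust learning method for histopathology image classification. To distinguish informative hard samples from harmful noisy ones, we build an easy/hard/noisy (EHN) detection model using the sample training history. We then integrate the EHN into a self-training architecture to lower the noise rate through gradual label correction. With the resulting almost clean dataset, we further propose a noise suppressing and hard enhancing (NSHE) scheme to train the noise robust model. Compared with previous works, our method can save more clean samples and can be directly applied to real-world noisy dataset scenarios without using a clean subset. Experimental results demonstrate that the proposed scheme outperforms current state-of-the-art methods on both synthetic and real-world noisy datasets. The source code and data are available at https://github.com/bupt-ai-cz/hsa-nrl/.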
Learning with noisy labels (LNL) is a classic problem that has been extensively studied for image tasks, but much less for video. A straightforward migration from images to videos that ignores the properties of videos, such as computational cost and redundant information, is not a sound choice. In this paper, we propose two new strategies for video analysis with noisy labels: 1) a lightweight channel selection method, dubbed Channel Truncation, for feature-based label noise detection, which selects the most discriminative channels to split clean and noisy instances in each category; 2) a novel contrastive strategy, dubbed Noise Contrastive Learning, which constructs the relationship between clean and noisy instances to regularize model training. Experiments on three well-known benchmark datasets for video classification show that our proposed truNcatE-split-contrAsT (NEAT) significantly outperforms the existing baselines. By reducing the feature dimension to 10% of its original size, our method achieves a noise detection F1-score of over 0.4 and a 5% classification accuracy improvement on the Mini-Kinetics dataset under severe noise (symmetric-80%). Thanks to Noise Contrastive Learning, the average classification accuracy improvement on Mini-Kinetics and Sth-Sth-V1 is over 1.6%.
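In label-noise learning, estimating the transition matrix is a hot topic, as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean labels to noisy labels (i.e., the clean-label transition matrix (CLTM)) has been widely exploited to learn a clean-label classifier from noisy data. Motivated by the fact that classifiers mostly output Bayes optimal labels for prediction, in this paper we study directly modeling the transition from Bayes optimal labels to noisy labels (i.e., the Bayes-label transition matrix (BLTM)) and learning a classifier to predict Bayes optimal labels. Note that noisy data alone are insufficient to estimate either the CLTM or the BLTM. However, Bayes optimal labels have less uncertainty than clean labels, i.e., the class posteriors of Bayes optimal labels are one-hot vectors while those of clean labels are not. This brings two advantages for estimating the BLTM: (a) a set of examples with theoretically guaranteed Bayes optimal labels can be collected from noisy data; (b) the feasible solution space is much smaller. By exploiting these advantages, we estimate the BLTM parametrically with a deep neural network, leading to better generalization and superior classification performance.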
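The memorization effect of deep neural networks (DNNs) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which stops the optimization at an early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering the DNN as a whole. However, a DNN can be regarded as a composition of a series of layers, and we find that the latter layers of a DNN are much more sensitive to label noise, while the former layers are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affect each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and train them progressively to address this problem. Instead of early stopping, which trains the whole DNN all at once, we initially train the former DNN layers by optimizing the DNN for a relatively large number of epochs. During training, we progressively train the latter DNN layers for a smaller number of epochs, with the preceding layers fixed, to counteract the impact of noisy labels. We term the proposed method progressive early stopping (PES). Despite its simplicity, compared with early stopping, PES can help to obtain more promising and stable results. Furthermore, by combining PES with existing approaches to noisy-label training, we achieve state-of-the-art performance on image classification benchmarks.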
Learning with noisy labels is a vital topic for practical deep learning as models should be robust to noisy open-world datasets in the wild. The state-of-the-art noisy label learning approach JoCoR fails when faced with a large ratio of noisy labels. Moreover, selecting small-loss samples can also cause error accumulation as once the noisy samples are mistakenly selected as small-loss samples, they are more likely to be selected again. In this paper, we try to deal with error accumulation in noisy label learning from both model and data perspectives. We introduce mean point ensemble to utilize a more robust loss function and more information from unselected samples to reduce error accumulation from the model perspective. Furthermore, as the flip images have the same semantic meaning as the original images, we select small-loss samples according to the loss values of flip images instead of the original ones to reduce error accumulation from the data perspective. Extensive experiments on CIFAR-10, CIFAR-100, and large-scale Clothing1M show that our method outperforms state-of-the-art noisy label learning methods with different levels of label noise. Our method can also be seamlessly combined with other noisy label learning methods to further improve their performance and generalize well to other tasks. The code is available in https://github.com/zyh-uaiaaaa/MDA-noisy-label-learning.
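Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-intensive to obtain for realistic tasks. To mitigate the need for labeled data, self-training is widely used in semi-supervised learning by iteratively assigning pseudo labels to unlabeled samples. Despite its popularity, self-training is unreliable and often leads to training instability. Our experimental studies further reveal that the bias in semi-supervised learning arises both from the problem itself and from inappropriate training with potentially incorrect pseudo labels, which accumulates errors over the iterative self-training process. To reduce this bias, we propose Debiased Self-Training (DST). First, the generation and utilization of pseudo labels are decoupled by two parameter-independent classifier heads to avoid direct error accumulation. Second, we estimate the worst case of self-training bias, in which the pseudo labeling function is accurate on labeled samples yet makes as many mistakes as possible on unlabeled samples. We then optimize the representations to improve the quality of pseudo labels by avoiding the worst case. Extensive experiments show that DST achieves an average improvement of 6.3% over state-of-the-art methods on standard semi-supervised learning benchmark datasets and of 18.9% over FixMatch on 13 diverse tasks. Furthermore, DST can be seamlessly adapted to other self-training methods and helps stabilize their training and balance their performance, both when training from scratch and when fine-tuning from pre-trained models.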