Although weakly-supervised techniques can reduce the labeling effort, it is unclear whether a saliency model trained with weakly-supervised data (e.g., point annotation) can achieve performance equivalent to that of its fully-supervised version. This paper attempts to answer this unexplored question by proving a hypothesis: there exists a point-labeled dataset on which trained saliency models can achieve performance equivalent to that obtained by training on the densely annotated dataset. To prove this conjecture, we propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL) method. Our contributions are three-fold: 1) Our proposed adversarial attack triggering uncertainty can overcome the overconfidence of existing active learning methods and accurately locate uncertain pixels. 2) Our proposed trajectory-ensemble uncertainty estimation method maintains the advantages of ensemble networks while significantly reducing the computational cost. 3) Our proposed relationship-aware diversity sampling algorithm can overcome oversampling while boosting performance. Experimental results show that ATAL can find such a point-labeled dataset, on which a trained saliency model obtains $97\%$ -- $99\%$ of the performance of its fully-supervised version with only ten annotated points per image.
translated by Google Translate
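Current state-of-the-art saliency detection models rely heavily on large datasets with accurate pixel-wise annotations, but manually labeling pixels is time-consuming and labor-intensive. Several weakly supervised methods have been developed to alleviate this problem, using image labels, bounding-box labels, or scribble labels, while point labels remain unexplored in this field. In this paper, we propose a novel weakly supervised salient object detection method using point supervision. To infer the saliency map, we first design an adaptive masked flood-filling algorithm to generate pseudo-labels. We then develop a transformer-based point-supervised saliency detection model to produce the first round of saliency maps. However, due to the sparseness of the labels, the weakly supervised model tends to degenerate into a general foreground detection model. To address this issue, we propose a Non-Salient Suppression (NSS) method to optimize the erroneous saliency maps generated in the first round and use them for a second round of training. Moreover, we build a new point-supervised dataset (P-DUTS) by relabeling the DUTS dataset. In P-DUTS, there is only one labeled point for each salient object. Comprehensive experiments on the five largest benchmark datasets show that our method outperforms previous state-of-the-art methods trained with stronger supervision and even surpasses several fully supervised state-of-the-art models. The code is available at https://github.com/shuyonggao/psod.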
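The pseudo-label step described in this abstract can be pictured with a minimal sketch. The abstract does not spell out the adaptive masked flood-filling algorithm, so the code below only assumes a plain 4-connected flood fill that starts at the annotated point and is blocked by a binary edge mask. The function name `masked_flood_fill` and the toy edge map are hypothetical and are not taken from the paper or its repository.

```python
from collections import deque


def masked_flood_fill(edge_mask, seed):
    """Grow a region from `seed` over 4-connected pixels, stopping at edges.

    edge_mask: list of rows; 1 marks an edge pixel (blocks the fill), 0 a free pixel.
    seed: (row, col) of the annotated point.
    Returns a grid of the same size with 1 for filled pixels and 0 elsewhere.
    """
    height, width = len(edge_mask), len(edge_mask[0])
    label = [[0] * width for _ in range(height)]
    r0, c0 = seed
    if edge_mask[r0][c0]:
        return label  # a seed placed on an edge cannot grow (assumption)
    label[r0][c0] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not edge_mask[nr][nc] and not label[nr][nc]:
                label[nr][nc] = 1
                queue.append((nr, nc))
    return label


# Toy edge map: a closed square contour around the centre pixel.
edges = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
inside_label = masked_flood_fill(edges, (2, 2))   # point inside the contour
outside_label = masked_flood_fill(edges, (0, 0))  # point outside the contour
```

In this toy case the fill from the inner point stays inside the contour (one pixel), while the fill from the corner covers the 16 pixels of the outer ring. An adaptive variant would presumably adjust the mask per image, which this sketch does not attempt.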
As an important data selection scheme, active learning has emerged as an essential component when iterating on an Artificial Intelligence (AI) model. It becomes even more critical given the dominance in applications of deep neural network based models, which have a large number of parameters and are data hungry. Despite its indispensable role in developing AI models, research on active learning is not as intensive as in other research directions. In this paper, we present a review of active learning through deep active learning approaches from the following perspectives: 1) technical advancements in active learning, 2) applications of active learning in computer vision, 3) industrial systems leveraging or with potential to leverage active learning for data iteration, 4) current limitations and future research directions. We expect this paper to clarify the significance of active learning in a modern AI model manufacturing process and to bring additional research attention to active learning. By addressing data automation challenges and coping with automated machine learning systems, active learning will facilitate democratization of AI technologies by boosting model production at scale.
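Fully supervised salient object detection (SOD) methods have made great progress, but such methods often rely on a large number of pixel-level annotations, which are time-consuming and labor-intensive to obtain. In this paper, we focus on a new weakly supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by a traditional unsupervised method and a small number of real labels. To address the label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies. In terms of the model framework, we decouple the task into a label refinement sub-task and a salient object detection sub-task, which cooperate with each other and are trained alternately. Specifically, the R-Net is designed as a two-stream encoder model equipped with a Blender with Guidance and Aggregation mechanisms (BGA), aiming to rectify the coarse labels into more reliable pseudo-labels, while the S-Net is a replaceable SOD network supervised by the pseudo-labels generated by the current R-Net. Note that only the trained S-Net is needed for testing. Moreover, to ensure the effectiveness and efficiency of network training, we design three training strategies: an alternate iteration mechanism, a group-wise incremental mechanism, and a credibility verification mechanism. Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly supervised and unsupervised methods, both qualitatively and quantitatively.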
Fully supervised salient object detection (SOD) has made considerable progress, but it relies on pixel-wise annotations that are expensive and time-consuming to collect. Recently, to relieve the labeling burden while maintaining performance, some scribble-based SOD methods have been proposed. However, learning precise boundary details from scribble annotations that lack edge information is still difficult. In this paper, we propose to learn precise boundaries from our designed synthetic images and labels without introducing any extra auxiliary data. The synthetic image creates boundary information by inserting synthetic concave regions that simulate the real concave regions of salient objects. Furthermore, we propose a novel self-consistent framework that consists of a global integral branch (GIB) and a boundary-aware branch (BAB) to train a saliency detector. GIB takes the original image as input and aims to identify integral salient objects. BAB takes the synthetic image as input and aims to help predict accurate boundaries. These two branches are connected through a self-consistent loss to guide the saliency detector to predict precise boundaries while identifying salient objects. Experimental results on five benchmarks demonstrate that our method outperforms the state-of-the-art weakly supervised SOD methods and further narrows the gap with the fully supervised methods.
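Active learning is an established technique for reducing the labeling cost of building high-quality machine learning models. A core component of active learning is the acquisition function, which determines which data should be selected for annotation. State-of-the-art acquisition functions, and more broadly active learning techniques, have been designed to maximize clean performance (e.g., accuracy) and have disregarded robustness, an important quality that has received increasing attention. Active learning therefore produces models that are accurate but not robust. In this paper, we propose robust active learning, an active learning process that integrates adversarial training, the most established method for producing robust models. Through an empirical study of 11 acquisition functions, 4 datasets, 6 DNN architectures, and 15105 trained DNNs, we show that robust active learning can produce models with robustness (accuracy on adversarial examples) ranging from 2.35% to 63.85%, whereas standard active learning systematically achieves negligible robustness (less than 0.20%). However, our study also reveals that acquisition functions that perform well on accuracy are worse than random sampling when it comes to robustness. We therefore examine the reasons behind this and devise a new acquisition function that targets both clean performance and robustness. Our acquisition function, density-based robust sampling with entropy (DRE), outperforms the other acquisition functions (including random) in terms of robustness by up to 24.40% (3.84% in particular), while remaining competitive on accuracy. Additionally, we show that DRE is applicable as a test selection metric for model retraining and stands out from all compared functions by up to 8.21% in robustness.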
Pixel-wise prediction with deep neural networks has become an effective paradigm for salient object detection (SOD) and has achieved remarkable performance. However, very few SOD models are robust against adversarial attacks, which are visually imperceptible to human visual attention. The previous work, robust saliency (ROSA), shuffles the pre-segmented superpixels and then refines the coarse saliency map with a densely connected conditional random field (CRF). Unlike ROSA, which relies on various pre- and post-processing steps, this paper proposes a light-weight Learnable Noise (LeNo) to defend SOD models against adversarial attacks. LeNo preserves the accuracy of SOD models on both adversarial and clean images, as well as their inference speed. In general, LeNo consists of a simple shallow noise and a noise estimation that are embedded in the encoder and decoder of arbitrary SOD networks, respectively. Inspired by the center prior of the human visual attention mechanism, we initialize the shallow noise with a cross-shaped Gaussian distribution for better defense against adversarial attacks. Instead of adding additional network components for post-processing, the proposed noise estimation modifies only one channel of the decoder. With deeply-supervised noise-decoupled training on state-of-the-art RGB and RGB-D SOD networks, LeNo outperforms previous works not only on adversarial images but also on clean images, which contributes stronger robustness for SOD. Our code is available at https://github.com/ssecv/LeNo.
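Human attention can intuitively adapt to corrupted regions of an image by recalling similar uncorrupted images seen before. This observation motivates us to improve attention on adversarial images by considering their clean counterparts. To this end, we introduce Associative Adversarial Learning (AAL) into adversarial learning to guide a selective attack. We formulate the intrinsic relationship between attention and attack (perturbation) as a coupling optimization problem to improve their interaction. This leads to an attention backtracking algorithm that can effectively enhance the adversarial robustness of attention. Our method is generic and can be applied to a variety of tasks by simply choosing different kernels for the associative attention that selects other regions for a specific attack. Experimental results show that the selective attack improves the model's performance. We show that our method improves recognition accuracy on ImageNet by 8.32% compared with the baseline. It also increases object detection mAP on PascalVOC by 2.02% and few-shot recognition accuracy on miniImageNet by 1.63%.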
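Fair Active Learning (FAL) uses active learning techniques to learn from limited data and to achieve fairness between sensitive groups (e.g., genders). However, FAL has not yet addressed the impact of adversarial attacks, which is vital for various safety-critical machine learning applications. Observing this, we introduce a novel task, Fair Robust Active Learning (FRAL), which integrates conventional FAL and adversarial robustness. FRAL requires ML models to leverage active learning techniques to jointly achieve equalized performance on benign data and equalized robustness against adversarial attacks between groups. In this new task, previous FAL methods generally suffer from an unbearable computational burden and from ineffectiveness. We therefore develop a simple yet effective FRAL strategy based on Joint INconsistency (JIN). To efficiently find samples whose labels can boost the performance and robustness of disadvantaged groups, our method exploits the prediction inconsistency between benign and adversarial samples as well as between standard and robust models. Extensive experiments under diverse datasets and sensitive groups show that our method not only achieves fairer performance on benign samples but also obtains fairer robustness under white-box PGD attacks compared with existing active learning and FAL baselines. We are optimistic that FRAL will pave the way for developing safe and robust ML research and applications, such as facial attribute recognition in biometric systems.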
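Object detectors trained with weak annotations are affordable alternatives to fully supervised counterparts. However, there is still a significant performance gap between them. We propose to narrow this gap by fine-tuning a pre-trained weakly supervised detector with a few fully annotated samples automatically selected from the training set using "box-in-box" (BiB), a novel active learning strategy designed specifically to address the well-documented failure modes of weakly supervised detectors. Experiments on the VOC07 and COCO benchmarks show that BiB outperforms other active learning techniques and significantly improves the performance of the base weakly supervised detector with only a few fully annotated images per class. BiB reaches 97% of the performance of a fully supervised Fast RCNN with only 10% of fully annotated images on VOC07. On COCO, using on average 10 fully annotated images per class, or equivalently 1% of the training set, BiB also reduces the performance gap (in AP) between the weakly supervised detector and the fully supervised Fast RCNN by over 70%, showing a good trade-off between performance and data efficiency. Our code is publicly available at https://github.com/huyvvo/bib.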
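The requirement of expensive annotations is a significant burden for training a well-performing instance segmentation model. In this paper, we propose an economic active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks whether it falls on the object. The key to APIS is finding the most desirable points to maximize segmentation accuracy with a limited annotation budget. We formulate this setting and propose several uncertainty-based sampling strategies. Models developed with these strategies yield consistent performance gains on the challenging MS-COCO dataset compared with other learning strategies. The results suggest that APIS, which integrates the advantages of active learning and point-based supervision, is an effective learning paradigm for label-efficient instance segmentation.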
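Since preparing labeled data for training semantic segmentation networks on point clouds is a time-consuming process, weakly supervised approaches have been introduced to learn from only a small fraction of the data. These methods are typically based on learning with a contrastive loss while automatically deriving per-point pseudo-labels from a sparse set of user-annotated labels. In this paper, our key observation is that the selection of which samples to annotate is as important as how these samples are used for training. We therefore introduce a method for weakly supervised segmentation of 3D scenes that combines self-training with active learning. Active learning selects points for annotation that are likely to yield performance improvements in the trained model, while self-training makes efficient use of the user-provided labels for learning the model. We show that our approach leads to an effective method that improves scene segmentation over previous works and baselines while requiring only a small number of user annotations.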
While deep learning succeeds in a wide range of tasks, it depends heavily on massive collections of annotated data, which are expensive and time-consuming to obtain. To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incur a high loss. The core of our approach is a measurement, Temporal Output Discrepancy (TOD), that estimates the sample loss by evaluating the discrepancy of outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss and can thus be used to select informative unlabeled samples. On the basis of TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion for active learning. Due to the simplicity of TOD, our methods are efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach achieves better performance than state-of-the-art active learning methods on image classification and semantic segmentation tasks. In addition, we show that TOD can be used to select, from a pool of candidate models, the model with potentially the highest testing accuracy.
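The idea of scoring a sample by the discrepancy between model outputs at two optimization steps can be illustrated with a small sketch. The abstract does not state which distance is used, so the code assumes a Euclidean distance between output vectors. The names `temporal_output_discrepancy` and `select_for_annotation`, and the toy linear "checkpoints", are hypothetical stand-ins rather than the authors' implementation.

```python
import math


def temporal_output_discrepancy(model_early, model_late, x):
    """Euclidean distance between the outputs of two checkpoints on sample x.

    model_early, model_late: callables mapping a sample to a list of floats,
    standing in for the same network at two different optimization steps.
    """
    out_early = model_early(x)
    out_late = model_late(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(out_early, out_late)))


def select_for_annotation(model_early, model_late, pool, budget):
    """Return the `budget` unlabeled samples with the largest discrepancy."""
    ranked = sorted(
        pool,
        key=lambda x: temporal_output_discrepancy(model_early, model_late, x),
        reverse=True,
    )
    return ranked[:budget]


def make_linear(weight):
    """Toy 'checkpoint': a linear map from a scalar to a 2-vector."""
    return lambda x: [weight * x, -weight * x]


early = make_linear(1.0)  # checkpoint at an earlier step (toy)
late = make_linear(1.5)   # checkpoint at a later step (toy)
pool = [0.1, 2.0, -3.0, 0.5]
picked = select_for_annotation(early, late, pool, budget=2)
```

With these toy checkpoints the discrepancy grows with the magnitude of the input, so the two selected samples are `-3.0` and `2.0`. No labels are needed to compute the score, which is what makes it usable on the unlabeled pool.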
Although deep learning has made remarkable progress in processing various types of data such as images, text, and speech, deep learning models are known to be susceptible to adversarial perturbations: perturbations specifically designed and added to the input to make the target model produce erroneous output. Most existing studies on generating adversarial perturbations attempt to perturb the entire input indiscriminately. In this paper, we propose ExploreADV, a general and flexible adversarial attack system that is capable of modeling regional and imperceptible attacks, allowing users to explore various kinds of adversarial examples as needed. We adapt and combine two existing boundary attack methods, DeepFool and the Brendel & Bethge attack, and propose a mask-constrained adversarial attack system, which generates minimal adversarial perturbations under pixel-level constraints, namely "mask-constraints". We study different ways of generating such mask-constraints considering the variance and importance of the input features, and show that our adversarial attack system offers users good flexibility to focus on sub-regions of inputs, explore imperceptible perturbations, and understand the vulnerability of pixels/regions to adversarial attacks. Extensive experiments and a user study demonstrate the effectiveness of our system.
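Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, a setting referred to as active domain adaptation. We start from the observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically show that a simple yet efficient energy-based sampling strategy selects more valuable target samples than existing approaches that require particular architectures or distance computations. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristics and instance uncertainty in every selection round. Meanwhile, by aligning the free energy of target data compactly around the source domain via a regularization term, the domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at https://github.com/bit-da/eada.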
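The energy-based selection idea in this abstract can be sketched under a common assumption from the energy-based model literature: the classifier logits are treated as negative energies, so the free energy of a sample is the negative log-sum-exp of its logits. The abstract does not give the exact scoring rule, and the instance-uncertainty term it mentions is omitted here. The names `free_energy` and `query_highest_energy` and the toy logits are hypothetical.

```python
import math


def free_energy(logits):
    """Free energy F(x) = -log(sum_k exp(logit_k)), computed stably.

    Assumption: logits act as negative energies, a common convention for
    reading a classifier as an energy-based model.
    """
    largest = max(logits)
    return -(largest + math.log(sum(math.exp(z - largest) for z in logits)))


def query_highest_energy(pool_logits, budget):
    """Return indices of the `budget` target samples with the highest free energy."""
    ranked = sorted(
        range(len(pool_logits)),
        key=lambda i: free_energy(pool_logits[i]),
        reverse=True,
    )
    return ranked[:budget]


# Toy target-domain logits for three samples and three classes.
pool_logits = [
    [8.0, 0.5, 0.1],  # confidently classified, low free energy
    [0.2, 0.1, 0.3],  # small logits everywhere, high free energy
    [3.0, 2.9, 0.0],  # in between
]
picked = query_highest_energy(pool_logits, budget=1)
```

Here the second sample (index 1) is selected, since its uniformly small logits give the highest free energy. Samples that look unlike the source data tend to score this way under the stated assumption.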
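Deep neural networks (DNNs) have recently achieved great success in many classification tasks. Unfortunately, they are vulnerable to adversarial attacks, which generate adversarial examples with small perturbations to fool DNN models, especially in model-sharing scenarios. Adversarial training has proven to be the most effective strategy; it injects adversarial examples into model training to improve the robustness of DNN models against adversarial attacks. However, adversarial training based on existing adversarial examples fails to generalize well to standard, unperturbed test data. To achieve a better trade-off between standard accuracy and adversarial robustness, we propose a novel adversarial training framework called Latent Boundary-guided Adversarial Training (LADDER), which adversarially trains DNN models on latent boundary-guided adversarial examples. In contrast to most existing methods, which generate adversarial examples in the input space, LADDER generates a myriad of high-quality adversarial examples by adding perturbations to latent features. The perturbations are made along the normal of the decision boundary constructed by an SVM with an attention mechanism. We analyze the merits of the generated boundary-guided adversarial examples from a boundary-field perspective and a visualization view. Extensive experiments and detailed analysis on MNIST, SVHN, CelebA, and CIFAR-10 validate the effectiveness of LADDER in achieving a better trade-off between standard accuracy and adversarial robustness compared with vanilla DNNs and competitive baselines.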
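To counter the threat of adversarial examples, adversarial training provides an attractive option for improving model robustness by training models on online-augmented adversarial examples. However, most existing adversarial training methods focus on improving robust accuracy by strengthening the adversarial examples but neglect the growing shift between natural data and adversarial examples, leading to a sharp drop in natural accuracy. To maintain the trade-off between natural and robust accuracy, we alleviate the shift from the perspective of feature adaptation and propose Feature Adaptive Adversarial Training (FAAT), which optimizes class-conditional feature adaptation across natural data and adversarial examples. Specifically, we propose to incorporate a class-conditional discriminator to encourage the features to become (1) class-discriminative and (2) invariant to the change of adversarial attacks. The novel FAAT framework enables the trade-off between natural and robust accuracy by generating features with similar distributions across natural and adversarial data, and achieves higher overall robustness by benefiting from the class-discriminative feature characteristics. Experiments on various datasets demonstrate that FAAT produces more discriminative features and performs favorably against state-of-the-art methods. Code is available at https://github.com/visionflow/faat.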
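Despite the success of deep learning on supervised point cloud semantic segmentation, obtaining large-scale point-by-point manual annotations remains a significant challenge. To reduce the huge annotation burden, we propose Region-based and Diversity-aware Active Learning (ReDAL), a general framework for many deep learning approaches that aims to automatically select informative and diverse sub-scene regions for label acquisition. Observing that only a small portion of annotated regions is sufficient for 3D scene understanding with deep learning, we use softmax entropy, color discontinuity, and structural complexity to measure the information of sub-scene regions. A diversity-aware selection algorithm is also developed to avoid redundant annotations resulting from selecting informative but similar regions in a querying batch. Extensive experiments show that our method outperforms previous active learning strategies, and we achieve 90% of the performance of fully supervised learning while requiring less than 15% and 5% of the annotations on the S3DIS and SemanticKITTI datasets, respectively. Our code is publicly available at https://github.com/tsunghan-wu/redal.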
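As one of the most effective defenses against adversarial attacks, adversarial training tends to learn an inclusive decision boundary to increase the robustness of deep learning models. However, due to the large and unnecessary increase in the margin along adversarial directions, adversarial training causes heavy cross-over between natural examples and adversarial examples, which is not conducive to balancing the trade-off between robustness and natural accuracy. In this paper, we propose a novel adversarial training scheme to achieve a better trade-off between robustness and natural accuracy. It aims to learn a moderately inclusive decision boundary, which means that the margins of natural examples under the decision boundary are moderate. We call this scheme Moderate-Margin Adversarial Training (MMAT); it generates finer-grained adversarial examples to mitigate the cross-over problem. We also take advantage of the logits of a well-trained teacher model to guide the learning of our model. Finally, MMAT achieves high natural accuracy and robustness under both black-box and white-box attacks. On SVHN, for example, state-of-the-art robustness and natural accuracy are achieved.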
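Self-training has greatly facilitated domain-adaptive semantic segmentation; it iteratively generates pseudo-labels on the target domain and retrains the network. However, since realistic segmentation datasets are highly imbalanced, target pseudo-labels are typically biased toward the majority classes and are inherently noisy, leading to an error-prone and suboptimal model. To address this issue, we propose a region-based active learning approach for semantic segmentation under domain shift, aiming to automatically query a small partition of image regions to be labeled while maximizing segmentation performance. Our algorithm, Active Learning via Region Impurity and Prediction Uncertainty (AL-RIPU), introduces a novel acquisition strategy that characterizes the spatial adjacency of image regions along with the prediction confidence. We show that the proposed region-based selection strategy makes more efficient use of a limited budget than image-based or point-based counterparts. Meanwhile, we enforce local prediction consistency between a pixel and its nearest neighbors on a source image. Further, we develop a negative learning loss to enhance discriminative representation learning on the target domain. Extensive experiments show that our method requires only very few annotations to almost reach supervised performance and substantially outperforms state-of-the-art methods.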
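An acquisition score that combines the spatial adjacency of a region with its prediction confidence might look like the sketch below. The abstract does not define the score, so the code makes three assumptions: region impurity is the entropy of the predicted-label histogram in a square neighbourhood, prediction uncertainty is the mean per-pixel softmax entropy in that neighbourhood, and the two are multiplied. The function `region_scores` and the toy probability map are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def region_scores(prob_map, radius=1):
    """Score each pixel by (region impurity) * (mean prediction entropy).

    prob_map: H x W grid where each cell is a list of class probabilities.
    radius: half-width of the square neighbourhood (an assumed region shape).
    """
    height, width = len(prob_map), len(prob_map[0])
    n_classes = len(prob_map[0][0])
    # Hard prediction and per-pixel entropy for every pixel.
    pred = [[max(range(n_classes), key=lambda k: prob_map[r][c][k])
             for c in range(width)] for r in range(height)]
    pixel_entropy = [[entropy(prob_map[r][c]) for c in range(width)]
                     for r in range(height)]
    scores = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            counts = [0] * n_classes
            entropy_sum, n = 0.0, 0
            for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                    counts[pred[rr][cc]] += 1
                    entropy_sum += pixel_entropy[rr][cc]
                    n += 1
            impurity = entropy([k / n for k in counts])
            scores[r][c] = impurity * (entropy_sum / n)
    return scores


# Toy 3 x 4 map: left half predicted as class 0, right half as class 1.
left, right = [0.9, 0.1], [0.1, 0.9]
prob_map = [[left, left, right, right] for _ in range(3)]
scores = region_scores(prob_map, radius=1)
```

In this toy map the score is zero in the outer columns, where every neighbour shares the same predicted class, and positive in the two columns next to the class boundary. Regions with mixed predictions are therefore the ones that would be queried first under these assumptions.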