Continual Learning (CL) focuses on developing algorithms that can adapt to new environments and learn new skills. This very challenging task has attracted great interest in recent years, and new solutions are emerging rapidly. In this paper, we propose an NVFNet-RDC approach for continual object detection. Our NVFNet-RDC consists of teacher-student networks and adopts replay and feature distillation strategies. As the first-place solution, we achieve 55.94% and 54.65% average mAP on Track 2 and Track 3 of the 3rd CLVision Challenge, respectively.
Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been held in recent years, as they are an excellent opportunity to stimulate research in promising directions. This paper summarizes the ideas, design choices, rules, and results of the challenge held at the 3rd Continual Learning in Computer Vision (CLVision) Workshop at CVPR 2022. The focus of this competition is the complex continual object detection task, which is still underexplored in literature compared to classification tasks. The challenge is based on the challenge version of the novel EgoObjects dataset, a large-scale egocentric object dataset explicitly designed to benchmark continual learning algorithms for egocentric category-/instance-level object understanding, which covers more than 1k unique main objects and 250+ categories in around 100k video frames.
For the SSLAD-Track 3B challenge on continual learning, we propose a method of COntinual Learning with Transformer (COLT). We find that transformers suffer from catastrophic forgetting compared with convolutional neural networks. The key principle of our method is to equip the transformer-based feature extractor with old-knowledge distillation and a head-expansion strategy to combat catastrophic forgetting. In this report, we first introduce the overall framework of continual learning for object detection. We then analyze the key elements of our solution for addressing catastrophic forgetting and their effects. Our method achieves 70.78 mAP on the SSLAD-Track 3B challenge test set.
In recent years, large-scale deep models have achieved great success, but their huge computational complexity and massive storage requirements make it a great challenge to deploy them on resource-constrained devices. As a model compression and acceleration method, knowledge distillation effectively improves the performance of small models by transferring dark knowledge from a teacher detector. However, most distillation-based detection methods mainly imitate features near the bounding boxes, which suffers from two limitations. First, they ignore beneficial features outside the bounding boxes. Second, they imitate some features that the teacher detector mistakenly regards as background. To address these issues, we propose a novel Feature-Richness Score (FRS) method to select the important features that improve generalized detectability during distillation. The proposed method effectively retrieves important features outside the bounding boxes and removes detrimental features within them. Extensive experiments show that our method achieves excellent performance on both anchor-based and anchor-free detectors. For example, RetinaNet with ResNet-50 achieves 39.7% mAP on the COCO2017 dataset, which even surpasses the 38.9% of the ResNet-101 based teacher detector by 0.8%.
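The selection idea can be illustrated with a small sketch. Below is a minimal, hypothetical PyTorch example that weights a feature-imitation loss by a per-location importance map derived from the teacher's classification scores; the exact way FRS aggregates scores may differ from this, and all names and shapes are illustrative.

```python
import torch

def feature_richness_mask(teacher_cls_logits):
    """Hypothetical richness map: max class probability over anchors/classes
    at each spatial location of the teacher's classification head."""
    probs = teacher_cls_logits.sigmoid()            # (N, A*C, H, W)
    mask, _ = probs.max(dim=1, keepdim=True)        # (N, 1, H, W)
    return mask

def frs_distillation_loss(student_feat, teacher_feat, teacher_cls_logits):
    """L2 feature imitation weighted by the richness mask, so locations the
    teacher finds informative (inside or outside GT boxes) dominate the loss."""
    mask = feature_richness_mask(teacher_cls_logits)
    diff = (student_feat - teacher_feat) ** 2       # (N, C, H, W)
    weighted = diff * mask                          # broadcast over channels
    return weighted.sum() / (mask.sum() * student_feat.size(1) + 1e-6)

# Toy usage with random tensors standing in for one FPN level.
s = torch.randn(2, 256, 32, 32, requires_grad=True)
t = torch.randn(2, 256, 32, 32)
cls = torch.randn(2, 9 * 80, 32, 32)                # 9 anchors, 80 classes
loss = frs_distillation_loss(s, t, cls)
loss.backward()
```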
Scaling object taxonomies is one of the important steps toward a robust real-world deployment of recognition systems. We have seen remarkable progress on images since the introduction of the LVIS benchmark. To continue this success in videos, a new video benchmark, TAO, was recently presented. Given the recent encouraging results from both detection and tracking communities, we are interested in marrying those two advances and building a strong large vocabulary video tracker. However, supervisions in LVIS and TAO are inherently sparse or even missing, posing two new challenges for training large vocabulary trackers. First, there are no tracking supervisions in LVIS, which leads to inconsistent learning of detection (with LVIS and TAO) and tracking (only with TAO). Second, the detection supervisions in TAO are partial, which results in catastrophic forgetting of absent LVIS categories during video fine-tuning. To resolve these challenges, we present a simple but effective learning framework that takes full advantage of all available training data to learn detection and tracking while not losing any LVIS categories to recognize. With this new learning scheme, we show that consistent improvements can be achieved for various large vocabulary trackers, setting strong baseline results on the challenging TAO benchmarks.
Surface defect detection is one of the most essential processes for industrial quality inspection. Deep-learning-based surface defect detection methods have shown great potential. However, well-performing models usually require a large amount of training data and can only detect defects that appeared in the training stage. When faced with few-shot data, defect detection models inevitably suffer from catastrophic forgetting and misclassification. To solve these problems, this paper proposes a new knowledge distillation network, called the Dual Knowledge Align Network (DKAN). The proposed DKAN method follows a pretraining-finetuning transfer learning paradigm, and a knowledge distillation framework is designed for fine-tuning. Specifically, an Incremental RCNN is proposed to achieve decoupled and stable feature representations of different categories. Under this framework, a Feature Knowledge Align (FKA) loss is designed between class-agnostic feature maps to address the catastrophic forgetting problem, while a Logit Knowledge Align (LKA) loss is deployed between logit distributions to tackle the misclassification problem. Experiments have been conducted on the incremental few-shot NEU-DET dataset, and the results show that DKAN outperforms other methods in various few-shot scenes, by up to 6.65% on the mean Average Precision metric, which proves the effectiveness of the method.
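A minimal sketch of the two alignment losses described above, assuming FKA is an L2 distance between class-agnostic feature maps and LKA a KL divergence between softened logit distributions; the actual DKAN formulation may normalize or weight these terms differently, and the tensors below are placeholders.

```python
import torch
import torch.nn.functional as F

def fka_loss(student_feat, teacher_feat):
    """Feature Knowledge Align: align class-agnostic feature maps (assumed L2)."""
    return F.mse_loss(student_feat, teacher_feat)

def lka_loss(student_logits, teacher_logits, tau=2.0):
    """Logit Knowledge Align: align softened logit distributions (assumed KL)."""
    p_t = F.softmax(teacher_logits / tau, dim=-1)
    log_p_s = F.log_softmax(student_logits / tau, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau * tau

# Toy usage: backbone features plus per-RoI classification logits.
s_feat = torch.randn(2, 256, 32, 32, requires_grad=True)
t_feat = torch.randn(2, 256, 32, 32)
s_logit = torch.randn(8, 6, requires_grad=True)
t_logit = torch.randn(8, 6)
total = fka_loss(s_feat, t_feat) + 0.5 * lka_loss(s_logit, t_logit)
total.backward()
```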
Continual deep learning is an emerging field in which a lot of progress has already been made. However, most approaches are only evaluated on image classification tasks, which is of little relevance in the field of intelligent vehicles. Only recently have approaches for class-incremental semantic segmentation been proposed, and all of them are based on some form of knowledge distillation. Replay-based approaches, which are commonly used for object recognition in a continual setting, have not been investigated so far. At the same time, although unsupervised domain adaptation for semantic segmentation has gained a lot of traction, domain-incremental learning in a continual setting remains understudied. The goal of our work is therefore to evaluate and adapt established solutions for continual object recognition to the semantic segmentation task, and to provide baseline methods and evaluation protocols for continual semantic segmentation. We first introduce evaluation protocols for class- and domain-incremental segmentation and analyze selected approaches. We show that the nature of the semantic segmentation task changes which methods are most effective in mitigating forgetting compared to image classification. In particular, knowledge distillation proves to be a vital tool in class-incremental learning, whereas replay methods are the most effective approach in domain-incremental learning.
Knowledge distillation has been successfully applied to image classification. However, object detection is much more sophisticated, and most knowledge distillation methods have failed on it. In this paper, we point out that in object detection, the features of the teacher and student vary greatly across different regions, especially between the foreground and background. If we distill them equally, the uneven differences between the feature maps will negatively affect the distillation. We therefore propose Focal and Global Distillation (FGD). Focal distillation separates the foreground and background, forcing the student to focus on the teacher's critical pixels and channels. Global distillation reconstructs the relations between different pixels and transfers them from the teacher to the student, compensating for the global information missing in focal distillation. Since our method only needs to compute the loss on feature maps, FGD can be applied to various detectors. We experiment with various detectors on different backbones, and the results show that the student detectors achieve excellent mAP improvements. For example, ResNet-50 based RetinaNet, Faster RCNN, RepPoints and Mask RCNN reach 40.7%, 42.0%, 42.0% and 42.1% mAP on COCO2017, which are 3.3, 3.6, 3.4 and 2.9 points higher than their baselines, respectively. Our code is available at https://github.com/yzd-v/fgd.
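FGD additionally uses size re-scaling, attention masks and a global relation term; the sketch below, a minimal assumption-laden illustration in PyTorch, only shows the core foreground/background separation idea with hypothetical weights and names.

```python
import torch

def box_mask(h, w, boxes):
    """Binary foreground mask from ground-truth boxes given in feature-map coordinates."""
    mask = torch.zeros(h, w)
    for x1, y1, x2, y2 in boxes:
        mask[int(y1):int(y2) + 1, int(x1):int(x2) + 1] = 1.0
    return mask

def focal_distill_loss(student_feat, teacher_feat, boxes, alpha=1.0, beta=0.5):
    """Distill foreground and background regions with separate weights, so the
    imbalance between the two does not wash out the foreground signal."""
    n, c, h, w = student_feat.shape
    fg = box_mask(h, w, boxes).to(student_feat.device)          # (H, W)
    bg = 1.0 - fg
    diff = (student_feat - teacher_feat) ** 2                   # (N, C, H, W)
    fg_loss = (diff * fg).sum() / (fg.sum() * n * c + 1e-6)
    bg_loss = (diff * bg).sum() / (bg.sum() * n * c + 1e-6)
    return alpha * fg_loss + beta * bg_loss

# Toy usage on a downsampled feature map with one ground-truth box.
s = torch.randn(1, 256, 64, 64, requires_grad=True)
t = torch.randn(1, 256, 64, 64)
loss = focal_distill_loss(s, t, boxes=[(10, 12, 30, 40)])
loss.backward()
```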
Designing an automatic checkout system for retail stores with human-level accuracy is challenging due to similar-looking products and their varied poses. This paper addresses the problem with a two-stage pipeline. The first stage detects class-agnostic items, and the second stage is dedicated to classifying product categories. We also track objects across video frames to avoid duplicate counting. A major challenge is the domain gap, because the models are trained on synthetic data but tested on real images. To reduce the error gap, we adopt domain generalization methods for the first-stage detector. In addition, model ensembling is used to enhance the robustness of the second-stage classifier. The method is evaluated on AI City Challenge 2022 - Track 4 and achieves an F1 score of 40% on the test A set. The code is released at https://github.com/cybercore-co-ltd/aicity22-track4.
Open world object detection aims at detecting objects that are absent in the object classes of the training data as unknown objects without explicit supervision. Furthermore, the exact classes of the unknown objects must be identified without catastrophic forgetting of the previous known classes when the corresponding annotations of unknown objects are given incrementally. In this paper, we propose a two-stage training approach named Open World DETR for open world object detection based on Deformable DETR. In the first stage, we pre-train a model on the current annotated data to detect objects from the current known classes, and concurrently train an additional binary classifier to classify predictions into foreground or background classes. This helps the model to build unbiased feature representations that can facilitate the detection of unknown classes in the subsequent process. In the second stage, we fine-tune the class-specific components of the model with a multi-view self-labeling strategy and a consistency constraint. Furthermore, we alleviate catastrophic forgetting when the annotations of the unknown classes become available incrementally by using knowledge distillation and exemplar replay. Experimental results on PASCAL VOC and MS-COCO show that our proposed method outperforms other state-of-the-art open world object detection methods by a large margin.
Continual learning for Semantic Segmentation (CSS) is a rapidly emerging field, in which the capabilities of a segmentation model are incrementally improved by learning new classes or new domains. A central challenge in continual learning is overcoming the effects of catastrophic forgetting, the abrupt drop in accuracy on previously learned tasks after the model is trained on new classes or domains. In continual classification this challenge is often overcome by replaying a small selection of samples from previous tasks, but replay is rarely considered in CSS. We therefore investigate the influence of various replay strategies for semantic segmentation and evaluate them in class- and domain-incremental settings. Our findings suggest that, in the class-incremental setting, it is critical to maintain a uniform distribution of the different classes in the buffer to avoid a bias towards newly learned classes. In the domain-incremental setting, it is most effective to select buffer samples by uniformly sampling from the distribution of learned feature representations or by choosing samples with median entropy. Finally, we observe that effective sampling methods help to decrease the representation shift in early layers, which is a major cause of forgetting in domain-incremental learning.
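One simple way to approximate the uniform-class-distribution finding is a class-balanced buffer. The sketch below is illustrative only: the study above evaluates several strategies (including feature-distribution and median-entropy sampling), and assigning each image to a single class is a simplification for segmentation data.

```python
import random
from collections import defaultdict

class ClassBalancedBuffer:
    """Replay buffer that keeps roughly the same number of samples per class."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.per_class = defaultdict(list)

    def add(self, sample, label):
        self.per_class[label].append(sample)
        self._trim()

    def _trim(self):
        # Shrink the largest buckets first until the total size fits the capacity,
        # driving the buffer towards a uniform class distribution.
        while sum(len(b) for b in self.per_class.values()) > self.capacity:
            largest = max(self.per_class, key=lambda c: len(self.per_class[c]))
            bucket = self.per_class[largest]
            bucket.pop(random.randrange(len(bucket)))

    def sample(self, k):
        pool = [s for bucket in self.per_class.values() for s in bucket]
        return random.sample(pool, min(k, len(pool)))

# Toy usage: 100 images over 7 (dominant) classes, buffer of size 10.
buf = ClassBalancedBuffer(capacity=10)
for i in range(100):
    buf.add(f"img_{i}", label=i % 7)
print({c: len(b) for c, b in buf.per_class.items()})
```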
In real-world settings, object instances from new classes can be continuously encountered by object detectors. When existing object detectors are applied to such scenarios, their performance on old classes deteriorates significantly. A few efforts have been reported to address this limitation, all of which apply variants of knowledge distillation to avoid catastrophic forgetting. We note that although distillation helps retain previous learning, it obstructs fast adaptation to new tasks, which is a critical requirement for incremental learning. In this pursuit, we propose a meta-learning approach that learns to reshape model gradients such that information across incremental tasks is optimally shared. This ensures seamless information transfer via meta-learned gradient preconditioning that minimizes forgetting and maximizes knowledge transfer. In comparison to existing meta-learning methods, our approach is task-agnostic and allows incremental addition of new classes to high-capacity models for object detection. We evaluate our approach on a variety of incremental learning settings defined on the PASCAL-VOC and MS COCO datasets, where it performs favourably against state-of-the-art methods.
General Continual Learning (GCL) aims at learning from non-independent and identically distributed stream data without catastrophic forgetting of old tasks, while relying on no task boundaries during either the training or testing stage. We reveal that relation and feature deviations are crucial problems for catastrophic forgetting, in which relation deviation refers to the deficiency of the relationship among all classes in knowledge distillation, and feature deviation refers to indiscriminative feature representations. To this end, we propose a Complementary Calibration (CoCa) framework by mining the complementary model's outputs and features to alleviate the two deviations in the process of GCL. Specifically, we propose a new collaborative distillation approach for addressing the relation deviation. It distills the model's outputs by utilizing ensemble dark knowledge of the new model's outputs and reserved outputs, which maintains the performance of old tasks as well as balances the relationship among all classes. Furthermore, we explore a collaborative self-supervision idea to leverage pretext tasks and supervised contrastive learning for addressing the feature deviation problem by learning complete and discriminative features for all classes. Extensive experiments on four popular datasets show that our CoCa framework achieves superior performance against state-of-the-art methods. Code is available at https://github.com/lijincm/CoCa.
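A heavily simplified sketch of the "ensemble dark knowledge" idea described above: build a soft target by averaging the old model's outputs with reserved outputs for the same samples, then pull the new model towards it with a KL term. This is an assumed form, not CoCa's exact loss, and all tensors and weights are placeholders.

```python
import torch
import torch.nn.functional as F

def collaborative_distill_loss(new_logits, old_logits, reserved_logits, tau=2.0):
    """Distill the new model towards an ensemble of old and reserved outputs."""
    with torch.no_grad():
        ensemble = 0.5 * (F.softmax(old_logits / tau, dim=-1)
                          + F.softmax(reserved_logits / tau, dim=-1))
    log_p_new = F.log_softmax(new_logits / tau, dim=-1)
    return F.kl_div(log_p_new, ensemble, reduction="batchmean") * tau * tau

# Toy usage on a batch of 16 samples and 10 classes.
new = torch.randn(16, 10, requires_grad=True)
old = torch.randn(16, 10)
reserved = torch.randn(16, 10)
loss = collaborative_distill_loss(new, old, reserved)
loss.backward()
```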
Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning -- a setting where not all the data samples are labeled. An underlying issue in this scenario is the model forgetting representations of unlabeled data and overfitting the labeled ones. We leverage the power of nearest-neighbor classifiers to non-linearly partition the feature space and learn a strong representation for the current task, as well as distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a strong state of the art on the continual semi-supervised learning paradigm. For example, on CIFAR100 we surpass several others even when using at least 30 times less supervision (0.8% vs. 25% of annotations).
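The nearest-neighbour component lends itself to a short sketch: classify queries by majority vote over their nearest neighbours in a feature bank built from labeled samples of current and past tasks. The paper's classifier and representation learning are more involved; the code below only illustrates the basic idea with hypothetical tensors.

```python
import torch
import torch.nn.functional as F

def knn_predict(query_feats, bank_feats, bank_labels, k=5):
    """Majority vote over the k nearest neighbours (cosine similarity) in the bank."""
    q = F.normalize(query_feats, dim=1)
    b = F.normalize(bank_feats, dim=1)
    sims = q @ b.t()                       # (Q, B) cosine similarities
    _, idx = sims.topk(k, dim=1)           # (Q, k) neighbour indices
    neigh = bank_labels[idx]               # (Q, k) neighbour labels
    preds, _ = torch.mode(neigh, dim=1)    # per-query majority vote
    return preds

# Toy usage: 100 banked embeddings over 4 classes, 8 query embeddings.
bank = torch.randn(100, 128)
labels = torch.randint(0, 4, (100,))
queries = torch.randn(8, 128)
print(knn_predict(queries, bank, labels))
```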
Recently, large-scale pre-trained models have shown their advantages in many tasks. However, due to the huge computational complexity and storage requirements, it is challenging to apply the large-scale model to real scenes. A common solution is knowledge distillation which regards the large-scale model as a teacher model and helps to train a small student model to obtain a competitive performance. Cross-task Knowledge distillation expands the application scenarios of the large-scale pre-trained model. Existing knowledge distillation works focus on directly mimicking the final prediction or the intermediate layers of the teacher model, which represent the global-level characteristics and are task-specific. To alleviate the constraint of different label spaces, capturing invariant intrinsic local object characteristics (such as the shape characteristics of the leg and tail of the cattle and horse) plays a key role. Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios. First, to better transfer the generalized knowledge in the teacher model in cross-task scenarios, we propose a prototype learning module to learn from the essential feature representation of objects in the teacher model. Secondly, for diverse downstream tasks, we propose a task-adaptive feature augmentation module to enhance the features of the student model with the learned generalization prototype features and guide the training of the student model to improve its generalization ability. The experimental results on various visual tasks demonstrate the effectiveness of our approach for large-scale model cross-task knowledge distillation scenes.
Continual learning (CL) aims to develop techniques by which a single model adapts to an increasing number of tasks, potentially leveraging learning across tasks in a resource-efficient manner. A major challenge for CL systems is catastrophic forgetting, where earlier tasks are forgotten while learning a new task. To address this, replay-based CL approaches maintain and repeatedly retrain on a small buffer of data selected from the tasks encountered so far. We propose Gradient Coreset Replay (GCR), a novel strategy for replay buffer selection and update using a carefully designed optimization criterion. Specifically, we select and maintain a "coreset" that closely approximates the gradient of all the data seen so far with respect to the current model parameters, and discuss key strategies needed for its effective application to the continual learning setting. In the offline continual learning setting, we show significant gains (2%-4%) over the state of the art. Our findings also transfer effectively to online/streaming CL settings, showing gains of up to 5% over existing approaches. Finally, we demonstrate the value of a supervised contrastive loss for continual learning, which yields a cumulative gain of up to 5% when combined with our subset-selection strategy.
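As a rough illustration of the coreset idea, the sketch below greedily picks samples whose average gradient best matches the average gradient of the whole pool. GCR's actual criterion is a weighted optimization over the full selection; this greedy, unweighted variant and its per-sample gradient "signatures" are simplifying assumptions.

```python
import torch

def greedy_gradient_coreset(per_sample_grads, k):
    """Greedy gradient matching: pick k samples whose mean gradient best
    approximates the mean gradient of all samples.
    per_sample_grads: (N, D) flattened per-sample gradients (e.g. last layer only)."""
    target = per_sample_grads.mean(dim=0)
    selected, current_sum = [], torch.zeros_like(target)
    for step in range(k):
        best_i, best_err = None, float("inf")
        for i in range(per_sample_grads.size(0)):
            if i in selected:
                continue
            cand = (current_sum + per_sample_grads[i]) / (step + 1)
            err = torch.norm(cand - target).item()
            if err < best_err:
                best_i, best_err = i, err
        selected.append(best_i)
        current_sum = current_sum + per_sample_grads[best_i]
    return selected

# Toy usage: 64 samples with 10-dimensional gradient signatures, keep 8.
grads = torch.randn(64, 10)
print(greedy_gradient_coreset(grads, k=8))
```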
Convolutional neural networks show remarkable results in classification but struggle to learn new things on the fly. We present a novel rehearsal method in which a deep neural network continually learns new, unseen object categories without saving any data from previous sequences. Our approach is called RECALL, as the network recalls categories by computing logits for the old categories before training on new ones. These are then used during training to avoid changing the old categories. For each new sequence, a new head is added to accommodate the new categories. To mitigate forgetting, we propose a regularization strategy in which we replace the classification with a regression. Moreover, for the known categories, we propose a Mahalanobis loss that includes the variances to account for the density changes between known and unknown categories. Finally, we provide a novel dataset for continual learning, especially suited to object recognition on a mobile robot (HOWS-CL-25), comprising 150,795 synthetic images of 25 household object categories. Our approach RECALL outperforms the state of the art on CORe50 and iCIFAR-100 and achieves the best performance on HOWS-CL-25.
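A minimal sketch of the logit-regression idea under stated assumptions: record the old heads' logits (and a variance estimate) before training a new sequence, then penalize deviations from them with a variance-scaled, Mahalanobis-style distance. The exact loss and how the variances are estimated in RECALL may differ; names and shapes here are illustrative.

```python
import torch

def mahalanobis_logit_loss(pred_old_logits, stored_logits, stored_var, eps=1e-6):
    """Variance-aware regression towards the recorded old-category logits:
    dimensions with naturally high variance are penalized less."""
    return (((pred_old_logits - stored_logits) ** 2) / (stored_var + eps)).mean()

# Before training a new sequence: record old-head logits for the new images
# with the frozen current model, plus a per-logit variance estimate.
stored = torch.randn(32, 20)             # logits of 20 old categories
var = stored.var(dim=0, keepdim=True)    # per-logit variance
# During training: the updated network's old-head outputs are pulled back.
pred = torch.randn(32, 20, requires_grad=True)
loss = mahalanobis_logit_loss(pred, stored, var)
loss.backward()
```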
Deep learning has led to recent advances in object detection and instance segmentation, among other computer vision tasks. These advances have led to the wide application of deep-learning-based methods and related methodologies to object detection tasks on satellite imagery. In this paper, we introduce MIS Check-Dam, a new dataset of check-dams from satellite imagery for building automated systems for check-dam inspection and mapping, motivated by the importance of these irrigation structures for agriculture. We review some of the most recent object detection and instance segmentation methods and assess their performance on our new dataset. We evaluate several single-stage, two-stage and attention-based methods under various network configurations and backbone architectures. The dataset and pre-trained models are available at https://www.cse.iitb.ac.in.in/gramdridisti/.
Conventional detection networks usually require abundant labeled training samples, whereas humans can learn new concepts incrementally from just a few examples. This paper focuses on the more challenging but realistic class-incremental few-shot object detection problem (iFSD). It aims to incrementally transfer the model to novel objects from only a few annotated samples without catastrophically forgetting previously learned ones. To tackle this problem, we propose a new method that achieves less forgetting, requires fewer training resources, and has stronger transfer capability. Specifically, we first present a transfer strategy to reduce unnecessary weight adaptation and improve the transfer capability for iFSD. On this basis, we integrate the knowledge distillation technique using a less resource-consuming approach to alleviate forgetting, and propose a novel clustering-based exemplar selection process to preserve more discriminative features learned previously. As a generic and effective method, it can largely improve iFSD performance on various benchmarks.
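A generic sketch of clustering-based exemplar selection, assuming the simple recipe of clustering old-class feature embeddings and keeping the sample closest to each cluster centre so the retained exemplars cover distinct modes; the paper's actual selection procedure may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_exemplars(features, n_exemplars):
    """Cluster old-class feature embeddings and keep the sample nearest
    to each cluster centre."""
    km = KMeans(n_clusters=n_exemplars, n_init=10, random_state=0).fit(features)
    exemplar_ids = []
    for c in range(n_exemplars):
        dists = np.linalg.norm(features - km.cluster_centers_[c], axis=1)
        exemplar_ids.append(int(dists.argmin()))
    return exemplar_ids

# Toy usage: 200 pooled RoI features of dimension 64, keep 10 exemplars.
feats = np.random.randn(200, 64)
print(select_exemplars(feats, n_exemplars=10))
```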
Knowledge distillation (KD) has shown its effectiveness for object detection, where it trains a compact object detector under the supervision of both AI knowledge (the teacher detector) and human knowledge (human experts). However, existing studies treat AI knowledge and human knowledge uniformly and adopt a uniform data augmentation strategy during learning, which leads to biased learning of multi-scale objects and insufficient learning from the teacher detector, resulting in unsatisfactory distillation performance. To address these issues, we propose sample-specific data augmentation and adversarial feature augmentation. First, to mitigate the impact caused by multi-scale objects, we propose adaptive data augmentation based on observations from a Fourier perspective. Second, we propose a feature augmentation method based on adversarial examples to better mimic AI knowledge and compensate for the insufficient information from the teacher detector. Furthermore, our proposed method is unified and can easily be extended to other KD methods. Extensive experiments demonstrate the effectiveness of our framework, which improves the performance of state-of-the-art methods on both one-stage and two-stage detectors, bringing gains of up to 0.5 mAP.
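To make the adversarial-augmentation idea concrete, here is a one-step (FGSM-like) sketch that perturbs the input so the student's features drift furthest from the teacher's, then distills on that harder input. The paper's adversarial feature augmentation may operate differently (for example, on features rather than inputs); all model and parameter names below are hypothetical stand-ins.

```python
import torch
import torch.nn.functional as F

def adversarial_feature_augment(student, teacher, images, eps=2.0 / 255):
    """One-step adversarial perturbation maximizing the student-teacher feature gap."""
    images = images.clone().detach().requires_grad_(True)
    with torch.no_grad():
        t_feat = teacher(images)                    # fixed teacher target
    gap = F.mse_loss(student(images), t_feat)
    gap.backward()                                  # gradient w.r.t. the input
    adv = (images + eps * images.grad.sign()).clamp(0, 1).detach()
    return adv

# Toy usage with small stand-in "backbones".
student = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1))
teacher = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1))
x = torch.rand(2, 3, 64, 64)
x_adv = adversarial_feature_augment(student, teacher, x)
distill_loss = F.mse_loss(student(x_adv), teacher(x_adv))
```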