Translated by Google Translate
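The human capacity for continual learning (CL) is closely related to the stability-plasticity dilemma, which describes how humans achieve an ongoing ability to learn while preserving learned information. The concept of CL has been present in artificial intelligence (AI) since its inception. This paper presents a comprehensive review of CL. Unlike previous reviews, which focus mainly on the catastrophic forgetting phenomenon in CL, this paper surveys CL from a macroscopic perspective based on the stability-versus-plasticity mechanism. Analogous to their biological counterparts, "smart" AI agents should i) remember previously learned information (information retrospection); ii) continually infer new information (information prospection); and iii) transfer useful information (information transfer), in order to achieve high-level CL. Following this taxonomy, we then introduce evaluation metrics, algorithms, applications, and some open issues. Our main contributions concern i) re-examining CL from the level of artificial general intelligence; ii) providing a detailed and extensive overview of CL topics; and iii) presenting some novel ideas on the potential development of CL.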
Artificial neural networks excel at solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.
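Online continual learning, especially when task identities and task boundaries are unavailable, is a challenging continual learning setting. One representative family of methods for online continual learning is replay-based: a replay buffer, called memory, is maintained to keep a small portion of past samples in order to overcome catastrophic forgetting. When tackling online continual learning, most existing replay-based methods focus on single-label problems, in which each sample in the data stream has only one label. However, multi-label problems, in which each sample may have more than one label, can also arise in the online continual learning setting. In the online setting with multi-label samples, the class distribution in the data stream is typically highly imbalanced, and it is challenging to control the class distribution in memory. Yet the class distribution in memory is critical for replay-based methods to perform well, especially when the class distribution in the data stream is highly imbalanced. In this paper, we propose a simple but effective method for multi-label online continual learning, called optimizing class distribution in memory (OCDM). OCDM formulates the memory update mechanism as an optimization problem and updates the memory by solving this problem. Experiments on two widely used multi-label datasets show that OCDM can control the class distribution in memory well and can outperform other state-of-the-art methods.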
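To make the replay-buffer idea above concrete, here is a minimal sketch of a fixed-size memory updated by reservoir sampling, with a helper that reports the class distribution currently held in memory. This is a generic baseline chosen for illustration, not the OCDM update (which the abstract describes as solving an optimization problem); the class `ReplayMemory`, its methods, and the toy stream are all hypothetical.

```python
import random
from collections import Counter


class ReplayMemory:
    """Fixed-size replay buffer updated by reservoir sampling (generic baseline, not OCDM)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []   # list of (x, labels) pairs kept for replay
        self.seen = 0       # number of stream samples offered so far
        self.rng = random.Random(seed)

    def update(self, x, labels):
        """Offer one stream sample; `labels` is a set because samples may be multi-label."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append((x, frozenset(labels)))
        else:
            # Each of the `seen` samples ends up in memory with equal probability.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = (x, frozenset(labels))

    def class_counts(self):
        """Number of memory samples carrying each label."""
        counts = Counter()
        for _, labels in self.samples:
            counts.update(labels)
        return counts


# Toy imbalanced multi-label stream: every sample is "common", few are also "rare".
memory = ReplayMemory(capacity=50)
stream_rng = random.Random(1)
for i in range(1000):
    labels = {"common"}
    if stream_rng.random() < 0.05:
        labels.add("rare")
    memory.update(i, labels)
```

Because reservoir sampling mirrors the stream's distribution, a rare label tends to stay rare in memory, which is the kind of imbalance a method that explicitly controls the class distribution in memory aims to correct.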
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperforms baseline methods by a large margin on several benchmarks. Moreover, our proposed contrastive divergence-based training objective can be combined with other continual learning methods, resulting in substantial boosts in their performance. We further show that EBMs are adaptable to a more general continual learning setting where the data distribution changes without the notion of explicitly delineated tasks. These observations point towards EBMs as a useful building block for future continual learning methods.
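Malicious software (malware) classification poses a unique challenge for continual learning (CL) regimes, owing to the volume of new samples received every day and the evolution of malware to exploit new vulnerabilities. On a typical day, antivirus vendors receive hundreds of thousands of unique pieces of software, both malicious and benign, and over the lifetime of a malware classifier more than a billion samples can easily accumulate. Given the scale of the problem, sequential training using continual learning techniques could provide substantial benefits in reducing training and storage overhead. To date, however, there has been no exploration of CL applied to malware classification tasks. In this paper, we study 11 CL techniques applied to three malware tasks covering common incremental learning scenarios, including task, class, and domain incremental learning (IL). Specifically, using two realistic, large-scale malware datasets, we evaluate the performance of the CL methods on binary malware classification (Domain-IL) and multi-class malware family classification (Task-IL and Class-IL) tasks. To our surprise, continual learning methods significantly underperformed naive joint replay of the training data in nearly all settings, in some cases reducing accuracy by more than 70 percentage points. A simple approach of selectively replaying 20% of the stored data achieves better performance, with 50% of the training time, compared to joint replay. Finally, we discuss potential reasons for the unexpectedly poor performance of the CL techniques, in the hope that this spurs further research into techniques that are more effective in the malware classification domain.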
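Data-free knowledge distillation (DFKD) aims to train a lightweight student network from a teacher network without training data. Existing approaches mainly follow the paradigm of generating informative samples and progressively updating the student model by targeting data priors, boundary samples, or memory samples. However, previous DFKD methods struggle to adjust the generation strategy dynamically at different training stages, which in turn makes it difficult to achieve efficient and stable training. In this paper, we explore how to teach the student from a curriculum learning (CL) perspective and propose a new approach, "CuDFKD", i.e., "Data-Free Knowledge Distillation with Curriculum". It learns gradually from easy samples to difficult samples, which is similar to the way humans learn. In addition, we provide a theoretical analysis based on the majorization-minimization (MM) algorithm and explain the convergence of CuDFKD. Experiments on benchmark datasets show that, with a simple curriculum design strategy, CuDFKD achieves the best performance against state-of-the-art DFKD methods across different benchmarks, such as 95.28% top-1 accuracy for the ResNet18 model on CIFAR10, which is better than training from scratch with data. Training is fast, reaching a top accuracy of 90% within 30 epochs, and the variance during training is stable. The applicability of CuDFKD is also analyzed and discussed in this paper.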
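The easy-to-difficult progression mentioned above can be illustrated with a generic curriculum pacing schedule. The sketch below is not the CuDFKD procedure (which generates samples rather than selecting from a fixed set); it assumes each sample already has a scalar difficulty score, and the function name `curriculum_subset`, the linear pacing rule, and the `start_fraction` parameter are hypothetical choices.

```python
def curriculum_subset(difficulties, epoch, total_epochs, start_fraction=0.2):
    """Return indices of the samples available at `epoch`, easiest first.

    The available fraction grows linearly from `start_fraction` at epoch 0
    to 1.0 at the final epoch (an assumed pacing rule).
    """
    if total_epochs < 1:
        raise ValueError("total_epochs must be at least 1")
    # Sort sample indices from easiest (lowest score) to hardest.
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    count = max(1, round(fraction * len(order)))
    return order[:count]


difficulties = [0.9, 0.1, 0.5, 0.3, 0.7]
first = curriculum_subset(difficulties, epoch=0, total_epochs=5)
last = curriculum_subset(difficulties, epoch=4, total_epochs=5)
```

With these toy scores, only the easiest sample is available at the first epoch, and all five samples, ordered from easiest to hardest, are available at the last.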
Developing autonomous vehicles (AVs) helps improve the road safety and traffic efficiency of intelligent transportation systems (ITS). Accurately predicting the trajectories of traffic participants is essential to the decision-making and motion planning of AVs in interactive scenarios. Recently, learning-based trajectory predictors have shown state-of-the-art performance in highway or urban areas. However, most existing learning-based models trained with fixed datasets may perform poorly in continuously changing scenarios. Specifically, they may not perform well in learned scenarios after learning the new one. This phenomenon is called "catastrophic forgetting". Few studies investigate trajectory predictions in continuous scenarios, where catastrophic forgetting may happen. To handle this problem, first, a novel continual learning (CL) approach for vehicle trajectory prediction is proposed in this paper. Then, inspired by brain science, a dynamic memory mechanism is developed by utilizing the measurement of traffic divergence between scenarios, which balances the performance and training efficiency of the proposed CL approach. Finally, datasets collected from different locations are used to design continual training and testing methods in experiments. Experimental results show that the proposed approach achieves consistently high prediction accuracy in continuous scenarios without re-training, which mitigates catastrophic forgetting compared to non-CL approaches. The implementation of the proposed approach is publicly available at https://github.com/BIT-Jack/D-GSM
Many real-world learning scenarios face the challenge of slow concept drift, where data distributions change gradually over time. In this setting, we pose the problem of learning temporally sensitive importance weights for training data, in order to optimize predictive accuracy. We propose a class of temporal reweighting functions that can capture multiple timescales of change in the data, as well as instance-specific characteristics. We formulate a bi-level optimization criterion, and an associated meta-learning algorithm, by which these weights can be learned. In particular, our formulation trains an auxiliary network to output weights as a function of training instances, thereby compactly representing the instance weights. We validate our temporal reweighting scheme on a large real-world dataset of 39M images spread over a 9 year period. Our extensive experiments demonstrate the necessity of instance-based temporal reweighting in the dataset, and achieve significant improvements to classical batch-learning approaches. Further, our proposal easily generalizes to a streaming setting and shows significant gains compared to recent continual learning methods.
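As a rough illustration of temporal reweighting over multiple timescales, the sketch below assigns each training instance a weight that is a mixture of exponential decays in the instance's age. It is an assumption-laden simplification: the abstract's weights are learned by a meta-learning algorithm and depend on instance-specific characteristics, whereas here the function `temporal_weights`, the timescales, and the mixture coefficients are hypothetical and fixed by hand.

```python
import math


def temporal_weights(ages, timescales, mixture):
    """Normalized importance weights from a mixture of exponential decays.

    The raw weight of an instance with age a is sum_k mixture[k] * exp(-a / timescales[k]);
    weights are then normalized to sum to 1.
    """
    if len(timescales) != len(mixture):
        raise ValueError("timescales and mixture must have the same length")
    raw = [
        sum(m * math.exp(-age / tau) for m, tau in zip(mixture, timescales))
        for age in ages
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Ages in years; one fast and one slow timescale, equally mixed (assumed values).
ages = [0.0, 1.0, 5.0, 9.0]
weights = temporal_weights(ages, timescales=[1.0, 10.0], mixture=[0.5, 0.5])
```

Older instances receive smaller weights, and the slow timescale keeps them from vanishing entirely, which is one way to capture both recent and long-term structure under gradual drift.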
Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning -- a setting where not all the data samples are labeled. An underlying issue in this scenario is the model forgetting representations of unlabeled data and overfitting the labeled ones. We leverage the power of nearest-neighbor classifiers to non-linearly partition the feature space and learn a strong representation for the current task, as well as distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a strong state of the art on the continual semi-supervised learning paradigm. For example, on CIFAR100 we surpass several others even when using at least 30 times less supervision (0.8% vs. 25% of annotations).
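For decades, computer systems have held large amounts of personal data. On the one hand, this abundance of data enables breakthroughs in artificial intelligence (AI), and especially in machine learning (ML) models. On the other hand, it can threaten users' privacy and weaken the trust between humans and AI. Recent regulations require that private information about a user can be removed from computer systems in general, and from ML models in particular, upon request (e.g., "the right to be forgotten"). While removing data from back-end databases should be straightforward, this is not sufficient in the AI context, because ML models often "remember" the old data. Existing adversarial attacks have shown that private membership or attributes of the training data can be learned from trained models. This phenomenon calls for a new paradigm, namely machine unlearning, to make ML models forget particular data. It turns out that recent work on machine unlearning has not been able to solve the problem completely, owing to the lack of common frameworks and resources. In this survey paper, we seek to provide a thorough investigation of machine unlearning in its definitions, scenarios, mechanisms, and applications. Specifically, as a categorized collection of state-of-the-art research, we hope to provide a broad reference for those seeking a primer on machine unlearning and its various formulations, design requirements, removal requests, algorithms, and uses in ML applications. Furthermore, we hope to outline key findings and trends in the paradigm, and to highlight new areas of research that have not yet seen the application of machine unlearning but could nonetheless benefit greatly from it. We hope this survey provides a valuable reference for ML researchers as well as those seeking to innovate privacy technologies. Our resources are available at https://github.com/tamlhp/awesome-machine-unlearning.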
Graph learning is a popular approach for performing machine learning on graph-structured data. It has revolutionized the machine learning ability to model graph data to address downstream tasks. Its application is wide due to the availability of graph data ranging from all types of networks to information systems. Most graph learning methods assume that the graph is static and its complete structure is known during training. This limits their applicability since they cannot be applied to problems where the underlying graph grows over time and/or new tasks emerge incrementally. Such applications require a lifelong learning approach that can learn the graph continuously and accommodate new information whilst retaining previously learned knowledge. Lifelong learning methods that enable continuous learning in regular domains like images and text cannot be directly applied to continuously evolving graph data, due to its irregular structure. As a result, graph lifelong learning is gaining attention from the research community. This survey paper provides a comprehensive overview of recent advancements in graph lifelong learning, including the categorization of existing methods, and the discussions of potential applications and open research problems.
Anomaly Detection is a relevant problem that arises in numerous real-world applications, especially when dealing with images. However, there has been little research for this task in the Continual Learning setting. In this work, we introduce a novel approach called SCALE (SCALing is Enough) to perform Compressed Replay in a framework for Anomaly Detection in the Continual Learning setting. The proposed technique scales and compresses the original images using a Super Resolution model which, to the best of our knowledge, is studied for the first time in the Continual Learning setting. SCALE can achieve a high level of compression while maintaining a high level of image reconstruction quality. In conjunction with other Anomaly Detection approaches, it can achieve optimal results. To validate the proposed approach, we use a real-world dataset of images with pixel-based anomalies, with the aim of providing a reliable benchmark for Anomaly Detection in the context of Continual Learning, serving as a foundation for further advancements in the field.
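We propose a novel framework for long-term classification of large time series data. Long time series classification (L-TSC) is a challenging problem because the data often contains a large amount of information that is irrelevant to the classification target. Irrelevant periods degrade classification performance, while their relevance is unknown to the system. This paper proposes an uncertainty-aware multiple instance learning (MIL) framework to identify the most relevant periods. Predictive uncertainty enables the design of an attention mechanism that forces the MIL model to learn from the discriminative periods. Moreover, the predicted uncertainty yields a principled estimator for identifying whether a prediction is trustworthy. We also accommodate unreliable predictions by training a separate model on another modality, subject to its availability, and performing uncertainty-aware fusion to produce the final prediction. Systematic evaluation is conducted on Automatic Identification System (AIS) data, which is collected to identify and track real-world vessels. Empirical results demonstrate that the proposed method can effectively detect vessel types based on trajectories, and that uncertainty-aware fusion with another available data modality (Synthetic-Aperture Radar, or SAR, imagery in our experiments) can further improve detection accuracy.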