Current deep learning classifiers carry out supervised learning and store class-discriminatory information in a set of shared network weights. These weights cannot be easily altered to incrementally learn additional classes: all of the classification weights require retraining to prevent old class information from being lost, and the previous training data must be present. We present a novel two-stage architecture which couples visual feature learning with probabilistic models, representing each class in the form of a Gaussian Mixture Model. By using these independent class representations within our classifier, we outperform a benchmark of an equivalent network with a Softmax head, obtaining increased accuracy for sample sizes smaller than 12 and increased weighted F1 score for 3 imbalanced class profiles in that sample range. When learning new classes our classifier exhibits no catastrophic forgetting issues and requires only the new classes' training images to be present. This enables a database of classes that grows over time and can be visually indexed and reasoned over.
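As a rough illustration of the idea of independent per-class probabilistic representations, the sketch below fits one Gaussian Mixture Model per class over pre-extracted feature vectors and classifies by maximum log-likelihood. It uses scikit-learn's GaussianMixture; the number of components, the diagonal covariances, and the feature dimensionality are assumptions, not the paper's configuration.

```python
# Hypothetical sketch: one independent GMM per class over learned features.
import numpy as np
from sklearn.mixture import GaussianMixture

class GMMClassifier:
    def __init__(self, n_components=2):
        self.n_components = n_components
        self.class_models = {}  # one independent GMM per class

    def add_class(self, label, features):
        """Learn a new class from its own features alone -- no old data needed."""
        gmm = GaussianMixture(n_components=self.n_components, covariance_type="diag")
        gmm.fit(features)
        self.class_models[label] = gmm

    def predict(self, features):
        """Assign each sample to the class whose GMM gives the highest log-likelihood."""
        labels = list(self.class_models.keys())
        scores = np.stack([self.class_models[c].score_samples(features) for c in labels])
        return [labels[i] for i in scores.argmax(axis=0)]

# Usage: fit each class independently, then classify held-out features.
rng = np.random.default_rng(0)
clf = GMMClassifier()
clf.add_class("cat", rng.normal(0.0, 1.0, size=(50, 16)))
clf.add_class("dog", rng.normal(3.0, 1.0, size=(50, 16)))
print(clf.predict(rng.normal(3.0, 1.0, size=(5, 16))))  # expected: mostly "dog"
```

Because each class owns its own model, adding a class touches no other class's parameters, which is the property the abstract credits for avoiding catastrophic forgetting.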
Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the disruption of previously acquired knowledge, a drawback of deep learning models that goes by the name of catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the modeled data does not undergo a considerable distributional shift in subsequent learning sessions, but whenever we expose such systems to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental: first, it would allow us to build truly intelligent systems showing stability and plasticity; second, it would allow us to avoid the onerous cost of retraining these architectures from scratch on the updated data. In this thesis, we tackle the problem from multiple directions. In a first study, we show that in rehearsal-based techniques (systems that use a memory buffer), the quantity of data stored in the rehearsal buffer is a more important factor than the quality of the data. Second, we present one of the early works on incremental learning with ViT architectures, comparing functional, weight, and attention regularization approaches, and propose a novel, effective asymmetric loss. Finally, we present a study on pretraining and how it affects performance in Continual Learning, raising some questions about the effective progression of the field. We then conclude with some future directions and closing remarks.
In open-world learning, an agent starts with a set of known classes, detects and manages things it does not know, and learns them over time from a non-stationary stream of data. Open-world learning differs from numerous other learning problems, and this paper briefly outlines the key differences between it and related problems, including incremental learning, generalized novelty discovery, and generalized zero-shot learning. The paper formalizes various open-world learning problems, including open-world learning without labels. These open-world problems can be approached by modifying approaches to their known elements. We propose a new framework that enables an agent to combine various modules for novelty detection, novelty characterization, incremental learning, and instance management to learn new classes from a stream of unlabeled data in an unsupervised manner; we survey how to adapt some state-of-the-art techniques to fit the framework, and use them to define seven baselines for performance on the open-world learning without labels problem. We then discuss open-world learning quality and analyze how instance management can be improved. We also discuss some general ambiguity issues that arise in open-world learning without labels.
The continual learning (CL) capability of humans is closely tied to the stability-plasticity dilemma, which describes how humans achieve ongoing learning while preserving learned information. The concept of CL has been present in artificial intelligence (AI) since its inception. This paper presents a comprehensive review of CL. Unlike previous reviews, which mainly focus on the catastrophic forgetting phenomenon in CL, this paper surveys CL from the macroscopic perspective of the stability-plasticity mechanism. Analogous to its biological counterpart, an "intelligent" AI agent should: i) remember previously learned information (information retrospection); ii) continually infer new information (information prospection); and iii) transfer useful information (information transfer), in order to achieve high-level CL. Following this taxonomy, evaluation metrics, algorithms, applications, and some open problems are reviewed. Our main contributions are: i) re-examining CL from the level of artificial general intelligence; ii) providing a detailed and extensive overview of the topic of CL; and iii) presenting some novel ideas on the potential development of CL.
Modularity is a compelling solution for continual learning (CL), the problem of modeling sequences of related tasks. Learning and then composing modules to solve different tasks provides an abstraction to address the principal challenges of CL, including catastrophic forgetting, backward and forward transfer across tasks, and sub-linear model growth. We introduce local module composition (LMC), an approach to modular CL in which each module is provided with a local structural component that estimates the module's relevance to the input. Dynamic module composition is performed based on these local relevance scores. We demonstrate that agnosticity to task identities (IDs) arises from (local) structural learning that is module-specific, as opposed to the task- and/or model-specific learning of previous works, making LMC applicable to more CL settings than prior methods. In addition, LMC tracks statistics of the input distribution and adds new modules when outlier samples are detected. In a first set of experiments, LMC performs favorably compared to existing methods on recent continual transfer-learning benchmarks without requiring task identities. In another study, we show that the locality of structural learning allows LMC to interpolate to related but unseen tasks (OOD), as well as to compose modular networks trained independently on different task sequences into a third network without any fine-tuning. Finally, in search of the limitations of LMC, we study it on larger sequences of 30 and 100 tasks, demonstrating that local module selection becomes more challenging in the presence of a large number of candidate modules. In this setting, the best-performing LMC spawns far fewer modules than an oracle-based baseline, but reaches lower overall accuracy. The codebase is available at https://github.com/oleksost/lmc.
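A minimal sketch of the local composition idea, assuming each module pairs a processing network with a small local autoencoder whose reconstruction error serves as an (inverse) relevance score; module outputs are then mixed by softmax-normalized relevance. The layer sizes and the softmax weighting are illustrative assumptions, not LMC's exact design.

```python
# Hypothetical sketch of relevance-weighted module composition.
import torch
import torch.nn as nn

class LocalModule(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        # Local structural component: scores how "familiar" an input is to this module.
        self.scorer = nn.Sequential(nn.Linear(dim, 8), nn.ReLU(), nn.Linear(8, dim))

    def relevance(self, x):
        return -((self.scorer(x) - x) ** 2).mean(dim=-1)  # low error -> high relevance

class ComposedLayer(nn.Module):
    def __init__(self, n_modules=3, dim=32):
        super().__init__()
        self.modules_list = nn.ModuleList([LocalModule(dim) for _ in range(n_modules)])

    def forward(self, x):
        # Dynamic composition: weight each module's output by its local relevance.
        weights = torch.softmax(
            torch.stack([m.relevance(x) for m in self.modules_list], dim=-1), dim=-1)
        outputs = torch.stack([m.body(x) for m in self.modules_list], dim=-1)
        return (outputs * weights.unsqueeze(1)).sum(dim=-1)

layer = ComposedLayer()
print(layer(torch.randn(4, 32)).shape)  # torch.Size([4, 32])
```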
Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks. Common techniques to handle CF include regularization of the weights (using, e.g., their importance on past tasks), and rehearsal strategies, where the network is constantly re-trained on past data. Generative models have also been applied for the latter, in order to have endless sources of data. In this paper, we propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches. Our generative model consists of a normalizing flow (NF), a probabilistic and invertible neural network, trained on the internal embeddings of the network. By keeping a single NF throughout the training process, we show that our memory overhead remains constant. In addition, exploiting the invertibility of the NF, we propose a simple approach to regularize the network's embeddings with respect to past tasks. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, with bounded computational power and memory overheads.
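A minimal sketch of the rehearsal side of this idea, assuming a single RealNVP-style affine-coupling flow trained by maximum likelihood on (synthetic stand-in) embeddings under a standard-normal base density, then inverted to sample embeddings for replay. The paper's exact flow architecture and its embedding regularizer are not reproduced here.

```python
# Hypothetical sketch: a tiny normalizing flow over embeddings for rehearsal.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim))  # predicts scale and shift

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)  # stabilize scales
        return torch.cat([x1, x2 * torch.exp(s) + t], dim=-1), s.sum(dim=-1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        s, t = self.net(y1).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)

class Flow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.c1, self.c2 = AffineCoupling(dim), AffineCoupling(dim)

    def forward(self, x):
        x, ld1 = self.c1(x)
        x = torch.flip(x, dims=[-1])  # so the second coupling transforms the other half
        x, ld2 = self.c2(x)
        return x, ld1 + ld2

    def inverse(self, z):
        return self.c1.inverse(torch.flip(self.c2.inverse(z), dims=[-1]))

dim = 32
flow = Flow(dim)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
embeddings = torch.randn(256, dim) + 2.0  # stand-in for the network's embeddings

for _ in range(200):  # maximize likelihood under a standard-normal base density
    z, logdet = flow(embeddings)
    loss = (0.5 * (z ** 2).sum(dim=-1) - logdet).mean()
    opt.zero_grad(); loss.backward(); opt.step()

replayed = flow.inverse(torch.randn(16, dim))  # sample embeddings for rehearsal
print(replayed.mean().item())  # should drift toward the data mean (~2.0) as training converges
```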
The human brain is capable of learning tasks sequentially without forgetting. However, deep neural networks (DNNs) suffer from catastrophic forgetting when learning one task after another. We address this challenge in a class-incremental learning scenario, in which the DNN sees test data without knowing the task from which that data originates. During training, Continual Prune-and-Select (CP&S) finds a subnetwork within the DNN that is responsible for solving a given task. Then, during inference, CP&S selects the correct subnetwork to make predictions for that task. A new task is learned by creating a new subnetwork via pruning, training the DNN's available (previously untrained) neuronal connections; this subnetwork can include connections belonging to other previously trained subnetwork(s), because shared connections are not updated. This eliminates catastrophic forgetting by creating specialized regions in the DNN that do not conflict with each other, while still allowing knowledge transfer across them. The CP&S strategy is implemented with different subnetwork selection strategies, revealing performance superior to state-of-the-art continual learning methods on various datasets (CIFAR-100, CUB-200-2011, ImageNet-100 and ImageNet-1000). In particular, CP&S is capable of sequentially learning 10 tasks from ImageNet-1000, keeping an accuracy of around 94% with negligible forgetting, a first-of-its-kind result in class-incremental learning. To the best of the authors' knowledge, this represents an improvement in accuracy of more than 20% compared to the best alternative method.
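A schematic sketch of per-task subnetworks over shared weights, in the spirit of CP&S: each task records a binary mask, gradients of weights claimed by earlier tasks are zeroed so shared connections are never updated, and inference applies the selected task's mask. Mask selection is simplified here to magnitude-based top-k over free weights; the actual method uses iterative pruning and lets a new subnetwork reuse previously trained connections.

```python
# Hypothetical, simplified sketch of masked subnetworks in one shared layer.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim_out, dim_in) * 0.1)
        self.task_masks = {}  # task id -> binary mask over the weights
        self.claimed = torch.zeros(dim_out, dim_in, dtype=torch.bool)

    def forward(self, x, task_id):
        # Inference through the selected task's subnetwork only.
        return x @ (self.weight * self.task_masks[task_id]).t()

    def freeze_claimed_grads(self):
        """Call after backward(): zero gradients of all already-claimed weights,
        so connections owned by earlier tasks are never updated."""
        if self.weight.grad is not None:
            self.weight.grad[self.claimed] = 0.0

    def claim_mask(self, task_id, keep_ratio=0.5):
        """Assign the largest-magnitude free weights to this task (simplified pruning)."""
        free = ~self.claimed
        scores = self.weight.detach().abs() * free
        k = int(keep_ratio * free.sum().item())
        thresh = scores.flatten().topk(k).values.min()
        mask = (scores >= thresh) & free
        self.task_masks[task_id] = mask.float()
        self.claimed |= mask

layer = MaskedLinear(16, 8)
layer.claim_mask(task_id=0)                 # task 0 claims half of the free weights
out = layer(torch.randn(4, 16), task_id=0)  # predict through task 0's subnetwork
out.sum().backward()
layer.freeze_claimed_grads()                # protects claimed weights during later tasks
```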
Humans and animals have the ability to continually acquire, fine-tune, and transfer knowledge and skills throughout their lifespan. This ability, referred to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms that together contribute to the development and specialization of our sensorimotor skills as well as to long-term memory consolidation and retrieval. Consequently, lifelong learning capabilities are crucial for computational systems and autonomous agents interacting in the real world and processing continuous streams of information. However, lifelong learning remains a long-standing challenge for machine learning and neural network models since the continual acquisition of incrementally available information from non-stationary data distributions generally leads to catastrophic forgetting or interference. This limitation represents a major drawback for state-of-the-art deep neural network models that typically learn representations from stationary batches of training data, thus without accounting for situations in which information becomes incrementally available over time. In this review, we critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting. Although significant advances have been made in domain-specific learning with neural networks, extensive research efforts are required for the development of robust lifelong learning on autonomous agents and robots. We discuss well-established and emerging research motivated by lifelong learning factors in biological systems such as structural plasticity, memory replay, curriculum and transfer learning, intrinsic motivation, and multisensory integration.
Although deep learning approaches have stood out in recent years due to their state-of-the-art results, they continue to suffer from catastrophic forgetting, a dramatic decrease in overall performance when training with new classes added incrementally. This is due to current neural network architectures requiring the entire dataset, consisting of all the samples from the old as well as the new classes, to update the model, a requirement that becomes easily unsustainable as the number of classes grows. We address this issue with our approach to learn deep neural networks incrementally, using new data and only a small exemplar set corresponding to samples from the old classes. This is based on a loss composed of a distillation measure to retain the knowledge acquired from the old classes, and a cross-entropy loss to learn the new classes. Our incremental training is achieved while keeping the entire framework end-to-end, i.e., learning the data representation and the classifier jointly, unlike recent methods with no such guarantees. We evaluate our method extensively on the CIFAR-100 and ImageNet (ILSVRC 2012) image classification datasets, and show state-of-the-art performance.
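A hedged sketch of the described objective: standard cross-entropy over all current classes plus a distillation term that keeps the old-class outputs close to those of the frozen previous model. The temperature and loss weighting below are assumptions.

```python
# Hypothetical sketch: cross-entropy plus knowledge distillation for incremental learning.
import torch
import torch.nn.functional as F

def incremental_loss(new_logits, old_logits_prev_model, labels, n_old, T=2.0, lam=1.0):
    """new_logits: (B, n_old + n_new) from the current model.
    old_logits_prev_model: (B, n_old) recorded from the frozen previous model."""
    ce = F.cross_entropy(new_logits, labels)  # learn new classes (and exemplars)
    # Distill: match the softened old-class probabilities of the previous model.
    p_old = F.softmax(old_logits_prev_model / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :n_old] / T, dim=1)
    distill = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
    return ce + lam * distill

# Usage with dummy tensors: 5 old classes, 3 new ones.
logits = torch.randn(8, 8, requires_grad=True)
prev = torch.randn(8, 5)
labels = torch.randint(0, 8, (8,))
incremental_loss(logits, prev, labels, n_old=5).backward()
```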
Malicious software (malware) classification poses a unique challenge for continual learning (CL) regimes, due both to the volume of new samples received daily and to the evolution of malware to exploit new vulnerabilities. On a typical day, antivirus vendors receive hundreds of thousands of unique pieces of software, both malicious and benign, and over the lifetime of a malware classifier more than a billion samples can easily accumulate. Given the scale of the problem, sequential training using continual learning techniques could offer substantial benefits by reducing training and storage overhead. To date, however, there has been no exploration of CL applied to malware classification tasks. In this paper, we study 11 CL techniques applied to three malware tasks covering common incremental learning scenarios, including task, class, and domain incremental learning (IL). Specifically, using two realistic, large-scale malware datasets, we evaluate the performance of the CL methods on binary malware classification (Domain-IL) and multi-class malware family classification (Task-IL and Class-IL) tasks. To our surprise, the continual learning methods significantly underperformed naive joint replay of the training data in nearly all settings, in some cases reducing accuracy by more than 70 percentage points. A simple approach of selectively replaying 20% of the stored data achieves better performance than joint replay at 50% of the training time. Finally, we discuss potential reasons for the unexpectedly poor performance of the CL techniques, in the hope that it spurs further research on techniques that are more effective in the malware classification domain.
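An illustrative sketch of the selective-replay baseline mentioned above: retain a fixed fraction of each past task's data and mix it into every new training session. Uniform random selection is used here for simplicity; the study's exact selection criterion may differ.

```python
# Hypothetical sketch of a selective replay buffer keeping 20% of past data.
import random

class SelectiveReplayBuffer:
    def __init__(self, keep_fraction=0.2):
        self.keep_fraction = keep_fraction
        self.buffer = []  # (sample, label) pairs from past tasks

    def add_task_data(self, data):
        # Keep a random fraction of this task's data for future replay.
        k = int(len(data) * self.keep_fraction)
        self.buffer.extend(random.sample(data, k))

    def training_set(self, new_data):
        """The new task's data plus the retained slice of everything seen before."""
        return new_data + self.buffer

buf = SelectiveReplayBuffer(keep_fraction=0.2)
task1 = [(f"sample{i}", "benign" if i % 2 else "malware") for i in range(100)]
buf.add_task_data(task1)
task2 = [(f"new{i}", "malware") for i in range(50)]
print(len(buf.training_set(task2)))  # 50 new + 20 replayed = 70
```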
We study the new task of class-incremental Novel Class Discovery (class-iNCD), which refers to the problem of discovering novel categories in an unlabeled dataset by leveraging a pre-trained model that has been trained on a labeled dataset containing disjoint yet related categories. Apart from discovering novel classes, we also aim to maintain the model's ability to recognize previously seen base categories. Inspired by rehearsal-based incremental learning methods, in this paper we propose a novel approach to prevent forgetting of past information about the base classes by jointly exploiting base-class feature prototypes and feature-level knowledge distillation. We also propose a self-training clustering strategy that simultaneously clusters the novel categories and trains a joint classifier for both the base and novel classes. This enables our method to operate in the class-incremental setting. Our experiments, conducted on three common benchmarks, demonstrate that our method significantly outperforms state-of-the-art approaches. Code is available at https://github.com/OatmealLiu/class-iNCD
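A hedged sketch of the two anti-forgetting ingredients named in the abstract: base-class feature prototypes used as a distance-based classifier, and a feature-level distillation term against the frozen pre-trained model. The MSE form and the prototype-distance logits are assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: prototypes plus feature-level distillation.
import torch
import torch.nn.functional as F

def feature_kd_loss(feats_new, feats_frozen):
    """Keep current features close to the frozen base model's features."""
    return F.mse_loss(feats_new, feats_frozen)

def prototype_logits(feats, prototypes):
    """Score base classes by (negative) distance to their stored prototypes."""
    return -torch.cdist(feats, prototypes)  # (B, n_base)

feats = torch.randn(8, 64, requires_grad=True)   # current model's features
frozen = torch.randn(8, 64)                      # frozen pre-trained model's features
protos = torch.randn(10, 64)                     # one prototype per base class
loss = feature_kd_loss(feats, frozen)
print(prototype_logits(feats, protos).shape, loss.item())
```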
Graph learning is a popular approach for performing machine learning on graph-structured data. It has revolutionized machine learning's ability to model graph data for downstream tasks. Its applications are wide-ranging, owing to the availability of graph data from all types of networks and information systems. Most graph learning methods assume that the graph is static and that its complete structure is known during training. This limits their applicability, since they cannot be applied to problems where the underlying graph grows over time and/or new tasks emerge incrementally. Such applications require a lifelong learning approach that can learn the graph continuously and accommodate new information whilst retaining previously learned knowledge. Lifelong learning methods that enable continuous learning in regular domains like images and text cannot be directly applied to continuously evolving graph data, due to its irregular structure. As a result, graph lifelong learning is gaining attention from the research community. This survey paper provides a comprehensive overview of recent advancements in graph lifelong learning, including the categorization of existing methods and discussions of potential applications and open research problems.
Continual learning approaches help deep neural network models adapt and learn incrementally by attempting to solve catastrophic forgetting. However, whether these existing approaches, traditionally applied to image-based tasks, have the same efficacy on the sequential time-series data generated by mobile or embedded sensing systems remains an unanswered question. To address this gap, we conduct the first comprehensive empirical study quantifying the performance of three predominant continual learning schemes (i.e., regularization, replay, and replay with exemplars) on six datasets from three mobile and embedded sensing applications with different learning complexities. More specifically, we implement an end-to-end continual learning framework on edge devices. We then investigate the generalizability of, and the trade-offs between, the performance, storage, computational cost, and memory footprint of different continual learning methods. Our findings suggest that replay with exemplar schemes such as iCaRL has the best performance trade-off, even in complex scenarios, at the expense of some storage space (a few MB) for training exemplars (1% to 5%). We also demonstrate for the first time that continual learning with a limited memory budget is feasible and practical. In particular, latencies on two types of mobile and embedded devices indicate that both incremental learning times (a few seconds to 4 minutes) and training times (1 to 75 minutes) are acceptable, since training can take place on-device while the embedded device is charging, thereby ensuring complete data privacy. Finally, we offer some guidelines for practitioners who wish to apply continual learning paradigms to mobile sensing tasks.
Machine learning models often encounter samples that differ from the training distribution. Failure to recognize an out-of-distribution (OOD) sample, and consequently assigning it to a class label, significantly compromises the reliability of a model. The problem has gained significant attention due to its importance for the safe deployment of models in open-world settings. Detecting OOD samples is challenging due to the intractability of modeling all possible unknown distributions. To date, several research domains have tackled the problem of detecting unfamiliar samples, including anomaly detection, novelty detection, one-class learning, open set recognition, and out-of-distribution detection. Despite sharing similar concepts, out-of-distribution, open-set, and anomaly detection have been investigated independently. Accordingly, these research avenues have not cross-pollinated, creating research barriers. While some surveys intend to provide an overview of these approaches, they seem to focus only on a specific domain without examining the relationships between different domains. This survey aims to provide a cross-domain, comprehensive review of numerous eminent works in the respective areas while identifying their commonalities. Researchers can benefit from the overview of research advances in different fields and develop future methodologies synergistically. Furthermore, to the best of our knowledge, while there are surveys on anomaly detection or one-class learning, there is no comprehensive or up-to-date survey on out-of-distribution detection, which our survey covers extensively. Finally, having a unified cross-domain perspective, we discuss and shed light on future lines of research, intending to bring these fields closer together.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.
The International Workshop on Reading Music Systems (WoRMS) is a workshop that tries to connect researchers who develop systems for reading music, such as in the field of Optical Music Recognition, with other researchers and practitioners that could benefit from such systems, like librarians or musicologists. The relevant topics of interest for the workshop include, but are not limited to: Music reading systems; Optical music recognition; Datasets and performance evaluation; Image processing on music scores; Writer identification; Authoring, editing, storing and presentation systems for music scores; Multi-modal systems; Novel input-methods for music to produce written music; Web-based Music Information Retrieval services; Applications and projects; Use-cases related to written music. These are the proceedings of the 3rd International Workshop on Reading Music Systems, held in Alicante on the 23rd of July 2021.
Anomaly Detection is a relevant problem that arises in numerous real-world applications, especially when dealing with images. However, there has been little research on this task in the Continual Learning setting. In this work, we introduce a novel approach called SCALE (SCALing is Enough) to perform Compressed Replay within a framework for Anomaly Detection in the Continual Learning setting. The proposed technique scales and compresses the original images using a Super Resolution model which, to the best of our knowledge, is studied for the first time in the Continual Learning setting. SCALE can achieve a high level of compression while maintaining a high level of image reconstruction quality. In conjunction with other Anomaly Detection approaches, it can achieve optimal results. To validate the proposed approach, we use a real-world dataset of images with pixel-based anomalies, with the aim of providing a reliable benchmark for Anomaly Detection in the context of Continual Learning, serving as a foundation for further advancements in the field.
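A schematic sketch of compressed replay by scaling, using bicubic interpolation as a stand-in for SCALE's learned Super Resolution model; the scale factor and image sizes are assumptions.

```python
# Hypothetical sketch: store low-resolution copies, upscale them for replay.
import torch
import torch.nn.functional as F

def compress(images, factor=4):
    """Store low-resolution copies: memory shrinks by roughly factor**2."""
    return F.interpolate(images, scale_factor=1 / factor, mode="bicubic",
                         align_corners=False)

def decompress(lowres, factor=4):
    """Reconstruct for replay; SCALE would use a trained Super Resolution network here."""
    return F.interpolate(lowres, scale_factor=factor, mode="bicubic",
                         align_corners=False)

batch = torch.rand(8, 3, 64, 64)
stored = compress(batch)      # (8, 3, 16, 16): ~16x fewer pixels to keep in the buffer
replayed = decompress(stored) # (8, 3, 64, 64), usable for rehearsal
print(stored.shape, replayed.shape)
```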
In this paper we introduce a model of lifelong learning, based on a Network of Experts. New tasks / experts are learned and added to the model sequentially, building on what was learned before. To ensure scalability of this process, data from previous tasks cannot be stored and hence is not available when learning a new task. A critical issue in such a context, not addressed in the literature so far, relates to the decision of which expert to deploy at test time. We introduce a set of gating autoencoders that learn a representation for the task at hand and, at test time, automatically forward the test sample to the relevant expert. This also brings memory efficiency, as only one expert network has to be loaded into memory at any given time. Further, the autoencoders inherently capture the relatedness of one task to another, based on which the most relevant prior model to be used for training a new expert, with fine-tuning or learning-without-forgetting, can be selected. We evaluate our method on image classification and video prediction problems.
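A minimal sketch of routing with gating autoencoders, assuming one small undercomplete autoencoder per task: the test sample is forwarded to the expert whose autoencoder reconstructs it with the lowest error. The sizes below are illustrative.

```python
# Hypothetical sketch: reconstruction-error gating over per-task autoencoders.
import torch
import torch.nn as nn

class GateAE(nn.Module):
    def __init__(self, dim=128, code=16):
        super().__init__()
        self.enc = nn.Linear(dim, code)
        self.dec = nn.Linear(code, dim)

    def error(self, x):
        # Per-sample reconstruction error: low for inputs resembling this task.
        return ((self.dec(torch.relu(self.enc(x))) - x) ** 2).mean(dim=-1)

def select_expert(x, gates):
    """Route each sample to the task whose autoencoder knows it best."""
    errors = torch.stack([g.error(x) for g in gates], dim=-1)  # (B, n_tasks)
    return errors.argmin(dim=-1)

gates = [GateAE() for _ in range(3)]  # one trained gate per task/expert
x = torch.randn(5, 128)
print(select_expert(x, gates))       # indices of the experts to load for these samples
```

Only the selected expert then needs to be loaded into memory, which is the memory-efficiency property the abstract highlights.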
A notable shortcoming of machine learning is the limited ability of models to solve new problems more quickly without forgetting acquired knowledge. To better understand this issue, continual learning has emerged to systematically investigate learning protocols where the model sequentially observes samples generated by a series of tasks. First, we propose an optimality principle that facilitates a trade-off between learning and forgetting. We derive this principle from an information-theoretic formulation of bounded rationality and show its connections to other continual learning methods. Second, based on this principle, we propose a neural network layer for continual learning, called Mixture of Variational Experts (MoVE), that alleviates forgetting while enabling the beneficial transfer of knowledge to new tasks. Our experiments on variants of the MNIST and CIFAR10 datasets demonstrate the competitive performance of MoVE layers when compared to state-of-the-art approaches.
Class-incremental learning requires plasticity and stability in order to learn from new data while preserving past knowledge. Due to catastrophic forgetting, finding a compromise between these two properties is particularly challenging when no memory buffer is available. Mainstream methods need to store two deep models, since they integrate new classes using fine-tuning with knowledge distillation from the previous incremental state. We propose a method with a similar number of parameters but distributed differently, in order to find a better balance between plasticity and stability. Following the approach already deployed by transfer-based incremental methods, we freeze the feature extractor after the initial state. Classes from the oldest incremental states are trained on this frozen extractor to ensure stability. Recent classes are predicted using a partially fine-tuned model to introduce plasticity. Our proposed plasticity layer can be incorporated into any transfer-based method designed for memory-free incremental learning, and we apply it to two such methods. The evaluation is performed with three large-scale datasets. Results show that performance gains are obtained in all tested configurations compared to existing methods.
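A hedged sketch of the frozen-extractor recipe described above: freeze the backbone after the initial state for stability, and fine-tune only the top block plus a new classification head for plasticity. The choice of ResNet-18 and of the last residual block as the split point are assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch: frozen feature extractor with a partially fine-tuned top block.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None)
for p in model.parameters():
    p.requires_grad = False  # stability: freeze the feature extractor after the initial state

for p in model.layer4.parameters():
    p.requires_grad = True   # plasticity: partially fine-tune only the top block
model.fc = torch.nn.Linear(model.fc.in_features, 100)  # new incremental head (trainable)

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)
print(sum(p.numel() for p in trainable), "trainable parameters")
```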