We present the One-Pass ImageNet (OPIN) problem, which aims to study the effectiveness of deep learning in a streaming setting. ImageNet is a widely known benchmark dataset that has helped drive and evaluate recent advances in deep learning. Typically, deep learning methods are trained on static data to which the model has random access, using multiple passes over the dataset with a random shuffle at each epoch of training. This data-access assumption does not hold in many real-world scenarios, where massive data is collected from a stream and storing and accessing all of it becomes impractical due to storage costs and privacy concerns. In this setting, we treat the ImageNet data as arriving sequentially, with a limited memory budget for storing a small subset of the data. We observe that training a deep network in a single pass results in a large drop in prediction accuracy compared to multi-epoch training. We show that this performance gap can be significantly reduced by paying a small memory cost and leveraging techniques developed for continual learning, even though the setting differs from typical continual learning problems. We propose to use this problem to study resource-efficient deep learning.
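A minimal sketch of the setup described above (single-pass training over a stream with a small replay memory maintained by reservoir sampling) is given below. The model, buffer size, and the way stream and replay batches are mixed are illustrative assumptions, not the authors' exact protocol.

```python
import random
import torch
import torch.nn.functional as F

def train_one_pass(model, stream, optimizer, buffer_size=1000, replay_batch=32):
    """Single pass over a data stream with a small reservoir-sampled replay buffer."""
    buffer = []   # holds (x, y) pairs, at most buffer_size of them
    seen = 0      # number of stream examples observed so far

    for x, y in stream:  # each (x, y) is a mini-batch seen exactly once
        # Mix the incoming batch with a random draw from the replay buffer.
        if buffer:
            rx, ry = zip(*random.sample(buffer, min(replay_batch, len(buffer))))
            x_in, y_in = torch.cat([x, torch.stack(rx)]), torch.cat([y, torch.stack(ry)])
        else:
            x_in, y_in = x, y

        optimizer.zero_grad()
        F.cross_entropy(model(x_in), y_in).backward()
        optimizer.step()

        # Reservoir sampling keeps a uniform subset of everything seen so far.
        for xi, yi in zip(x, y):
            seen += 1
            if len(buffer) < buffer_size:
                buffer.append((xi, yi))
            else:
                j = random.randint(0, seen - 1)
                if j < buffer_size:
                    buffer[j] = (xi, yi)
```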
Continual learning (CL) aims to develop techniques by which a single model adapts to an increasing number of tasks, thereby potentially leveraging learning across tasks in a resource-efficient manner. A major challenge for CL systems is catastrophic forgetting, where earlier tasks are forgotten while learning a new task. To address this problem, replay-based CL approaches maintain and repeatedly train on a small buffer of data selected from the tasks encountered so far. We propose Gradient Coreset Replay (GCR), a novel strategy for replay buffer selection and update using a carefully designed optimization criterion. Specifically, we select and maintain a 'coreset' that closely approximates the gradient of all the data seen so far with respect to the current model parameters, and discuss the key strategies needed to apply it effectively in the continual learning setting. We show significant gains (2%-4%) over the state of the art in the well-studied offline continual learning setting. Our findings also transfer effectively to the online/streaming CL setting, where we show gains of 5% over existing approaches. Finally, we demonstrate the value of a supervised contrastive loss for continual learning, which yields a cumulative gain of up to 5% when combined with our subset selection strategy.
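The coreset idea (keep samples whose gradients approximate the gradient of all data seen so far) can be illustrated with the greedy matching sketch below, which operates on flattened per-sample gradients. GCR's actual criterion is a weighted, regularized optimization with additional strategies, so treat this as intuition rather than the paper's algorithm.

```python
import numpy as np

def select_gradient_coreset(per_sample_grads, k):
    """Greedily pick k samples whose mean gradient best matches the mean gradient
    of the whole candidate pool. per_sample_grads: (N, D) array of flattened
    per-sample gradients. Returns the selected indices."""
    target = per_sample_grads.mean(axis=0)      # gradient of the full candidate set
    selected, residual = [], target.copy()
    for _ in range(k):
        scores = per_sample_grads @ residual    # which sample best explains the residual
        scores[selected] = -np.inf              # never pick the same sample twice
        selected.append(int(np.argmax(scores)))
        # Re-approximate the target with the mean gradient of the selection
        # (a simplified, unweighted matching step).
        residual = target - per_sample_grads[selected].mean(axis=0)
    return selected
```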
A continual learning agent learns online with a non-stationary and never-ending stream of data. The key to such learning process is to overcome the catastrophic forgetting of previously seen data, which is a well known problem of neural networks. To prevent forgetting, a replay buffer is usually employed to store the previous data for the purpose of rehearsal. Previous works often depend on task boundary and i.i.d. assumptions to properly select samples for the replay buffer. In this work, we formulate sample selection as a constraint reduction problem based on the constrained optimization view of continual learning. The goal is to select a fixed subset of constraints that best approximate the feasible region defined by the original constraints. We show that it is equivalent to maximizing the diversity of samples in the replay buffer with parameters gradient as the feature. We further develop a greedy alternative that is cheap and efficient. The advantage of the proposed method is demonstrated by comparing to other alternatives under the continual learning setting. Further comparisons are made against state of the art methods that rely on task boundaries which show comparable or even better results for our method.
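The stated equivalence to maximizing gradient diversity suggests a simple greedy surrogate, sketched below as farthest-point selection over normalized per-sample gradients; this captures the intuition but is not the paper's exact constraint-reduction formulation.

```python
import numpy as np

def diverse_buffer_selection(grads, k):
    """Greedy farthest-point selection of k samples, using each sample's normalized
    parameter gradient as its feature, so the buffer covers diverse gradient
    directions. grads: (N, D) array. Returns the selected indices."""
    g = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12)
    selected = [0]                     # seed with an arbitrary first sample
    max_sim = g @ g[0]                 # each candidate's max similarity to the buffer
    for _ in range(k - 1):
        i = int(np.argmin(max_sim))    # candidate most dissimilar to the current buffer
        selected.append(i)
        max_sim = np.maximum(max_sim, g @ g[i])
    return selected
```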
Online continual learning is a challenging learning scenario in which the model must learn from a non-stationary stream of data where each sample is seen only once. The main challenge is to incrementally learn while avoiding catastrophic forgetting, namely the problem of forgetting previously acquired knowledge while learning from new data. A popular solution in this setting is to use a small memory to retain old data and rehearse it over time. Unfortunately, due to the limited memory size, the quality of the memory deteriorates over time. In this paper we propose OLCGM, a novel replay-based continual learning strategy that uses knowledge condensation techniques to continuously compress the memory and make better use of its limited size. The sample condensation step compresses old samples instead of removing them as other replay strategies do. As a result, experiments show that, whenever the memory budget is limited relative to the complexity of the data, OLCGM improves the final accuracy compared to state-of-the-art replay strategies.
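A simplified view of the sample-condensation step is sketched below: a few synthetic samples are optimized so that their training gradient mimics that of a group of old (assumed same-class) memory samples. The number of synthetic samples, the step counts, and the cosine matching loss are assumptions for illustration, not OLCGM's exact procedure.

```python
import torch
import torch.nn.functional as F

def condense(model, real_x, real_y, n_synthetic=2, steps=100, lr=0.1):
    """Compress a group of memory samples into fewer synthetic ones by matching
    the gradient they induce on the current model (simplified gradient matching)."""
    syn_x = real_x[:n_synthetic].clone().requires_grad_(True)  # init from real samples
    syn_y = real_y[:n_synthetic].clone()
    opt = torch.optim.SGD([syn_x], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]

    # Gradient of the original group, computed once for the frozen model.
    g_real = torch.autograd.grad(F.cross_entropy(model(real_x), real_y), params)

    for _ in range(steps):
        g_syn = torch.autograd.grad(F.cross_entropy(model(syn_x), syn_y),
                                    params, create_graph=True)
        # Layer-wise cosine distance between synthetic and real gradients.
        loss = sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
                   for a, b in zip(g_syn, g_real))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return syn_x.detach(), syn_y
```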
Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
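A heavily simplified version of one MER update is sketched below: sequential SGD over replayed examples plus the new one, followed by a Reptile-style interpolation back toward the pre-update weights, which is what encourages gradient alignment. The learning rates and single-batch structure are assumptions; the full algorithm nests further within-batch meta-updates and maintains the buffer by reservoir sampling.

```python
import copy
import random
import torch
import torch.nn.functional as F

def mer_step(model, x_new, y_new, buffer, lr=0.03, gamma=0.3, batch_size=10):
    """One simplified Meta-Experience Replay update on a single new example."""
    before = copy.deepcopy(model.state_dict())   # theta_A (pre-update weights)

    # Inner loop: plain SGD over replayed examples plus the new one.
    samples = random.sample(buffer, min(batch_size - 1, len(buffer))) + [(x_new, y_new)]
    for x, y in samples:
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p in model.parameters():
                if p.grad is not None:
                    p -= lr * p.grad

    # Reptile meta-update: theta <- theta_A + gamma * (theta_W - theta_A).
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.copy_(before[name] + gamma * (p - before[name]))
    # (Insertion of (x_new, y_new) into the buffer, e.g. by reservoir sampling, is done elsewhere.)
```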
New applications such as home robots, user personalization on smartphones, and augmented/virtual reality headsets require real-time continual learning. However, this setting poses unique challenges: embedded devices have limited memory and compute capability, and conventional machine learning models suffer from catastrophic forgetting when updated on non-stationary data streams. While several online continual learning models have been developed, their effectiveness for embedded applications has not been rigorously studied. In this paper, we first identify the criteria that online continual learners must meet to effectively perform real-time, on-device learning. We then study the efficacy of several online continual learning methods when used with mobile neural networks. We measure their performance, memory usage, compute requirements, and ability to generalize to out-of-distribution inputs.
A growing body of research in continual learning focuses on the catastrophic forgetting problem. While many attempts have been made to alleviate this problem, the majority of the methods assume a single model in the continual learning setup. In this work, we question this assumption and show that employing ensemble models can be a simple yet effective method to improve continual performance. However, ensembles' training and inference costs can increase significantly as the number of models grows. Motivated by this limitation, we study different ensemble models to understand their benefits and drawbacks in continual learning scenarios. Finally, to overcome the high compute cost of ensembles, we leverage recent advances in neural network subspace to propose a computationally cheap algorithm with similar runtime to a single model yet enjoying the performance benefits of ensembles.
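One inexpensive instantiation of the subspace idea is to keep two endpoint networks and evaluate a single point on the line segment between their weights; the midpoint sketch below is an illustrative stand-in, not the specific algorithm proposed in the paper, and it assumes both models share the same architecture.

```python
import copy
import torch

def subspace_midpoint(model_a, model_b):
    """Build a single evaluation model from the midpoint of the weight-space line
    segment between two trained endpoint networks."""
    mid = copy.deepcopy(model_a)
    with torch.no_grad():
        for p_mid, p_a, p_b in zip(mid.parameters(),
                                   model_a.parameters(), model_b.parameters()):
            p_mid.copy_(0.5 * (p_a + p_b))
    return mid
```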
Continual Learning is a step towards lifelong intelligence where models continuously learn from recently collected data without forgetting previous knowledge. Existing continual learning approaches mostly focus on image classification in the class-incremental setup with clear task boundaries and unlimited computational budget. This work explores Online Domain-Incremental Continual Segmentation~(ODICS), a real-world problem that arises in many applications, \eg, autonomous driving. In ODICS, the model is continually presented with batches of densely labeled images from different domains; computation is limited and no information about the task boundaries is available. In autonomous driving, this may correspond to the realistic scenario of training a segmentation model over time on a sequence of cities. We analyze several existing continual learning methods and show that they do not perform well in this setting despite working well in class-incremental segmentation. We propose SimCS, a parameter-free method complementary to existing ones that leverages simulated data as a continual learning regularizer. Extensive experiments show consistent improvements over different types of continual learning methods that use regularizers and even replay.
Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning -- a setting where not all the data samples are labeled. An underlying issue in this scenario is the model forgetting representations of unlabeled data and overfitting the labeled ones. We leverage the power of nearest-neighbor classifiers to non-linearly partition the feature space and learn a strong representation for the current task, as well as distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a strong state of the art on the continual semi-supervised learning paradigm. For example, on CIFAR100 we surpass several others even when using at least 30 times less supervision (0.8% vs. 25% of annotations).
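A minimal sketch of the nearest-neighbor classification component (predict the label of the closest stored feature) is shown below; the encoder interface, the storage format, and the Euclidean metric are assumptions, and the paper's full method builds considerably more on top of this.

```python
import torch

class FeatureNNClassifier:
    """Non-parametric nearest-neighbor classifier over an encoder's feature space."""
    def __init__(self, encoder):
        self.encoder = encoder
        self.keys, self.labels = [], []

    @torch.no_grad()
    def add(self, x, y):
        """Store features and labels of a labeled batch."""
        self.keys.append(self.encoder(x))
        self.labels.append(y)

    @torch.no_grad()
    def predict(self, x):
        keys = torch.cat(self.keys)        # (N, D) stored features
        labels = torch.cat(self.labels)    # (N,) stored labels
        q = self.encoder(x)                # (B, D) query features
        dist = torch.cdist(q, keys)        # pairwise Euclidean distances
        return labels[dist.argmin(dim=1)]  # label of each query's nearest neighbor
```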
According to the Complementary Learning Systems (CLS) theory~\cite{mcclelland1995there} in neuroscience, humans achieve effective \emph{continual learning} through two complementary systems: a fast learning system, centered on the hippocampus, for rapid learning of the specifics of individual experiences, and a slow learning system, located in the neocortex, for the gradual acquisition of structured knowledge about the environment. Motivated by this theory, we propose \emph{DualNets} (for Dual Networks), a general continual learning framework comprising a fast learning system for supervised learning of pattern-separated representations from specific tasks and a slow learning system for representation learning of task-agnostic general representations via self-supervised learning (SSL). DualNets can seamlessly incorporate both representation types into a holistic framework to facilitate better continual learning in deep neural networks. Via extensive experiments, we demonstrate the promising results of DualNets on a wide range of continual learning protocols, from the standard offline, task-aware setting to the challenging online, task-free scenario. Notably, DualNets achieves competitive results on the CTrL benchmark~\cite{veniat2020} when compared with the approach of \cite{ostapenko2021-continual}. Furthermore, we conduct comprehensive ablation studies to validate DualNets' efficacy, robustness, and scalability. Code is publicly available at \url{https://github.com/phquang/dualnet}.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.
The ability to dynamically adapt neural networks to newly-available data without performance deterioration would revolutionize deep learning applications. Streaming learning (i.e., learning from one data example at a time) has the potential to enable such real-time adaptation, but current approaches i) freeze a majority of network parameters during streaming and ii) are dependent upon offline, base initialization procedures over large subsets of data, which damages performance and limits applicability. To mitigate these shortcomings, we propose Cold Start Streaming Learning (CSSL), a simple, end-to-end approach for streaming learning with deep networks that uses a combination of replay and data augmentation to avoid catastrophic forgetting. Because CSSL updates all model parameters during streaming, the algorithm is capable of beginning streaming from a random initialization, making base initialization optional. Going further, the algorithm's simplicity allows theoretical convergence guarantees to be derived using analysis of the Neural Tangent Random Feature (NTRF). In experiments, we find that CSSL outperforms existing baselines for streaming learning in experiments on CIFAR100, ImageNet, and Core50 datasets. Additionally, we propose a novel multi-task streaming learning setting and show that CSSL performs favorably in this domain. Put simply, CSSL performs well and demonstrates that the complicated, multi-step training pipelines adopted by most streaming methodologies can be replaced with a simple, end-to-end learning approach without sacrificing performance.
We explore task-free continual learning (CL), in which a model is trained to avoid catastrophic forgetting in the absence of explicit task boundaries or identities. Among the many efforts on task-free CL, a notable family of approaches is memory-based, storing and replaying a subset of training examples. However, since the CL model is continually updated, the utility of stored examples may diminish over time. Here, we propose Gradient-based Memory Editing (GMED), a framework for editing stored examples in continuous input space via gradient updates, so as to create more 'challenging' examples for replay. GMED-edited examples remain similar to their unedited forms, but can yield increased loss in upcoming model updates, making future replays more effective at overcoming catastrophic forgetting. By construction, GMED can be seamlessly applied together with other memory-based CL algorithms for further improvements. Experiments validate the effectiveness of GMED, and our best method significantly outperforms baselines and the previous state of the art on five datasets. Code can be found at https://github.com/ink-usc/gmed.
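The core editing operation can be sketched as a single gradient-ascent step on the stored example's loss, as below; the step size and the plain ascent update are illustrative assumptions, and GMED's full procedure additionally accounts for the effect of the upcoming replay update.

```python
import torch
import torch.nn.functional as F

def gmed_edit(model, x_mem, y_mem, alpha=0.1):
    """Make replayed examples 'harder' for the current model by a small gradient
    ascent step on their loss in input space, keeping them close to the originals."""
    x = x_mem.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_mem)
    grad = torch.autograd.grad(loss, x)[0]
    with torch.no_grad():
        x_edited = x + alpha * grad        # ascend the loss in continuous input space
    return x_edited.detach()
```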
Continual Learning, also known as Lifelong or Incremental Learning, has recently gained renewed interest among the Artificial Intelligence research community. Recent research efforts have quickly led to the design of novel algorithms able to reduce the impact of the catastrophic forgetting phenomenon in deep neural networks. Due to this surge of interest in the field, many competitions have been held in recent years, as they are an excellent opportunity to stimulate research in promising directions. This paper summarizes the ideas, design choices, rules, and results of the challenge held at the 3rd Continual Learning in Computer Vision (CLVision) Workshop at CVPR 2022. The focus of this competition is the complex continual object detection task, which is still underexplored in literature compared to classification tasks. The challenge is based on the challenge version of the novel EgoObjects dataset, a large-scale egocentric object dataset explicitly designed to benchmark continual learning algorithms for egocentric category-/instance-level object understanding, which covers more than 1k unique main objects and 250+ categories in around 100k video frames.
Contrastive methods have led to a recent surge in the performance of self-supervised representation learning (SSL). Recent methods such as BYOL or SimSiam purportedly distill these contrastive methods down to their essence, removing bells and whistles, including negative examples, that do not affect downstream performance. These 'non-contrastive' methods work surprisingly well without using negatives, even though the global minimum lies at trivial collapse. We empirically analyze these non-contrastive methods and find that SimSiam is extraordinarily sensitive to dataset and model size. In particular, SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size. We propose a metric to measure the degree of this collapse and show that it can be used to predict downstream task performance without any fine-tuning or labels. We further analyze architectural design choices and their effect on downstream performance. Finally, we demonstrate that shifting to a continual learning setting acts as a regularizer and prevents collapse, and that a hybrid between continual and multi-epoch training can improve linear probe accuracy by as much as 18 percentage points with a ResNet-18 on ImageNet.
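A simple proxy for the proposed collapse measurement is to check how concentrated the variance of the embeddings is across principal directions, as sketched below; the exact metric defined in the paper may differ from this stand-in.

```python
import torch

@torch.no_grad()
def dimensional_collapse_score(embeddings, var_threshold=0.99):
    """Fraction of embedding dimensions needed to capture var_threshold of the
    total variance; a small value indicates (partial) dimensional collapse."""
    z = embeddings - embeddings.mean(dim=0, keepdim=True)    # (N, D), centered
    cov = (z.T @ z) / (z.shape[0] - 1)
    eigvals = torch.linalg.eigvalsh(cov).flip(0)             # descending eigenvalues
    ratio = torch.cumsum(eigvals, dim=0) / eigvals.sum()
    dims_used = int((ratio < var_threshold).sum()) + 1
    return dims_used / z.shape[1]
```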
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems. Instead of tackling continual learning via the use of external memory, growing models, or regularization, EBMs change the underlying training objective to cause less interference with previously learned information. Our proposed version of EBMs for continual learning is simple, efficient, and outperforms baseline methods by a large margin on several benchmarks. Moreover, our proposed contrastive divergence-based training objective can be combined with other continual learning methods, resulting in substantial boosts in their performance. We further show that EBMs are adaptable to a more general continual learning setting where the data distribution changes without the notion of explicitly delineated tasks. These observations point towards EBMs as a useful building block for future continual learning methods.
Continual Learning is considered a key step toward next-generation Artificial Intelligence. Among various methods, replay-based approaches that maintain and replay a small episodic memory of previous samples are one of the most successful strategies against catastrophic forgetting. However, since forgetting is inevitable given bounded memory and unbounded tasks, how to forget is a problem continual learning must address. Therefore, beyond simply avoiding catastrophic forgetting, an under-explored issue is how to reasonably forget while ensuring the merits of human memory, including 1. storage efficiency, 2. generalizability, and 3. some interpretability. To achieve these simultaneously, our paper proposes a new saliency-augmented memory completion framework for continual learning, inspired by recent discoveries in memory completion separation in cognitive neuroscience. Specifically, we innovatively propose to store the part of the image most important to the tasks in episodic memory by saliency map extraction and memory encoding. When learning new tasks, previous data from memory are inpainted by an adaptive data generation module, which is inspired by how humans complete episodic memory. The module's parameters are shared across all tasks and it can be jointly trained with a continual learning classifier as bilevel optimization. Extensive experiments on several continual learning and image classification benchmarks demonstrate the proposed method's effectiveness and efficiency.
Continual Learning (CL) is an emerging machine learning paradigm that aims to learn from a continuous stream of tasks without forgetting knowledge learned from the previous tasks. To avoid performance decrease caused by forgetting, prior studies exploit episodic memory (EM), which stores a subset of the past observed samples while learning from new non-i.i.d. data. Despite the promising results, since CL is often assumed to execute on mobile or IoT devices, the EM size is bounded by the small hardware memory capacity and makes it infeasible to meet the accuracy requirements for real-world applications. Specifically, all prior CL methods discard samples overflowed from the EM and can never retrieve them back for subsequent training steps, incurring loss of information that would exacerbate catastrophic forgetting. We explore a novel hierarchical EM management strategy to address the forgetting issue. In particular, in mobile and IoT devices, real-time data can be stored not just in high-speed RAMs but in internal storage devices as well, which offer significantly larger capacity than the RAMs. Based on this insight, we propose to exploit the abundant storage to preserve past experiences and alleviate the forgetting by allowing CL to efficiently migrate samples between memory and storage without being interfered by the slow access speed of the storage. We call it Carousel Memory (CarM). As CarM is complementary to existing CL methods, we conduct extensive evaluations of our method with seven popular CL methods and show that CarM significantly improves the accuracy of the methods across different settings by large margins in final average accuracy (up to 28.4%) while retaining the same training efficiency.
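The hierarchical memory idea can be sketched as a small fast buffer backed by a much larger storage pool, where samples evicted from the fast buffer are retained in storage and periodically swapped back in rather than being lost; the eviction and swap policies below are placeholders, not CarM's actual design.

```python
import random

class CarouselMemory:
    """Two-tier episodic memory sketch: a small fast buffer (RAM) plus a large
    storage pool (disk) that keeps samples evicted from the fast buffer."""
    def __init__(self, ram_size=500, storage_size=50000):
        self.ram, self.storage = [], []
        self.ram_size, self.storage_size = ram_size, storage_size

    def add(self, sample):
        if len(self.ram) < self.ram_size:
            self.ram.append(sample)
        else:
            idx = random.randrange(self.ram_size)
            evicted = self.ram[idx]
            self.ram[idx] = sample
            if len(self.storage) < self.storage_size:
                self.storage.append(evicted)   # evicted samples survive in storage

    def refresh(self, k=50):
        """Swap k random storage samples back into the fast buffer (done
        asynchronously in practice to hide the slow storage access)."""
        if not self.ram or not self.storage:
            return
        for s in random.sample(self.storage, min(k, len(self.storage))):
            self.ram[random.randrange(len(self.ram))] = s

    def sample(self, batch_size):
        return random.sample(self.ram, min(batch_size, len(self.ram)))
```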
Online continual learning (OCL) aims to train neural networks incrementally from a non-stationary data stream with a single pass through the data. Rehearsal-based methods attempt to approximate the observed input distribution with a small memory and revisit it later to avoid forgetting. Despite their strong empirical performance, rehearsal methods still suffer from a discrepancy between the loss landscape of past data and that of the memory samples. This paper revisits the rehearsal dynamics in the online setting. We provide theoretical insights on the inherent memory overfitting risk from the viewpoint of biased and dynamic empirical risk minimization, and examine the merits and limits of repeated rehearsal. Inspired by our analysis, a simple and intuitive baseline, Repeated Augmented Rehearsal (RAR), is designed to address the underfitting-overfitting dilemma of online rehearsal. Surprisingly, across four rather different OCL benchmarks, this simple baseline outperforms vanilla rehearsal by 9%-17% and also significantly improves state-of-the-art rehearsal-based methods such as MIR, ASER, and SCR. We also demonstrate that RAR successfully achieves an accurate approximation of the past-data loss landscape and high-loss ridge aversion in its learning trajectory. Extensive ablation studies are conducted to study the interplay between repeated and augmented rehearsal, and reinforcement learning (RL) is applied to dynamically adjust RAR's hyperparameters to balance the stability-plasticity trade-off online.
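A minimal sketch of the RAR baseline is shown below: each incoming batch triggers several rehearsal iterations, each pairing the new data with a freshly augmented draw from memory; `buffer_sampler` and `augment` are assumed callables, and the repeat count and batch sizes are illustrative rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def rar_step(model, optimizer, x_stream, y_stream, buffer_sampler, augment,
             n_repeats=3, mem_batch=32):
    """One Repeated Augmented Rehearsal step: rehearse the memory several times per
    incoming batch, with a fresh augmentation each repeat, to counter underfitting
    without memorizing the raw buffer images."""
    for _ in range(n_repeats):
        x_mem, y_mem = buffer_sampler(mem_batch)     # draw a batch from replay memory
        x = torch.cat([x_stream, augment(x_mem)])    # new data + freshly augmented memory
        y = torch.cat([y_stream, y_mem])
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()
```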
Continual learning requires models to learn new tasks while retaining previously learned knowledge. Various algorithms have been proposed to address this challenge. To date, rehearsal-based methods such as experience replay have achieved state-of-the-art performance. These methods keep a small portion of past tasks' data in a memory buffer to prevent the model from forgetting previously learned knowledge. However, most of them treat every new task equally, i.e., the hyperparameters of the framework are fixed when learning different new tasks. Such a setting ignores the relationship/similarity between past and new tasks. For example, the knowledge/features learned from dogs are more beneficial for recognizing cats (a new task) than those learned from buses. In this regard, we propose a meta-learning algorithm based on bi-level optimization to adaptively tune the relationship between the knowledge extracted from past and new tasks. The model can therefore find an appropriate gradient direction during continual learning and avoid severe overfitting on the memory buffer. Extensive experiments are conducted on three publicly available datasets (i.e., CIFAR-10, CIFAR-100, and Tiny ImageNet). Experimental results show that the proposed method consistently improves the performance of all baselines.