Smart Paper Notes

Learning on non-stationary data with re-weighting

Nishant Jain , Pradeep Shenoy

Categories: Machine Learning

2022-12-12

Many real-world learning scenarios face the challenge of slow concept drift, where data distributions change gradually over time. In this setting, we pose the problem of learning temporally sensitive importance weights for training data, in order to optimize predictive accuracy. We propose a class of temporal reweighting functions that can capture multiple timescales of change in the data, as well as instance-specific characteristics. We formulate a bi-level optimization criterion, and an associated meta-learning algorithm, by which these weights can be learned. In particular, our formulation trains an auxiliary network to output weights as a function of training instances, thereby compactly representing the instance weights. We validate our temporal reweighting scheme on a large real-world dataset of 39M images spread over a 9 year period. Our extensive experiments demonstrate the necessity of instance-based temporal reweighting in the dataset, and achieve significant improvements to classical batch-learning approaches. Further, our proposal easily generalizes to a streaming setting and shows significant gains compared to recent continual learning methods.
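As a rough illustration of temporal reweighting over multiple timescales, the sketch below weights each training instance by a mixture of exponential decays in its age. This is an assumption made for illustration only: the abstract describes weights that are learned by an auxiliary network through bi-level optimization, whereas here the function class, the timescales, and the mixing coefficients (`timescales`, `mix`) are fixed by hand and purely hypothetical.

```python
import math

def temporal_weight(age, timescales, mix):
    """Mixture of exponential decays evaluated at a given instance age."""
    return sum(m * math.exp(-age / tau) for m, tau in zip(mix, timescales))

def weighted_mean_loss(losses, ages, timescales, mix):
    """Average per-instance losses using normalized temporal weights."""
    weights = [temporal_weight(a, timescales, mix) for a in ages]
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total

# Hypothetical setting: one fast timescale (1 year) and one slow one (10 years).
timescales = [1.0, 10.0]
mix = [0.5, 0.5]
ages = [0.0, 1.0, 5.0]      # years since each instance was collected
losses = [0.2, 0.4, 1.0]    # made-up per-instance losses
print(weighted_mean_loss(losses, ages, timescales, mix))
```

With these made-up numbers the oldest instance has the largest loss but the smallest weight, so the weighted loss falls below the plain average.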

Translated by Google Translate

Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

Matthew Riemer , Ignacio Cases , Robert Ajemian , Miao Liu , Irina Rish , Yuhai Tu , Gerald Tesauro

Categories:

2018-10-29

Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
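The transfer versus interference view can be made concrete with the dot product of two per-example gradients: a positive value means a step on one example also reduces the loss on the other (transfer), a negative value means it increases it (interference). The sketch below only computes this alignment quantity for a toy linear model with squared error. It is not the MER algorithm, and the model, the loss, and the helper names are assumptions chosen for illustration.

```python
def grad_squared_error(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w for one example."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def alignment(w, example_a, example_b):
    """Dot product of the two per-example gradients at parameters w."""
    ga = grad_squared_error(w, *example_a)
    gb = grad_squared_error(w, *example_b)
    return sum(a * b for a, b in zip(ga, gb))

w = [0.0, 0.0]
ex_a = ([1.0, 0.0], 1.0)
ex_b = ([1.0, 0.0], 2.0)    # pulls w in the same direction as ex_a
ex_c = ([1.0, 0.0], -1.0)   # pulls w in the opposite direction

print(alignment(w, ex_a, ex_b))  # positive: transfer
print(alignment(w, ex_a, ex_c))  # negative: interference
```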

GCR: Gradient Coreset Based Replay Buffer Selection For Continual Learning

Rishabh Tiwari , Krishnateja Killamsetty , Rishabh Iyer , Pradeep Shenoy

Categories: Machine Learning | Artificial Intelligence

2021-11-18

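Continual learning (CL) aims to develop techniques by which a single model adapts to an increasing number of tasks, thereby potentially leveraging learning across tasks in a resource-efficient manner. A major challenge for CL systems is catastrophic forgetting, where earlier tasks are forgotten while a new task is learned. To address this, replay-based CL approaches maintain a small buffer of data selected from the tasks encountered so far and repeatedly retrain on it. We propose Gradient Coreset Replay (GCR), a novel strategy for replay buffer selection and update that uses a carefully designed optimization criterion. Specifically, we select and maintain a "coreset" that closely approximates the gradient of all the data seen so far with respect to the current model parameters, and we discuss the key strategies needed to apply it effectively in the continual learning setting. In the well-studied offline continual learning setting, we show significant gains (2%-4%) over the state of the art. Our findings also transfer effectively to online/streaming CL settings, showing gains of up to 5% over existing approaches. Finally, we demonstrate the value of a supervised contrastive loss for continual learning, which yields a cumulative gain of up to 5% when combined with our subset selection strategy.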
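The idea of a coreset whose gradient approximates the gradient of all data seen so far can be sketched with a simple greedy selection: repeatedly add the example that brings the mean gradient of the selected subset closest to the mean gradient of the full set. This is a simplified, unweighted stand-in chosen for illustration, not the optimization criterion of GCR itself, and the per-example gradients in `grads` are made-up numbers.

```python
def mean_vec(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_gradient_coreset(grads, k):
    """Greedily pick up to k indices whose mean gradient tracks the full mean."""
    target = mean_vec(grads)
    chosen = []
    for _ in range(min(k, len(grads))):
        best, best_err = None, float("inf")
        for i in range(len(grads)):
            if i in chosen:
                continue
            candidate = [grads[j] for j in chosen + [i]]
            err = sq_dist(mean_vec(candidate), target)
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
    return chosen

# Hypothetical per-example gradients (one row per stored example).
grads = [[1.0, 0.0], [-1.0, 0.0], [0.0, 5.0], [0.1, 1.2]]
print(greedy_gradient_coreset(grads, 2))
```

The greedy loop is quadratic in the number of candidates per selected element, which is acceptable for a small buffer but would need a cheaper approximation at scale.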

A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange , Rahaf Aljundi , Marc Masana , Sarah Parisot , Xu Jia , Ales Leonardis , Gregory Slabaugh , Tinne Tuytelaars

Categories:

2019-09-18

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern (1) a taxonomy and extensive overview of the state-of-the-art; (2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner; (3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.

Dissecting Continual Learning a Structural and Data Analysis

Francesco Pelosin

Categories: Computer Vision | Machine Learning

2023-01-03

Continual Learning (CL) is a field dedicated to devising algorithms able to achieve lifelong learning. Overcoming the knowledge disruption of previously acquired concepts, a drawback affecting deep learning models that goes by the name of catastrophic forgetting, is a hard challenge. Currently, deep learning methods can attain impressive results when the data modeled does not undergo a considerable distributional shift in subsequent learning sessions, but whenever we expose such systems to this incremental setting, performance drops very quickly. Overcoming this limitation is fundamental, as it would allow us to build truly intelligent systems showing stability and plasticity. Secondly, it would allow us to overcome the onerous limitation of retraining these architectures from scratch with the new updated data. In this thesis, we tackle the problem from multiple directions. In a first study, we show that in rehearsal-based techniques (systems that use a memory buffer), the quantity of data stored in the rehearsal buffer is a more important factor than the quality of the data. Secondly, we propose one of the early works of incremental learning on ViT architectures, comparing functional, weight and attention regularization approaches, and propose an effective novel asymmetric loss. Finally, we present a study on pretraining and how it affects the performance in Continual Learning, raising some questions about the effective progression of the field. We then conclude with some future directions and closing remarks.

Understanding Continual Learning Settings with Data Distribution Drift Analysis

Timothée Lesort , Massimo Caccia , Irina Rish

Categories: Machine Learning | Artificial Intelligence

2021-04-04

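Classical machine learning algorithms often assume that the data are drawn i.i.d. from a stationary probability distribution. Recently, continual learning has emerged as a rapidly growing area of machine learning in which this assumption is relaxed, i.e., the data distribution is non-stationary and changes over time. This paper represents the state of the data distribution by a context variable $c$. A drift in $c$ leads to a drift in the data distribution. A context drift may change the target distribution, the input distribution, or both. Moreover, distribution drifts may be abrupt or gradual. In continual learning, context drifts may interfere with the learning process and erase previously learned knowledge; continual learning algorithms must therefore include specialized mechanisms to deal with such drifts. In this paper, we aim to identify and categorize different types of context drifts and the underlying assumptions, in order to better characterize various continual learning scenarios. Moreover, we propose to use the distribution drift framework to provide more precise definitions of several terms commonly used in the continual learning field.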

The CLEAR Benchmark: Continual LEArning on Real-World Imagery

Zhiqiu Lin , Jia Shi , Deepak Pathak , Deva Ramanan

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-01-17

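Continual learning (CL) is widely regarded as a crucial challenge for lifelong AI. However, existing CL benchmarks, e.g. Permuted-MNIST and Split-CIFAR, make use of artificial temporal variation and do not align with or generalize to the real world. In this paper, we introduce CLEAR, the first continual image classification benchmark dataset with a natural temporal evolution of visual concepts in the real world, spanning a decade (2004-2014). We build CLEAR from an existing large-scale image collection (YFCC100M) through a novel and scalable low-cost approach to visio-linguistic dataset curation. Our pipeline uses pretrained vision-language models (e.g. CLIP) to interactively build labeled datasets, which are further validated through crowdsourcing to remove errors and even inappropriate images (hidden in the original YFCC100M). The major strength of CLEAR over prior CL benchmarks is the smooth temporal evolution of visual concepts with real-world imagery, including both high-quality labeled data and abundant unlabeled samples per time period for continual semi-supervised learning. We find that a simple unsupervised pre-training step can already boost state-of-the-art CL algorithms that only utilize fully supervised data. Our analysis also reveals that mainstream CL evaluation protocols, which train and test on IID data, artificially inflate the performance of CL systems. To address this, we propose novel "streaming" protocols for CL that always test on the (near) future. Interestingly, streaming protocols (a) can simplify dataset curation, since today's test set can be repurposed as tomorrow's training set, and (b) can produce more generalizable models with more accurate performance estimates, since all labeled data from each time period is used for both training and testing (unlike classic IID train-test splits).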
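A streaming protocol of the kind described above can be sketched as a loop that trains on the data of period t and evaluates on the data of period t+1, so that each bucket serves first as a test set and then as a training set. The function names, the toy majority-label "model", and the label buckets below are hypothetical stand-ins, not part of the CLEAR benchmark code.

```python
from collections import Counter

def streaming_evaluation(buckets, train_fn, eval_fn):
    """Train on bucket t, then score on bucket t + 1 (the near future)."""
    model, scores = None, []
    for t in range(len(buckets) - 1):
        model = train_fn(model, buckets[t])
        scores.append(eval_fn(model, buckets[t + 1]))
    return scores

def train_majority(model, labels):
    """Toy learner: keep running label counts across time periods."""
    counts = Counter() if model is None else model
    counts.update(labels)
    return counts

def eval_majority(model, labels):
    """Accuracy of always predicting the most frequent label seen so far."""
    prediction = model.most_common(1)[0][0]
    return sum(1 for y in labels if y == prediction) / len(labels)

# Made-up label buckets for three consecutive time periods.
buckets = [[0, 0, 1], [0, 1, 1, 1], [1, 1, 1]]
scores = streaming_evaluation(buckets, train_majority, eval_majority)
print(scores)
```

In this toy run the label distribution drifts from mostly 0 to mostly 1, so the first future-period score is low and recovers once the learner has seen the second bucket.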

A survey on concept drift adaptation

Categories:

Concept drift primarily refers to an online supervised learning scenario in which the relation between the input data and the target variable changes over time. Assuming a general knowledge of supervised learning, in this paper we characterize the adaptive learning process, categorize existing strategies for handling concept drift, give an overview of the most representative, distinct and popular techniques and algorithms, discuss the evaluation methodology of adaptive algorithms, and present a set of illustrative applications. The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state-of-the-art. Thus, it aims at providing a comprehensive introduction to concept drift adaptation for researchers, industry analysts and practitioners.

FLUID: A Unified Evaluation Framework for Flexible Sequential Data

Matthew Wallingford , Aditya Kusupati , Keivan Alizadeh-Vahid , Aaron Walsman , Aniruddha Kembhavi , Ali Farhadi

Categories: Computer Vision | Machine Learning

2020-07-06

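Modern ML methods excel when training data is IID, large-scale, and well labeled. Learning under less ideal conditions remains an open challenge. The sub-fields of few-shot, continual, transfer, and representation learning have made substantial progress in learning under adverse conditions, each offering distinct advantages through its methods and insights. These methods address different challenges, such as data arriving sequentially or scarce training examples; however, the difficult conditions an ML system will face often cannot be anticipated prior to deployment. General ML systems that can handle the many learning challenges of practical settings are therefore needed. To foster research towards the goal of general ML methods, we introduce a new unified evaluation framework: FLUID (Flexible Sequential Data). FLUID integrates the objectives of few-shot, continual, transfer, and representation learning while enabling the comparison and integration of techniques across these sub-fields. In FLUID, a learner faces a stream of data and must make sequential predictions while choosing how to update itself, adapting quickly to novel classes, and dealing with changing data distributions, all while accounting for the total amount of compute. We conduct experiments on a broad set of methods, which shed new light on the advantages and limitations of current solutions and point to new research problems to solve. As a starting point towards more general methods, we present two new baselines that outperform the other evaluated methods on FLUID. Project page: https://raivn.cs.washington.edu/projects/fluid/.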

SimCS: Simulation for Online Domain-Incremental Continual Segmentation

Motasem Alfarra , Zhipeng Cai , Adel Bibi , Bernard Ghanem , Matthias Müller

Categories: Computer Vision | Machine Learning

2022-11-29

Continual Learning is a step towards lifelong intelligence where models continuously learn from recently collected data without forgetting previous knowledge. Existing continual learning approaches mostly focus on image classification in the class-incremental setup with clear task boundaries and unlimited computational budget. This work explores Online Domain-Incremental Continual Segmentation (ODICS), a real-world problem that arises in many applications, e.g., autonomous driving. In ODICS, the model is continually presented with batches of densely labeled images from different domains; computation is limited and no information about the task boundaries is available. In autonomous driving, this may correspond to the realistic scenario of training a segmentation model over time on a sequence of cities. We analyze several existing continual learning methods and show that they do not perform well in this setting despite working well in class-incremental segmentation. We propose SimCS, a parameter-free method complementary to existing ones that leverages simulated data as a continual learning regularizer. Extensive experiments show consistent improvements over different types of continual learning methods that use regularizers and even replay.

Beyond Supervised Continual Learning: a Review

Benedikt Bagus , Alexander Gepperth , Timothée Lesort

Categories: Machine Learning

2022-08-30

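Continual learning (CL, sometimes also termed incremental learning) is a flavor of machine learning in which the usual assumption of a stationary data distribution is relaxed or omitted. When, for example, DNNs are applied naively to CL problems, changes in the data distribution can cause the so-called catastrophic forgetting (CF) effect: an abrupt loss of previous knowledge. Although many significant contributions to enabling CL have been made in recent years, most works address supervised (classification) problems. This article reviews the literature that studies CL in other settings, such as learning with reduced supervision, fully unsupervised learning, and reinforcement learning. Besides proposing a simple schema for classifying CL approaches with respect to their level of autonomy and supervision, we discuss the specific challenges associated with each setting and the potential contributions to the field of CL in general.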

Selective classification using a robust meta-learning approach

Nishant Jain , Pradeep Shenoy

Categories: Machine Learning

2022-12-12

Selective classification involves identifying the subset of test samples that a model can classify with high accuracy, and is important for applications such as automated medical diagnosis. We argue that this capability of identifying uncertain samples is valuable for training classifiers as well, with the aim of building more accurate classifiers. We unify these dual roles by training a single auxiliary meta-network to output an importance weight as a function of the instance. This measure is used at train time to reweight training data, and at test time to rank test instances for selective classification. A second, key component of our proposal is the meta-objective of minimizing dropout variance (the variance of classifier output when subjected to random weight dropout) for training the meta-network. We train the classifier together with its meta-network using a nested objective of minimizing classifier loss on training data and meta-loss on a separate meta-training dataset. We outperform the current state of the art on selective classification by substantial margins; for instance, by up to 1.9% AUC and 2% accuracy on a real-world diabetic retinopathy dataset. Finally, our meta-learning framework extends naturally to unsupervised domain adaptation, given our unsupervised variance minimization meta-objective. We show cumulative absolute gains of 3.4% / 3.3% accuracy and AUC over the other baselines in domain shift settings on the Retinopathy dataset using unsupervised domain adaptation.
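Dropout variance, as defined in the abstract, is the variance of the classifier output under random weight dropout. The sketch below estimates it by Monte Carlo sampling for a toy logistic classifier. The classifier, the drop probability, the inverted-dropout rescaling, and the sample count are all assumptions made for illustration; the abstract uses this quantity as a meta-objective for training a meta-network, which is not reproduced here.

```python
import math
import random

def predict(weights, x, mask):
    """Sigmoid output of a linear classifier with masked (dropped) weights."""
    z = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
    return 1.0 / (1.0 + math.exp(-z))

def dropout_variance(weights, x, p_drop=0.5, n_samples=200, seed=0):
    """Monte Carlo estimate of output variance under random weight dropout."""
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - p_drop)  # inverted-dropout rescaling (assumed)
    outputs = []
    for _ in range(n_samples):
        mask = [0.0 if rng.random() < p_drop else keep_scale for _ in weights]
        outputs.append(predict(weights, x, mask))
    mean = sum(outputs) / len(outputs)
    return sum((o - mean) ** 2 for o in outputs) / len(outputs)

weights = [2.0, -1.0, 0.5]                           # hypothetical classifier
print(dropout_variance(weights, [3.0, 1.0, 2.0]))    # prediction depends on which weights survive
print(dropout_variance(weights, [0.0, 0.0, 0.0]))    # output is 0.5 under every mask
```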

Continual Learning: Fast and Slow

Quang Pham , Chenghao Liu , Steven C. H. Hoi

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-09-06

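According to the Complementary Learning Systems (CLS) theory in neuroscience (McClelland et al., 1995), humans achieve effective continual learning through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of the specifics of individual experiences, and a slow learning system located in the neocortex for the gradual acquisition of structured knowledge about the environment. Motivated by this theory, we propose DualNets (for Dual Networks), a general continual learning framework comprising a fast learning system for supervised learning of pattern-separated representations from specific tasks and a slow learning system for learning task-agnostic general representations via self-supervised learning (SSL). DualNets can seamlessly incorporate both representation types into a holistic framework to facilitate better continual learning in deep neural networks. Through extensive experiments, we demonstrate promising results for DualNets on a wide range of continual learning protocols, from the standard offline, task-aware setting to the challenging online, task-free scenario. Notably, on the CTrL benchmark (Veniat et al., 2020), DualNets achieve performance competitive with existing state-of-the-art dynamic architecture strategies (Ostapenko et al., 2021). Furthermore, we conduct comprehensive ablation studies to validate the efficacy, robustness, and scalability of DualNets. Code is publicly available at https://github.com/phquang/dualnet.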

On the Limitations of Continual Learning for Malware Classification

Mohammad Saidur Rahman , Scott E. Coull , Matthew Wright

Categories: Artificial Intelligence | Machine Learning

2022-08-13

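Malicious software (malware) classification poses a unique challenge for continual learning (CL) regimes because of the volume of new samples received every day and the evolution of malware to exploit new vulnerabilities. On a typical day, antivirus vendors receive hundreds of thousands of unique pieces of software, both malicious and benign, and over the lifetime of a malware classifier more than a billion samples can easily accumulate. Given the scale of the problem, sequential training using continual learning techniques could provide substantial benefits in reducing training and storage overhead. To date, however, there has been no exploration of CL applied to malware classification tasks. In this paper, we study 11 CL techniques applied to three malware tasks covering common incremental learning scenarios, including task, class, and domain incremental learning (IL). Specifically, using two realistic, large-scale malware datasets, we evaluate the performance of the CL methods on binary malware classification (Domain-IL) and multi-class malware family classification (Task-IL and Class-IL) tasks. To our surprise, continual learning methods significantly underperformed naive Joint replay of the training data in nearly all settings, in some cases reducing accuracy by more than 70 percentage points. A simple approach of selectively replaying 20% of the stored data achieves better performance, with 50% of the training time, compared to Joint replay. Finally, we discuss potential reasons for the unexpectedly poor performance of the CL techniques, in the hope that this spurs further research on techniques that are more effective in the malware classification domain.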

Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time

Anshul Nasery , Soumyadeep Thakur , Vihari Piratla , Abir De , Sunita Sarawagi

Categories: Machine Learning | Machine Learning (Statistics)

2021-08-15

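In several real-world applications, machine learning models are deployed to make predictions on data whose distribution changes gradually over time, leading to a drift between the train and test distributions. Such models are often re-trained periodically on new data, and they therefore need to generalize to data from the future. In this context, there is much prior work on improving temporal generalization, e.g., continuous transport of past data, kernel-smoothed time-sensitive parameters and, more recently, adversarial learning of time-invariant features. However, these methods share several limitations, such as poor scalability, training instability, and dependence on unlabeled data from the future. In response to these limitations, we propose a simple method that starts with time-sensitive parameters but regularizes their temporal complexity using a Gradient Interpolation (GI) loss. GI allows the decision boundary to change over time, and can still prevent overfitting to the limited training time snapshots by allowing task-specific control over changes along time. We compare our method with existing baselines on multiple real-world datasets, which shows that GI outperforms more complicated generative and adversarial approaches on the one hand, and simpler gradient regularization methods on the other.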

Learning from Data Streams: An Overview and Update

Jesse Read , Indrė Žliobaitė

Categories: Machine Learning

2022-12-30

The literature on machine learning in the context of data streams is vast and growing. However, many of the defining assumptions regarding data-stream learning tasks are too strong to hold in practice, or are even contradictory such that they cannot be met in the contexts of supervised learning. Algorithms are chosen and designed based on criteria which are often not clearly stated, for problem settings not clearly defined, tested in unrealistic settings, and/or in isolation from related approaches in the wider literature. This puts into question the potential for real-world impact of many approaches conceived in such contexts, and risks propagating a misguided research focus. We propose to tackle these issues by reformulating the fundamental definitions and settings of supervised data-stream learning with regard to contemporary considerations of concept drift and temporal dependence; and we take a fresh look at what constitutes a supervised data-stream learning task, and a reconsideration of algorithms that may be applied to tackle such tasks. Through and in reflection of this formulation and overview, helped by an informal survey of industrial players dealing with real-world data streams, we provide recommendations. Our main emphasis is that learning from data streams does not impose a single-pass or online-learning approach, or any particular learning regime; and any constraints on memory and time are not specific to streaming. Meanwhile, there exist established techniques for dealing with temporal dependence and concept drift, in other areas of the literature. For the data streams community, we thus encourage a shift in research focus, from dealing with often-artificial constraints and assumptions on the learning mode, to issues such as robustness, privacy, and interpretability which are increasingly relevant to learning in data streams in academic and industrial settings.

Online meta-learning

Categories:

A central capability of intelligent systems is the ability to continuously build upon previous experiences to speed up and enhance learning of new tasks. Two distinct research paradigms have studied this question. Meta-learning views this problem as learning a prior over model parameters that is amenable for fast adaptation on a new task, but typically assumes the tasks are available together as a batch. In contrast, online (regret based) learning considers a setting where tasks are revealed one after the other, but conventionally trains a single model without task-specific adaptation. This work introduces an online meta-learning setting, which merges ideas from both paradigms to better capture the spirit and practice of continual lifelong learning. We propose the follow the meta leader (FTML) algorithm, which extends the MAML algorithm to this setting. Theoretically, this work provides an O(log T) regret guarantee with one additional higher order smoothness assumption (in comparison to the standard online setting). Our experimental evaluation on three different large-scale problems suggests that the proposed algorithm significantly outperforms alternatives based on traditional online learning approaches.

A simple but strong baseline for online continual learning: Repeated Augmented Rehearsal

Yaqian Zhang , Bernhard Pfahringer , Eibe Frank , Albert Bifet , Nick Jin Sean Lim , Yunzhe Jia

Categories: Machine Learning | Artificial Intelligence

2022-09-28

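Online continual learning (OCL) aims to train neural networks incrementally from a non-stationary data stream with a single pass through the data. Rehearsal-based methods attempt to approximate the observed input distributions with a small memory and revisit them later to avoid forgetting. Despite their strong empirical performance, rehearsal methods still suffer from a poor approximation of the loss landscape of past data by the memory samples. This paper revisits the rehearsal dynamics in online settings. We provide theoretical insights into the inherent memory overfitting risk from the viewpoint of biased and dynamic empirical risk minimization, and examine the merits and limits of repeated rehearsal. Inspired by our analysis, a simple and intuitive baseline, Repeated Augmented Rehearsal (RAR), is designed to address the underfitting-overfitting dilemma of online rehearsal. Surprisingly, across four rather different OCL benchmarks, this simple baseline outperforms vanilla rehearsal by 9%-17% and also significantly improves the state-of-the-art rehearsal-based methods MIR, ASER, and SCR. We also demonstrate that RAR successfully achieves an accurate approximation of the loss landscape of past data and high-loss ridge aversion in its learning trajectory. Extensive ablation studies are conducted to study the interplay between repeated and augmented rehearsal, and reinforcement learning (RL) is applied to dynamically adjust the hyperparameters of RAR to balance the stability-plasticity trade-off online.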
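The combination of repetition and augmentation in rehearsal can be sketched as follows: for each incoming stream batch, the learner takes several update steps, and each step uses the incoming examples together with a fresh sample from memory, all passed through a data augmentation. The sketch only builds the per-step training batches. The function names, the scalar "examples", the noise augmentation, and the choice to augment both incoming and memory samples are assumptions made for illustration, not details taken from the abstract.

```python
import random

def rar_training_batches(incoming, memory, augment, repeats, mem_batch_size, rng):
    """Build one augmented training batch per repeated step on an incoming batch."""
    batches = []
    for _ in range(repeats):
        # Draw a fresh replay sample from memory for every repeated step.
        replay = rng.sample(memory, min(mem_batch_size, len(memory)))
        batches.append([augment(x, rng) for x in list(incoming) + replay])
    return batches

def jitter(x, rng):
    """Toy augmentation: add small uniform noise to a scalar example."""
    return x + rng.uniform(-0.1, 0.1)

rng = random.Random(0)
memory = [float(i) for i in range(10)]   # made-up stored examples
incoming = [100.0, 101.0]                # made-up incoming stream batch
batches = rar_training_batches(incoming, memory, jitter,
                               repeats=3, mem_batch_size=4, rng=rng)
for batch in batches:
    print(len(batch))
```

Because every repeated step sees a differently augmented view of the same data, the extra updates are less likely to simply memorize the small memory buffer.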

Continual Learning via Local Module Composition

Oleksiy Ostapenko , Pau Rodriguez , Massimo Caccia , Laurent Charlin

Categories: Machine Learning | Artificial Intelligence

2021-11-15

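Modularity is a compelling solution to continual learning (CL), the problem of modeling sequences of related tasks. Learning and then composing modules to solve different tasks provides an abstraction for addressing the principal challenges of CL, including catastrophic forgetting, backward and forward transfer across tasks, and sub-linear model growth. We introduce local module composition (LMC), an approach to modular CL in which each module is provided with a local structural component that estimates the module's relevance to the input. Dynamic module composition is performed based on local relevance scores. We demonstrate that agnosticity to task identities (IDs) arises from (local) structural learning that is module-specific, as opposed to the task- and/or model-specific structural learning of previous works, which makes LMC applicable to more CL settings than previous works. In addition, LMC tracks statistics about the input distribution and adds new modules when outlier samples are detected. In the first set of experiments, LMC compares favorably with existing methods on the recent Continual Transfer-learning Benchmark without requiring task identities. In another study, we show that the locality of structural learning allows LMC to interpolate to related but unseen (OOD) tasks, as well as to compose modular networks trained independently on different task sequences without any fine-tuning. Finally, in searching for the limitations of LMC, we study it on more challenging sequences of 30 and 100 tasks, demonstrating that local module selection becomes much more challenging in the presence of a large number of candidate modules. In this setting, the best-performing LMC spawns far fewer modules than an oracle-based baseline, but it reaches a lower overall accuracy. The codebase is available at https://github.com/oleksost/lmc.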

Learning to Prompt for Continual Learning

Zifeng Wang , Zizhao Zhang , Chen-Yu Lee , Han Zhang , Ruoxi Sun , Xiaoqi Ren , Guolong Su , Vincent Perot , Jennifer Dy , Tomas Pfister

Categories: Machine Learning | Computer Vision

2021-12-16

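The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge. Typical methods rely on a rehearsal buffer or a known task identity at test time to retrieve learned knowledge and address forgetting, whereas this work presents a new paradigm for continual learning that aims to train a more succinct memory system without accessing the task identity at test time. Our method learns to dynamically prompt (L2P) a pre-trained model to learn tasks sequentially under different task transitions. In our proposed framework, prompts are small learnable parameters that are maintained in a memory space. The objective is to optimize the prompts to instruct the model prediction and to explicitly manage task-invariant and task-specific knowledge while maintaining model plasticity. We conduct comprehensive experiments on popular image classification benchmarks under different challenging continual learning settings, where L2P consistently outperforms prior state-of-the-art methods. Surprisingly, L2P achieves competitive results even without a rehearsal buffer and is directly applicable to challenging task-agnostic continual learning. Source code is available at https://github.com/google-Research/l2p.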
