最近,与“预训练,及时和预测”的新范式相比,与“预训练,微调”范式相比,新的范式“预训练,及时和预测”取得了显着的成就。在基于及时的GPT-3成功之后,一系列基于蒙版的语言模型(MLM)(例如Bert,Roberta)及时学习方法变得流行并广泛使用。但是,另一个有效的预训练的判别模型Electra可能被忽略了。在本文中,我们尝试使用拟议的替换代替令牌检测(RTD)基于基于的及时学习方法来完成零摄像的几个NLP任务。实验结果表明,基于RTD-Prompt学习的Electra模型可达到令人惊讶的最先进的零拍性能。在数字上,与MLM-Roberta-Large和MLM-Bert-Large相比,我们的RTD-Electra-Large在所有15个任务上平均提高了约8.4%和13.7%。特别是在SST-2任务上,我们的RTD-Electra-Large在没有任何培训数据的情况下达到了令人惊讶的90.1%精度。总体而言,与预先训练的蒙版语言模型相比,预先训练的代替令牌检测模型在零拍学习中的性能更好。因此,Electra是一位出色的零球学习者。源代码可在以下网址获得:https://github.com/nishiwen1214/rtd-electra。
translated by 谷歌翻译
预先训练的蒙版语言模型通过将下游任务作为文本填充来成功执行几次学习。但是,作为全镜头环境中的强大替代方案,诸如Electra之类的判别预训练模型不适合范式。在这项工作中,我们调整了基于及时的几次学习来进行电信,并表明它在广泛的任务中优于蒙面的语言模型。Electra是预先训练的,以区分令牌是产生还是原始。我们自然地将其扩展到基于迅速的几次学习,通过培训来评分目标选项的原创性,而无需引入新参数。我们的方法很容易适应涉及多token预测的任务,而无需额外的计算开销。分析表明,Electra学习分布与下游任务更好。
translated by 谷歌翻译
The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF-better few-shot fine-tuning of language models 1 -a suite of simple and complementary techniques for finetuning language models on a small number of annotated examples. Our approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation; and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Finally, we present a systematic evaluation for analyzing few-shot performance on a range of NLP tasks, including classification and regression. Our experiments demonstrate that our methods combine to dramatically outperform standard fine-tuning procedures in this low resource setting, achieving up to 30% absolute improvement, and 11% on average across all tasks. Our approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning. 2 * The first two authors contributed equally. 1 Alternatively, language models' best friends forever. 2 Our implementation is publicly available at https:// github.com/princeton-nlp/LM-BFF.
translated by 谷歌翻译
及时调整是将预训练模型调整到下游任务的极其有效的工具。但是,基于标准及时的方法主要考虑下游任务的足够数据的情况。目前尚不清楚是否可以将优势传输到几杆式制度,在每个下游任务中只有有限的数据。尽管有些作品证明了在几次弹奏设置下及时调整的潜力,但通过搜索离散提示或使用有限数据调整软提示的主流方法仍然非常具有挑战性。通过广泛的实证研究,我们发现迅速调整和完全微调之间的学习差距仍然存在差距。为了弥合差距,我们提出了一个新的及时调整框架,称为软模板调整(STT)。 STT结合了手册和自动提示,并将下游分类任务视为掩盖语言建模任务。对不同设置的全面评估表明,STT可以在不引入其他参数的情况下缩小微调和基于及时的方法之间的差距。值得注意的是,它甚至可以胜过情感分类任务的时间和资源消耗的微调方法。
translated by 谷歌翻译
已显示迅速学习可以在大多数文本分类任务中实现近调调节性能,但很少有培训示例。对于样品稀缺的NLP任务是有利的。在本文中,我们试图将其应用于实际情况,即恢复信息提取,并增强现有方法,以使其更适用于简历信息提取任务。特别是,我们根据简历的文本特征创建了多组手动模板和语言器。此外,我们比较了蒙版语言模型(MLM)预培训语言模型(PLM)和SEQ2SEQ PLM在此任务上的性能。此外,我们改进了口头设计的设计方法,用于知识渊博的及时调整,以便为其他基于应用程序的NLP任务的迅速模板和语言设计的设计提供了示例。在这种情况下,我们提出了手动知识渊博的语言器(MKV)的概念。构造与应用程序方案相对应的知识渊博的口头表的规则。实验表明,基于我们的规则设计的模板和言语器比现有的手动模板更有效,更强大,并自动生成及时方法。已经确定,当前可用的自动提示方法无法与手动设计的及时模板竞争一些现实的任务方案。最终混淆矩阵的结果表明,我们提出的MKV显着解决了样本不平衡问题。
translated by 谷歌翻译
大型预训练的语言模型(PLM)的最新进展导致了自然语言理解(NLU)任务的令人印象深刻的增长,并具有特定于任务的微调。但是,直接调整PLM在很大程度上依赖大量的标记实例,这些实例通常很难获得。迅速对PLM的调整已被证明对各种少数次任务很有价值。现有的作品研究基于迅速的NLU任务的基于及时的调整,主要集中于用语言器来得出正确的标签单词或生成及时的模板,以从PLM中启发语义。此外,还对常规数据增强方法进行了验证,可用于少量射击任务。但是,目前几乎没有针对基于及时的调整范式设计的数据增强方法。因此,我们研究了迅速的少数射击学习者的新数据增强问题。由于标签语义对于迅速的调整至关重要,因此我们提出了一种新颖的标签引导数据增强方法促进DA,该方法利用了丰富的标签语义信息以进行数据增强。很少的文本分类任务的广泛实验结果表明,我们提出的框架通过有效利用标签语义和数据扩展来实现自然语言理解来实现卓越的性能。
translated by 谷歌翻译
Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by replacing some tokens with [MASK] and then train a model to reconstruct the original tokens. While they produce good results when transferred to downstream NLP tasks, they generally require large amounts of compute to be effective. As an alternative, we propose a more sample-efficient pre-training task called replaced token detection. Instead of masking the input, our approach corrupts it by replacing some tokens with plausible alternatives sampled from a small generator network. Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not. Thorough experiments demonstrate this new pre-training task is more efficient than MLM because the task is defined over all input tokens rather than just the small subset that was masked out. As a result, the contextual representations learned by our approach substantially outperform the ones learned by BERT given the same model size, data, and compute. The gains are particularly strong for small models; for example, we train a model on one GPU for 4 days that outperforms GPT (trained using 30x more compute) on the GLUE natural language understanding benchmark. Our approach also works well at scale, where it performs comparably to RoBERTa and XLNet while using less than 1/4 of their compute and outperforms them when using the same amount of compute.
translated by 谷歌翻译
How can we extend a pre-trained model to many language understanding tasks, without labeled or additional unlabeled data? Pre-trained language models (PLMs) have been effective for a wide range of NLP tasks. However, existing approaches either require fine-tuning on downstream labeled datasets or manually constructing proper prompts. In this paper, we propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding. Unlike previous methods, NPPrompt uses only pre-trained language models and does not require any labeled data or additional raw corpus for further fine-tuning, nor does it rely on humans to construct a comprehensive set of prompt label words. We evaluate NPPrompt against previous major few-shot and zero-shot learning methods on diverse NLP tasks: including text classification, text entailment, similar text retrieval, and paraphrasing. Experimental results demonstrate that our NPPrompt outperforms the previous best fully zero-shot method by big margins, with absolute gains of 12.8% in accuracy on text classification and 18.9% on the GLUE benchmark.
translated by 谷歌翻译
Pre-trained language models (PLMs) have exhibited remarkable few-shot learning capabilities when provided a few examples in a natural language prompt as demonstrations of test instances, i.e., in-context learning. However, the performance of in-context learning is susceptible to the choice of prompt format, training examples and the ordering of the training examples. In this paper, we propose a novel nearest-neighbor calibration framework for in-context learning to ease this issue. It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances, which provides a useful supervised signal to calibrate predictions. Thus, our method directly augments the predictions with a $k$-nearest-neighbor ($k$NN) classifier over a datastore of cached few-shot instance representations obtained by PLMs and their corresponding labels. Then adaptive neighbor selection and feature regularization modules are introduced to make full use of a few support instances to reduce the $k$NN retrieval noise. Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning, while even achieving comparable performance with state-of-the-art tuning-based approaches in some sentiment analysis tasks.
translated by 谷歌翻译
提示将下游应用程序作为语言建模任务施放,与使用预训练的模型进行标准微调相比,已显示出样本有效的效率。但是,提示的一个陷阱是需要手动设计的模式,其结果可能是不直觉的,需要大量的验证集来调整。为了应对挑战,我们提出了一种全自动提示方法Autoseq:(1)我们在序列到序列模型上采用自然语言提示,从而实现自由形式生成和更大的标签搜索空间; (2)我们提出了标签序列 - 无限长度的短语以口头表达标签 - 这消除了手动模板的需求,并且比单个标签单词更具有表现力; (3)我们使用Beam Search自动生成大量的标签序列候选物,并提出对比度重新排列以获得最佳组合。 Autoseq显着胜过其他无手动设计方法,例如软提示调整,适配器调整和自动搜索单个标签单词;生成的标签序列比各种任务上的精选手动序列更好。我们的方法揭示了几次学习中序列模型的潜力,并阐明了通用通用和自动提示的途径。本文的源代码可以从https://github.com/thunlp/seq2seq-prompt获得。
translated by 谷歌翻译
随着预训练的语言模型(PLM)的继续增长,精细调整PLM的硬件和数据要求也会增长。因此,研究人员提出了一种称为\ textit {提示学习}的较轻方法。但是,在调查过程中,我们观察到及时的学习方法是脆弱的,很容易被一些非法构造的提示攻击,从而导致分类错误和PLM的严重安全问题。当前的大多数研究都忽略了基于及时方法的安全问题。因此,在本文中,我们提出了一种恶意提示模板构建方法(\ textbf {stressAttack})来探测PLM的安全性能。研究了几种不友好的模板构建方法,以指导模型错误分类任务。在三个数据集和三个PLM上进行了广泛的实验证明了我们提出的方法提示的有效性。我们还进行实验,以验证我们的方法是否适用于几种镜头。
translated by 谷歌翻译
快速学习已成为现代自然语言处理的新范式,它直接适应培训的语言模型(PLMS)到$ CLOZE $ -Style预测,自回归建模或序列到序列生成,从而导致各种任务的表现。但是,尚未提出及时学习的标准实施框架,以及大多数现有的及时学习码条,通常是不受管制的,仅为特定方案提供有限的实现。由于有许多细节,例如模板策略,初始化策略和语言化策略等,因此需要在快速学习中考虑,从业者面临障碍,以便快速调整所需的迅速学习方法到他们的应用程序。在本文中,我们展示了{OpenPrompt},一个统一的易于使用的工具包,可以通过PLMS快速学习。 OpenPrompt是一项研究型框架,配备了效率,模块化和可扩展性,其组合性允许自由地将不同的PLMS,任务格式和提示模块组合在统一的范例中。用户可以宽松地部署快速学习框架,并在没有约束的情况下在不同的NLP任务上评估它们的泛化。 OpenPrompt在{\ url {https://github.com/thunlp/openprompt}}上公开发布。
translated by 谷歌翻译
迅速的学习方法通​​过诱导更好的几次表现,在他们仍然遵循基于参数的学习范式的同时,引起了自然语言处理的波动。学习中的遗忘和死记硬背的记忆问题可能会遇到不稳定的概括问题。具体而言,香草及时的学习可能难以利用死记硬背的非典型实例,在完全监督的培训或过度贴身模式的情况下使用低射击数据。为了减轻此类局限性,我们以将知识从记忆中解耦的动机发展为有助于模型在概括和记忆之间取得平衡。与香草及时学习相反,重新启动构造了培训实例中的开放式知识店,并在输入,培训和推理过程中实现检索机制,从而使该模型能够从培训语料库中检索相关环境作为能力为提示增强。广泛的实验表明,Retroppt可以在几次射击和零拍设置中获得更好的性能。此外,我们进一步说明,我们提出的撤退可以通过新数据集获得更好的概括能力。对记忆的详细分析确实显示逆转可以减少语言模型对记忆的依赖;因此,改善下游任务的概括。
translated by 谷歌翻译
We describe PromptBoosting, a query-efficient procedure for building a text classifier from a neural language model (LM) without access to the LM's parameters, gradients, or hidden representations. This form of "black-box" classifier training has become increasingly important as the cost of training and inference in large-scale LMs grows. But existing black-box LM classifier learning approaches are themselves computationally inefficient, typically specializing LMs to the target task by searching in a large space of (discrete or continuous) prompts using zeroth-order optimization methods. Instead of directly optimizing in prompt space, PromptBoosting obtains a small pool of prompts via a gradient-free approach and then constructs a large pool of weak learners by pairing these prompts with different elements of the LM's output distribution. These weak learners are then ensembled using the AdaBoost algorithm. The entire learning process requires only a small number of forward passes and no backward pass. Experiments show that PromptBoosting achieves state-of-the-art performance in multiple black-box few-shot classification tasks, and matches or outperforms full fine-tuning in both few-shot and standard learning paradigms, while training 10x faster than existing black-box methods.
translated by 谷歌翻译
示范学习旨在通过在少数射击设置中提供回答的演示来指导及时的预测。尽管取得了令人鼓舞的结果,但现有工作仅将回答的示例与及时模板(包括原始上下文)相连,而无需任何其他操作,从而忽略了迅速示意的依赖性。此外,先前的研究发现,随机替换示威的标签极小地损害了性能,这表明该模型无法正确地了解示威活动所带来的知识。受到人类学习过程的启发,在本文中,我们引入了模仿演示学习(模仿),以通过明确模仿人类审查行为来加强演示学习,其中包括:(1)对比度学习机制,以专注于类似的演示。 (2)证明标签重新预测方法以合并已知知识。实验结果表明,我们提出的方法在14个分类中心中有11个实现了最先进的性能。进一步的研究还证明,模仿 - demo加强了迅速与示威之间的关联,这可以为探索示范学习的工作方式提供基础。
translated by 谷歌翻译
Prompt learning recently become an effective linguistic tool to motivate the PLMs' knowledge on few-shot-setting tasks. However, studies have shown the lack of robustness still exists in prompt learning, since suitable initialization of continuous prompt and expert-first manual prompt are essential in fine-tuning process. What is more, human also utilize their comparative ability to motivate their existing knowledge for distinguishing different examples. Motivated by this, we explore how to use contrastive samples to strengthen prompt learning. In detail, we first propose our model ConsPrompt combining with prompt encoding network, contrastive sampling module, and contrastive scoring module. Subsequently, two sampling strategies, similarity-based and label-based strategies, are introduced to realize differential contrastive learning. The effectiveness of proposed ConsPrompt is demonstrated in five different few-shot learning tasks and shown the similarity-based sampling strategy is more effective than label-based in combining contrastive learning. Our results also exhibits the state-of-the-art performance and robustness in different few-shot settings, which proves that the ConsPrompt could be assumed as a better knowledge probe to motivate PLMs.
translated by 谷歌翻译
在本文中,我们描述了我们参与Case-2022的子任务1,即与休闲新闻语料库的事件因果关系识别。我们通过在少数带注释的示例(即几次配置)上利用一组简单但互补的技术来解决因果关系识别(CRI)任务。我们遵循一种基于迅速的预测方法,用于微调LMS,其中CRI任务被视为掩盖语言建模问题(MLM)。这种方法允许LMS在MLM问题上进行本地预先训练,可以直接生成对CRI特异性提示的文本响应。我们将此方法的性能与在整个数据集中训练的集合技术进行比较。我们表现​​最佳的提交仅接受了每班256个实例,整个数据集的一小部分培训,但能够获得第二好的精度(0.82),第三好的精度(0.82)和F1得分。 (0.85)非常接近获胜者团队(0.86)的报道。
translated by 谷歌翻译
预训练模型已在许多代码智能任务中有效。这些模型在大规模未标记的语料库中进行了预训练,然后在下游任务中进行了微调。但是,由于预训练和下游任务的输入是不同的形式,因此很难充分探索预训练模型的知识。此外,微调的性能强烈依赖于下游数据的量,而实际上,具有稀缺数据的场景很常见。自然语言处理(NLP)领域的最新研究表明,迅速调整,一种调整的新范式,减轻上述问题并在各种NLP任务中实现了有希望的结果。在迅速调整中,在调整过程中插入的提示提供了特定于任务的知识,这对于具有相对较少数据的任务特别有益。在本文中,我们凭经验评估了代码智能任务中迅速调整的用法和效果。我们对流行的预训练模型Codebert和codet5进行及时调整,并尝试三个代码智能任务,包括缺陷预测,代码摘要和代码翻译。我们的实验结果表明,在所有三个任务中,迅速调整始终优于微调。此外,及时调整在低资源场景中显示出很大的潜力,例如,对于代码摘要,平均将微调的BLEU分数提高了26%以上。我们的结果表明,我们可以调整代码智能任务的迅速调整,以实现更好的性能,尤其是在缺乏特定于任务的数据时,我们可以调整及时调整。
translated by 谷歌翻译
非常大的预培训的语言模型(PTM)(如GPT-3)通常被释放为服务,允许用户设计特定于任务的提示以通过一些黑盒API查询PTMS。在这样的场景中,我们调用语言模型 - AS-Service(LMAAS),PTM的梯度通常不可用。我们可以通过仅访问模型推断API来优化任务提示吗?基于最近的观察结果,大型PTMS具有非常低的内在维度,这项工作提出了黑匣子调谐,通过无衍生算法优化PTM。特别是,我们通过迭代调用PTM推断API来调用CMA-es以优化预先提示的连续提示。我们的实验结果表明,黑匣子调整罗伯塔在少数标签样本上不仅显着优于手动提示和GPT-3的上下文学习,而且还超越了基于梯度的对应物,即提示调整和完整的模型调整。
translated by 谷歌翻译
With increasing scale, large language models demonstrate both quantitative improvement and new qualitative capabilities, especially as zero-shot learners, like GPT-3. However, these results rely heavily on delicate prompt design and large computation. In this work, we explore whether the strong zero-shot ability could be achieved at a smaller model scale without any external supervised data. To achieve this goal, we revisit masked language modeling and present a geometry-guided self-supervised learning method (Go-tuningfor short) by taking a small number of task-aware self-supervised data to update language models further. Experiments show that Go-tuning can enable T5-small (80M) competitive zero-shot results compared with large language models, such as T5-XL (3B). We also apply Go-tuning on multi-task settings and develop a multi-task model, mgo-T5 (250M). It can reach the average performance of OPT (175B) on 9 datasets.
translated by 谷歌翻译