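Recent advances in large pre-trained language models (PLMs) have led to impressive gains on natural language understanding (NLU) tasks with task-specific fine-tuning. However, directly fine-tuning PLMs relies heavily on a large number of labeled instances, which are usually hard to obtain. Prompt-based tuning of PLMs has proven valuable for various few-shot tasks. Existing work on prompt-based tuning for few-shot NLU tasks mainly focuses on deriving proper label words with a verbalizer or on generating prompt templates to elicit semantics from PLMs. In addition, conventional data augmentation methods have also been verified to be useful for few-shot tasks. However, there are currently few data augmentation methods designed for the prompt-based tuning paradigm. Therefore, we study a new problem of data augmentation for prompt-based few-shot learners. Since label semantics are essential in prompt-based tuning, we propose a novel label-guided data augmentation method, PromptDA, which exploits enriched label semantic information for data augmentation. Extensive experimental results on few-shot text classification tasks show that our proposed framework achieves superior performance by effectively leveraging label semantics and data augmentation for natural language understanding.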
translated by 谷歌翻译
The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF (better few-shot fine-tuning of language models; alternatively, language models' best friends forever), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples. Our approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation; and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Finally, we present a systematic evaluation for analyzing few-shot performance on a range of NLP tasks, including classification and regression. Our experiments demonstrate that our methods combine to dramatically outperform standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average across all tasks. Our approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning. Our implementation is publicly available at https://github.com/princeton-nlp/LM-BFF. The first two authors contributed equally.
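To make the prompt-based formulation concrete, here is a minimal sketch of how a classification input can be wrapped in a template with a mask slot and how a verbalizer maps label words back to classes. The template, the label words, and the `mask_word_probs` dictionary are hypothetical stand-ins for a masked language model's output at the mask position; this is an illustration of the general idea, not LM-BFF's actual implementation.

```python
# Toy sketch of prompt-based classification: template + verbalizer.
# Every name and number below is an illustrative assumption.

TEMPLATE = "{sentence} It was [MASK]."  # hypothetical manual template
VERBALIZER = {"positive": "great", "negative": "terrible"}  # label -> label word


def build_prompt(sentence: str) -> str:
    """Wrap the input sentence in the template."""
    return TEMPLATE.format(sentence=sentence)


def classify(mask_word_probs: dict) -> str:
    """Return the label whose label word is most probable at the [MASK] slot.

    `mask_word_probs` stands in for the vocabulary distribution that a
    masked language model would produce at the mask position.
    """
    scores = {
        label: mask_word_probs.get(word, 0.0)
        for label, word in VERBALIZER.items()
    }
    if sum(scores.values()) == 0:
        raise ValueError("no label word received any probability mass")
    return max(scores, key=scores.get)
```

In a real setup the probabilities would come from a model's prediction at the mask token, and fine-tuning would update the model so that the correct label word becomes more likely.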
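Recently, the new paradigm of "pre-train, prompt, and predict" has achieved remarkable success compared with the "pre-train, fine-tune" paradigm. Following the success of prompt-based GPT-3, a series of prompt learning methods based on masked language models (MLMs), such as BERT and RoBERTa, became popular and widely used. However, another efficient pre-trained discriminative model, ELECTRA, has probably been neglected. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using our proposed prompt learning method based on replaced token detection (RTD). Experimental results show that the ELECTRA model with RTD-prompt learning achieves surprisingly strong, state-of-the-art zero-shot performance. Numerically, compared with MLM-RoBERTa-large and MLM-BERT-large, our RTD-ELECTRA-large achieves an average improvement of about 8.4% and 13.7%, respectively, across all 15 tasks. In particular, on the SST-2 task, our RTD-ELECTRA-large reaches a surprising 90.1% accuracy without any training data. Overall, compared with pre-trained masked language models, the pre-trained replaced token detection model performs better in zero-shot learning. Therefore, ELECTRA is an excellent zero-shot learner. The source code is available at: https://github.com/nishiwen1214/rtd-electra.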
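Prompt learning approaches have made waves in natural language processing by inducing better few-shot performance, yet they still follow a parametric learning paradigm; the problems of forgetting and rote memorization during learning may lead to unstable generalization. Specifically, vanilla prompt learning may struggle to make use of atypical instances memorized by rote during fully supervised training, or may overfit shallow patterns with low-shot data. To alleviate such limitations, we develop RetroPrompt with the motivation of decoupling knowledge from memorization, helping the model strike a balance between generalization and memorization. In contrast to vanilla prompt learning, RetroPrompt constructs an open-book knowledge store from training instances and implements a retrieval mechanism during input, training, and inference, thus equipping the model with the ability to retrieve related contexts from the training corpus as cues for enhancement. Extensive experiments demonstrate that RetroPrompt obtains better performance in both few-shot and zero-shot settings. Furthermore, we show that RetroPrompt yields better generalization to new datasets. A detailed analysis of memorization indeed reveals that RetroPrompt reduces the reliance of language models on memorization, thereby improving generalization on downstream tasks.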
How can we extend a pre-trained model to many language understanding tasks, without labeled or additional unlabeled data? Pre-trained language models (PLMs) have been effective for a wide range of NLP tasks. However, existing approaches either require fine-tuning on downstream labeled datasets or manually constructing proper prompts. In this paper, we propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding. Unlike previous methods, NPPrompt uses only pre-trained language models and does not require any labeled data or additional raw corpus for further fine-tuning, nor does it rely on humans to construct a comprehensive set of prompt label words. We evaluate NPPrompt against previous major few-shot and zero-shot learning methods on diverse NLP tasks, including text classification, text entailment, similar text retrieval, and paraphrasing. Experimental results demonstrate that NPPrompt outperforms the previous best fully zero-shot method by large margins, with absolute gains of 12.8% in accuracy on text classification and 18.9% on the GLUE benchmark.
Pre-trained language models (PLMs) have exhibited remarkable few-shot learning capabilities when provided a few examples in a natural language prompt as demonstrations of test instances, i.e., in-context learning. However, the performance of in-context learning is susceptible to the choice of prompt format, training examples and the ordering of the training examples. In this paper, we propose a novel nearest-neighbor calibration framework for in-context learning to ease this issue. It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances, which provides a useful supervised signal to calibrate predictions. Thus, our method directly augments the predictions with a $k$-nearest-neighbor ($k$NN) classifier over a datastore of cached few-shot instance representations obtained by PLMs and their corresponding labels. Then adaptive neighbor selection and feature regularization modules are introduced to make full use of a few support instances to reduce the $k$NN retrieval noise. Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning, while even achieving comparable performance with state-of-the-art tuning-based approaches in some sentiment analysis tasks.
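The idea of augmenting in-context predictions with a $k$NN classifier over cached few-shot representations can be sketched as follows. This is a minimal illustration under stated assumptions: the vectors are toy stand-ins for PLM representations, and the linear interpolation with weight `lam` is a common way to combine two distributions that is assumed here for illustration, not necessarily the paper's exact rule. The adaptive neighbor selection and feature regularization modules are not modeled.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_distribution(query, datastore, labels, k=3, temperature=1.0):
    """Label distribution from the k most similar cached instances.

    `datastore` is a list of (vector, label) pairs; the vectors stand in
    for representations a PLM would produce for the few-shot instances.
    """
    neighbors = sorted(
        datastore, key=lambda item: cosine(query, item[0]), reverse=True
    )[:k]
    weights = defaultdict(float)
    for vector, label in neighbors:
        weights[label] += math.exp(cosine(query, vector) / temperature)
    total = sum(weights.values())
    return {label: weights[label] / total for label in labels}


def calibrate(lm_probs, knn_probs, lam=0.5):
    """Interpolate the in-context prediction with the kNN distribution.

    The interpolation weight `lam` is a hypothetical hyperparameter.
    """
    return {
        label: lam * knn_probs[label] + (1.0 - lam) * lm_probs[label]
        for label in lm_probs
    }
```

For example, if the language model slightly prefers the wrong label but the nearest cached instances all carry the right one, the interpolated distribution can flip the decision.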
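Prompting, which casts downstream applications as language modeling tasks, has been shown to be sample-efficient compared with standard fine-tuning of pre-trained models. However, one pitfall of prompting is the need for manually designed patterns, whose outcome can be unintuitive and which require large validation sets to tune. To tackle this challenge, we propose AutoSeq, a fully automatic prompting method: (1) we adopt natural language prompts on sequence-to-sequence models, enabling free-form generation and a larger label search space; (2) we propose label sequences, phrases of indefinite length that verbalize the labels, which eliminate the need for manual templates and are more expressive than single label words; (3) we use beam search to automatically generate a large number of label sequence candidates and propose contrastive re-ranking to obtain the best combination. AutoSeq significantly outperforms other methods that require no manual design, such as soft prompt tuning, adapter tuning, and automatic search over single label words; the generated label sequences are even better than curated manual ones on a variety of tasks. Our method reveals the potential of sequence-to-sequence models in few-shot learning and sheds light on a path toward generic and automatic prompting. The source code of this paper can be obtained from https://github.com/thunlp/seq2seq-prompt.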
Prompt learning has recently become an effective linguistic tool for eliciting PLMs' knowledge on few-shot tasks. However, studies have shown that prompt learning still lacks robustness, since suitable initialization of continuous prompts and expert-first manual prompts are essential to the fine-tuning process. Moreover, humans also use their comparative ability to draw on existing knowledge when distinguishing different examples. Motivated by this, we explore how to use contrastive samples to strengthen prompt learning. In detail, we first propose our model ConsPrompt, which combines a prompt encoding network, a contrastive sampling module, and a contrastive scoring module. Subsequently, two sampling strategies, similarity-based and label-based, are introduced to realize differential contrastive learning. The effectiveness of the proposed ConsPrompt is demonstrated on five different few-shot learning tasks, and the similarity-based sampling strategy is shown to be more effective than the label-based one when combined with contrastive learning. Our results also exhibit state-of-the-art performance and robustness in different few-shot settings, indicating that ConsPrompt can serve as a better knowledge probe for eliciting knowledge from PLMs.
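Few-shot named entity recognition (NER) is essential for entity tagging in limited-resource domains and has therefore received proper attention in recent years. Existing few-shot approaches are mainly evaluated under in-domain settings. In contrast, little is known about how these inherently faithful models perform in cross-domain NER using a few labeled in-domain examples. This paper proposes a two-step rationale-centric data augmentation method to improve the model's generalization ability. Results on several datasets show that our model-agnostic method significantly improves the performance of cross-domain NER tasks compared with previous state-of-the-art methods, including counterfactual data augmentation and prompt-tuning methods. Our code is available at https://github.com/lifan-yuan/factmix.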
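We present Patron, a new method that uses prompt-based uncertainty estimation to select data for fine-tuning pre-trained language models under cold-start scenarios, i.e., when no initial labeled data are available. In Patron, we design (1) a prompt-based uncertainty propagation approach to estimate the importance of data points and (2) a partition-then-rewrite (PTR) strategy to promote sample diversity when querying for annotations. Experiments on six text classification datasets show that Patron outperforms the strongest cold-start data selection baselines by up to 6.9%. Moreover, with only 128 labels, Patron achieves 91.0% and 92.1% of the fully supervised performance based on vanilla fine-tuning and prompt-based learning, respectively. Our implementation of Patron is available at https://github.com/yueyu1030/patron.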
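Prompt tuning is an extremely effective tool for adapting pre-trained models to downstream tasks. However, standard prompt-based methods mainly consider the case where sufficient data are available for downstream tasks. It remains unclear whether the advantage transfers to the few-shot regime, where only limited data are available for each downstream task. Although some works have demonstrated the potential of prompt tuning under few-shot settings, the mainstream methods of searching for discrete prompts or tuning soft prompts with limited data are still very challenging. Through extensive empirical studies, we find that there is still a learning gap between prompt tuning and full fine-tuning. To bridge the gap, we propose a new prompt-tuning framework called Soft Template Tuning (STT). STT combines manual and automatic prompts and treats downstream classification tasks as a masked language modeling task. Comprehensive evaluation across different settings shows that STT can close the gap between fine-tuning and prompt-based methods without introducing additional parameters. Notably, it can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.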
Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that only learn from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach FewGen achieves an overall better result across seven classification tasks of the GLUE benchmark than existing few-shot learning methods, improving no-augmentation methods by 5+ average points, and outperforming augmentation methods by 3+ average points.
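Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts. Recent studies report that prompt-based direct classification eliminates the need for fine-tuning but lacks data and inference scalability. This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text samples from a mixture of real samples. We also propose utilizing soft labels predicted by the language models, effectively distilling knowledge from the large-scale language models and creating textual perturbations simultaneously. We perform data augmentation experiments on diverse classification tasks and show that our method substantially outperforms existing text augmentation methods. Ablation studies and a qualitative analysis provide more insights into our approach.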
Natural language prompts have been shown to facilitate cross-task generalization for large language models. However, with no or limited labeled examples, the cross-task performance is highly sensitive to the choice of prompts, while selecting a high-performing prompt is challenging given the scarcity of labels. To address the issue, we propose a Zero-Label Prompt Selection (ZPS) method that selects prompts without any labeled data or gradient update. Specifically, given the candidate human-written prompts for a task, ZPS labels a set of unlabeled data with a prompt ensemble and uses the pseudo-labels for prompt selection. Experiments show that ZPS improves over prior methods by a sizeable margin in zero-label performance. We also extend ZPS to a few-shot setting and show its advantages over strong baselines such as prompt tuning and model tuning.
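The selection procedure described above (pseudo-label unlabeled data with a prompt ensemble, then pick the prompt that agrees best with the pseudo-labels) can be sketched in a few lines. This is a minimal illustration: plain majority voting is assumed as the ensemble rule, and `predictions_per_prompt` is a hypothetical mapping from prompt name to that prompt's predictions on the same unlabeled examples.

```python
from collections import Counter


def pseudo_labels(predictions_per_prompt):
    """Majority-vote ensemble across prompts for each unlabeled example."""
    n_examples = len(next(iter(predictions_per_prompt.values())))
    labels = []
    for i in range(n_examples):
        votes = Counter(preds[i] for preds in predictions_per_prompt.values())
        labels.append(votes.most_common(1)[0][0])
    return labels


def select_prompt(predictions_per_prompt):
    """Pick the prompt whose predictions best match the pseudo-labels."""
    pseudo = pseudo_labels(predictions_per_prompt)

    def pseudo_accuracy(name):
        preds = predictions_per_prompt[name]
        return sum(p == y for p, y in zip(preds, pseudo)) / len(pseudo)

    return max(predictions_per_prompt, key=pseudo_accuracy)
```

No gold labels or gradient updates are involved: the only inputs are each candidate prompt's zero-shot predictions.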
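Pre-trained masked language models successfully perform few-shot learning by formulating downstream tasks as text infilling. However, discriminative pre-trained models such as ELECTRA, a strong alternative in full-shot settings, do not fit into this paradigm. In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models on a wide range of tasks. ELECTRA is pre-trained to distinguish whether a token is generated or original. We naturally extend this to prompt-based few-shot learning by training the model to score the originality of the target options without introducing new parameters. Our method can be easily adapted to tasks involving multi-token predictions without extra computation overhead. Analysis shows that ELECTRA learns distributions that align better with downstream tasks.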
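A minimal sketch of scoring target options by originality: each candidate option is placed into the prompt, a discriminator estimates how likely each token is to have been replaced, and the option whose tokens look most original wins. The `toy_discriminator` below is a hypothetical stand-in for an ELECTRA-style replaced-token-detection head, and averaging over the tokens of a multi-token option is an assumption made for this illustration.

```python
from typing import Callable, Dict, List, Sequence


def toy_discriminator(tokens: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for a replaced-token-detection head.

    Returns a made-up probability, per token, that the token was replaced.
    """
    suspicious = {"terrible"}
    return [0.9 if tok in suspicious else 0.1 for tok in tokens]


def originality_score(
    prefix: Sequence[str],
    option: Sequence[str],
    suffix: Sequence[str],
    discriminator: Callable[[Sequence[str]], List[float]],
) -> float:
    """Mean probability that the option's tokens are original."""
    if not option:
        raise ValueError("option must contain at least one token")
    tokens = list(prefix) + list(option) + list(suffix)
    replaced = discriminator(tokens)
    start = len(prefix)
    option_probs = replaced[start:start + len(option)]
    return sum(1.0 - p for p in option_probs) / len(option_probs)


def pick_option(
    prefix: Sequence[str],
    options: Dict[str, Sequence[str]],
    suffix: Sequence[str],
    discriminator: Callable[[Sequence[str]], List[float]] = toy_discriminator,
) -> str:
    """Return the label whose option tokens look most original in context."""
    return max(
        options,
        key=lambda name: originality_score(
            prefix, options[name], suffix, discriminator
        ),
    )
```

Because each option is scored in place, multi-token options need no extra decoding steps in this sketch; only the slice of per-token probabilities changes.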
We introduce TeSS (Text Similarity Comparison using Sentence Encoder), a framework for zero-shot classification where the assigned label is determined by the embedding similarity between the input text and each candidate label prompt. We leverage representations from sentence encoders optimized to locate semantically similar samples closer to each other in embedding space during pre-training. The label prompt embeddings serve as prototypes of their corresponding class clusters. Furthermore, to compensate for the potentially poorly descriptive labels in their original format, we retrieve semantically similar sentences from external corpora and additionally use them with the original label prompt (TeSS-R). TeSS outperforms strong baselines on various closed-set and open-set classification datasets under the zero-shot setting, with further gains when combined with label prompt diversification through retrieval. These results are robust to verbalizer variations, an ancillary benefit of using a bi-encoder. Altogether, our method serves as a reliable baseline for zero-shot classification and a simple interface to assess the quality of sentence encoders.
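The core decision rule, assigning the label whose prompt embedding is most similar to the input embedding, can be sketched as below. The vectors are toy stand-ins for sentence-encoder outputs, cosine similarity is assumed as the similarity measure, and the retrieval-based variant (TeSS-R) is not modeled.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_classify(text_embedding, label_prompt_embeddings):
    """Assign the label whose prompt embedding is closest to the text.

    `label_prompt_embeddings` maps each label to the embedding of its label
    prompt, which acts as a prototype for that class.
    """
    return max(
        label_prompt_embeddings,
        key=lambda label: cosine(text_embedding, label_prompt_embeddings[label]),
    )
```

In practice both the input text and the label prompts would be encoded by the same sentence encoder, so the label embeddings can be computed once and reused for every input.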
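We study the problem of few-shot fine-grained entity typing (FET), where only a few annotated entity mentions with their context are given for each entity type. Recently, prompt-based tuning has demonstrated superior performance in few-shot scenarios by formulating the entity type classification task as a "fill-in-the-blank" problem. This allows effective use of the strong language modeling capability of pre-trained language models (PLMs). Despite the success of current prompt-based tuning approaches, two major challenges remain: (1) the verbalizer in prompts is either manually designed or constructed from external knowledge bases, without considering the target corpus and label hierarchy information, and (2) current approaches mainly utilize the representation power of PLMs but have not explored their generation power acquired through extensive general-domain pre-training. In this work, we propose a novel framework for few-shot FET consisting of two modules: (1) an entity type label interpretation module that automatically learns to relate type labels to the vocabulary by jointly leveraging few-shot instances and the label hierarchy, and (2) a type-based contextualized instance generator that produces new instances based on given instances to enlarge the training set for better generalization. On three benchmark datasets, our model outperforms existing methods by a large margin. Code can be found at https://github.com/teapot123/fine-graining-entity-typing.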
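In this paper, we describe our participation in Subtask 1 of CASE-2022, Event Causality Identification with Causal News Corpus. We address the Causal Relation Identification (CRI) task by exploiting a set of simple yet complementary techniques on a small number of annotated examples (i.e., a few-shot configuration). We follow a prompt-based prediction approach for fine-tuning language models (LMs) in which the CRI task is treated as a masked language modeling (MLM) problem. This approach allows LMs natively pre-trained on MLM problems to directly generate textual responses to CRI-specific prompts. We compare the performance of this method against ensemble techniques trained on the entire dataset. Our best-performing submission was trained with only 256 instances per class, a small portion of the entire dataset, yet obtained the second-best precision (0.82), the third-best accuracy (0.82), and an F1-score (0.85) very close to that reported by the winning team (0.86).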
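Prompt learning has been shown to achieve near fine-tuning performance on most text classification tasks with very few training examples, which is advantageous for NLP tasks where samples are scarce. In this paper, we attempt to apply it to a practical scenario, namely resume information extraction, and to enhance the existing method to make it more applicable to this task. In particular, we created multiple sets of manual templates and verbalizers based on the textual characteristics of resumes. In addition, we compared the performance of masked language model (MLM) pre-trained language models (PLMs) and Seq2Seq PLMs on this task. Furthermore, we improve the verbalizer design method for knowledgeable prompt tuning in order to provide an example for the design of prompt templates and verbalizers for other application-based NLP tasks. In this context, we propose the concept of the Manual Knowledgeable Verbalizer (MKV), a rule for constructing a knowledgeable verbalizer that corresponds to the application scenario. Experiments show that templates and verbalizers designed according to our rules are more effective and more robust than existing manual templates and automatically generated prompting methods. We find that currently available automatic prompting methods cannot compete with manually designed prompt templates in some realistic task scenarios. The results of the final confusion matrix indicate that our proposed MKV significantly alleviates the sample imbalance issue.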
Contrastive learning has become a new paradigm for unsupervised sentence embeddings. Previous studies focus on instance-wise contrastive learning, attempting to construct positive pairs with textual data augmentation. In this paper, we propose a novel Contrastive learning method with Prompt-derived Virtual semantic Prototypes (ConPVP). Specifically, with the help of prompts, we construct virtual semantic prototypes for each instance, and derive negative prototypes by using the negative form of the prompts. Using a prototypical contrastive loss, we enforce the anchor sentence embedding to be close to its corresponding semantic prototypes, and far apart from the negative prototypes as well as the prototypes of other sentences. Extensive experimental results on semantic textual similarity, transfer, and clustering tasks demonstrate the effectiveness of our proposed model compared to strong baselines. Code is available at https://github.com/lemon0830/promptCSE.
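As a rough illustration of a prototypical contrastive objective, the sketch below pulls an anchor embedding toward its semantic prototype and pushes it away from a set of negatives, which would include both the negative prototypes and the prototypes of other sentences. The InfoNCE-style form and the `temperature` value are assumptions made for this example; the abstract does not specify the exact loss used in ConPVP.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prototypical_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss over one positive prototype and several negatives.

    The loss is small when the anchor is much closer to its own prototype
    than to any negative, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Swapping the roles of a nearby and a distant prototype shows the intended behavior: the loss rises sharply when the designated positive is the distant one.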