We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and Con-ceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET ) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.
translated by 谷歌翻译
近年来带来了对自然语言理解领域的勤义代表和推理的重新兴趣。新的致辞知识图表(CSKG)的发展是这些进步的核心,因为他们的不同事实可以通过机器学习模型来解决新的和具有挑战性的任务。与此同时,由于全面地涵盖了一般勤杂朗知识所需的大规模规模,对这些资源的质量和覆盖率仍存在疑问。在这项工作中,我们将手动构建的CSKGS分配在NLP代理商遇到的所有情况下,我们将永远不会实现适用所需的覆盖范围。因此,我们提出了一种新的评估框架,用于测试KGS的效用,基于如何从中学习有效的隐式知识表示。通过这一新目标,我们提出了一个含有知识的全新CSKG的新CSKG,该知识不容易获得预用的语言模型。我们与其他领先的CSKG相比,评估其属性,表现了对勤杂朗语言知识资源的第一个大规模对研究。接下来,我们显示原子2020更适合培训知识模型,可以为新的,看不见的实体和事件产生准确,代表知识。最后,通过人类评估,我们表明,尽管使用超过430倍的参数,但GPT-3(175B参数)的几次射击性能较低,而令人印象深刻,令人印象深刻,令人印象深刻,令人印象深刻,仍然低于原子型2020的巴特的知识模型。
translated by 谷歌翻译
We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as typed if-then relations with variables (e.g., "if X pays Y a compliment, then Y will likely return the compliment"). We propose nine if-then relation types to distinguish causes vs. effects, agents vs. themes, voluntary vs. involuntary events, and actions vs. mental states. By generatively training on the rich inferential knowledge described in ATOMIC, we show that neural models can acquire simple commonsense capabilities and reason about previously unseen events. Experimental results demonstrate that multitask models that incorporate the hierarchical structure of if-then relation types lead to more accurate inference compared to models trained in isolation, as measured by both automatic and human evaluation.
translated by 谷歌翻译
The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. Our study leads to a new framework, Symbolic Knowledge Distillation. As with prior art in Knowledge Distillation (Hinton et al., 2015), our approach uses larger models to teach smaller models. A key difference is that we distill knowledge symbolically-as text-in addition to the neural model. We also distill only one aspect-the commonsense of a general language model teacher, allowing the student to be a different type, a commonsense model. Altogether, we show that careful prompt engineering and a separately trained critic model allow us to selectively distill high-quality causal commonsense from GPT-3, a general language model. Empirical results demonstrate that, for the first time, a human-authored commonsense knowledge graph is surpassed by our automatically distilled variant in all three criteria: quantity, quality, and diversity. In addition, it results in a neural commonsense model that surpasses the teacher model's commonsense capabilities despite its 100x smaller size. We apply this to the ATOMIC resource, and share our new symbolic knowledge graph and commonsense models.
translated by 谷歌翻译
Knowledge graphs (KG) have served as the key component of various natural language processing applications. Commonsense knowledge graphs (CKG) are a special type of KG, where entities and relations are composed of free-form text. However, previous works in KG completion and CKG completion suffer from long-tail relations and newly-added relations which do not have many know triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of graph representation learning and few-shot learning, has been proposed to challenge the problem of limited annotated data. In this paper, we comprehensively survey previous attempts on such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KGs and the methods. Finally, we present applications of FKGC models on prediction tasks in different areas and share our thoughts on future research directions of FKGC.
translated by 谷歌翻译
预训练的语言模型(PTLM)已显示出在自然语言任务上表现良好。许多先前的作品都以通过知识图(KGS)标记的关系链接的实体的形式利用结构性常识来协助PTLM。检索方法使用kg作为单独的静态模块,该模块限制了覆盖范围,因为kgs包含有限的知识。生成方法训练PTLMS kg三倍以提高获得知识的规模。但是,对符号KG实体的培训限制了其在涉及自然语言文本的任务中的适用性,在这些任务中,它们忽略了整体上下文。为了减轻这种情况,我们提出了一个以句子为条件的常识性上下文化器(COSE-CO)作为输入,以使其在生成与输入文本的整体上下文相关的任务中通常可用。为了训练Cose-Co,我们提出了一个新的数据集,其中包括句子和常识知识对。 COSE-CO推断出的知识是多种多样的,并且包含了基础KG中不存在的新实体。我们增强了在多选质量质量检查和开放式常识性推理任务中产生的知识,从而改善了CSQA,ARC,QASC和OBQA数据集的当前最佳方法。我们还展示了其在改善释义生成任务的基线模型方面的适用性。
translated by 谷歌翻译
自动化讲故事长期以来一直抓住了研究人员在日常生活中的叙述中的难以感受。但是,在用神经语言模型产生叙述时,保持一致性并保持对特定结束的特定结束挑战。在本文中,我们介绍了读者模型(Storm)的故事生成,这是一个框架,其中读者模型用于推理故事的推理应该进步。读者模型是人类读者相信关于虚构故事世界的概念,实体和关系的人。我们展示了如何作为知识图表所代表的明确读者模型提供故事一致性,并以实现给定的故事世界目标的形式提供可控性。实验表明,我们的模型产生了显着更加连贯和主题的故事,优于尺寸的基线,包括情节合理性并保持主题。我们的系统也优于在未订购的情况下在组成给定概念时占总引导的故事生成基线。
translated by 谷歌翻译
符号知识图(kgs)是通过昂贵的人众包或特定于域特异性的复杂信息提取管道来构建的。诸如BERT之类的新兴大型语言模型(LMS)已显示出隐式编码的大量知识,可以使用正确设计的提示来查询。但是,与明确的公斤相比,黑盒LMS中的知识通常很难访问或编辑,并且缺乏解释性。在这项工作中,我们旨在从LMS收获符号KG,这是一个由神经LMS的灵活性和可扩展性增强的自动kg构造的新框架。与通常依赖大型人类注释的数据或现有大量KG的先前作品相比,我们的方法仅需要对关系的最小定义作为输入,因此适合于以前无法提取有关丰富新关系的知识。该方法会自动生成多样化的提示,并在给定的LM内执行有效的知识搜索,以进行一致和广泛的输出。与以前的方法相比,使用我们的方法收获的知识要准确得多,如自动和人类评估所示。结果,我们源于多元化的LMS,一个新的KG家族(例如Bertnet和Robertanet),其中包含一套更丰富的常识关系,包括复杂的关系(例如,A对B的能力,但不擅长B”)人类注销的kg(例如概念网)。此外,由此产生的kg也是解释各自的源LMS的工具,从而导致对不同LMS不同知识能力的新见解。
translated by 谷歌翻译
Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for global coherence in generated stories, still hamper generative models from reaching the same storytelling ability as human narrators. To tackle these challenges, many studies seek to inject structured knowledge into the generation process, which is referred to as structure knowledge-enhanced story generation. Incorporating external knowledge can enhance the logical coherence among story events, achieve better knowledge grounding, and alleviate over-generalization and repetition problems in stories. This survey provides the latest and comprehensive review of this research field: (i) we present a systematical taxonomy regarding how existing methods integrate structured knowledge into story generation; (ii) we summarize involved story corpora, structured knowledge datasets, and evaluation metrics; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation and cast light on promising directions for future study.
translated by 谷歌翻译
In this paper, we present kogito, an open-source tool for generating commonsense inferences about situations described in text. kogito provides an intuitive and extensible interface to interact with natural language generation models that can be used for hypothesizing commonsense knowledge inference from a textual input. In particular, kogito offers several features for targeted, multi-granularity knowledge generation. These include a standardized API for training and evaluating knowledge models, and generating and filtering inferences from them. We also include helper functions for converting natural language texts into a format ingestible by knowledge models - intermediate pipeline stages such as knowledge head extraction from text, heuristic and model-based knowledge head-relation matching, and an ability to define and use custom knowledge relations. We make the code for kogito available at https://github.com/epfl-nlp/kogito along with thorough documentation at https://kogito.readthedocs.io.
translated by 谷歌翻译
在本文中,我们建议利用对话的独特特征,共享参与者的常识性知识,以解决总结它们的困难。我们提出了病态的框架,该框架使用常识推论作为其他背景。与以前仅依赖于输入对话的工作相比,Sick使用外部知识模型来生成丰富的常识推断,并选择具有基于相似性选择方法的最可能的推理。基于生病的,病人++的理解为监督,在总结多任务学习环境中的对话时,添加了产生常识推断的任务。实验结果表明,通过注入常识性知识,我们的框架比现有方法产生更多信息和一致的摘要。
translated by 谷歌翻译
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fillin-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-theart pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https: //github.com/facebookresearch/LAMA.
translated by 谷歌翻译
对事件序列的预测对于信息检索和自然语言处理中的许多现实世界应用至关重要。在事件序列预测中,未来的活动生成(FEG)是一项具有挑战性的任务,因为它不仅需要流利的文本生成,而且需要常识性推理才能保持整个事件故事的逻辑连贯性。在本文中,我们提出了一个新颖的可解释的FEG框架COEP。它突出并整合了两种类型的事件知识,对直接事件事件关系的顺序知识以及推论知识,这些知识反映了事件之间的中间角色心理学(例如意图,原因,反应),这些心理本质地将故事推向了故事。为了减轻知识遗忘问题,我们为每种类型的知识设计了两个模块,即IM和GM,它们是通过及时调整组合的。首先,IM专注于理解推论知识,以产生常识性解释并为通用汽车提供软提示向量。我们还设计了一种对比歧视器,以提高概括能力。其次,GM通过用IM的指导对直接顺序知识进行建模来生成未来事件。自动和人类评估表明,我们的方法可以产生更连贯,具体和逻辑的未来事件。
translated by 谷歌翻译
关于概念及其属性的常识知识(CSK)有助于AI应用程序。诸如ConceptNet之类的先前作品已经编译了大型CSK集合。但是,它们的表现力限制在主题性 - 预处理(SPO)的三联元中,对p和o的s和字符串的简单概念。与先前的作品相比,CSK断言具有精致的表现力和更好的精度和回忆。 Ascent ++通过用子组和方面捕获复合概念,以及用语义方面的主张来捕获复合概念。后者对于表达断言和进一步预选赛的时间和空间有效性至关重要。此外,Ascent ++将开放信息提取(OpenIE)与典型性和显着性分数的明智清洁和排名相结合。对于高覆盖范围,我们的方法挖掘到具有广泛的Web内容的大规模爬网C4中。通过人类判断的评估显示了上升++ Kb的卓越质量,以及对QA支持任务的外部评估强调了Ascent ++的好处。可以在https://ascentpp.mpi-inf.mpg.de/上访问Web界面,数据和代码。
translated by 谷歌翻译
When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background. To investigate question answering with prior knowledge, we present COMMONSENSEQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we extract from CON-CEPTNET (Speer et al., 2017) multiple target concepts that have the same semantic relation to a single source concept. Crowd-workers are asked to author multiple-choice questions that mention the source concept and discriminate in turn between each of the target concepts. This encourages workers to create questions with complex semantics that often require prior knowledge. We create 12,247 questions through this procedure and demonstrate the difficulty of our task with a large number of strong baselines. Our best baseline is based on BERT-large (Devlin et al., 2018) and obtains 56% accuracy, well below human performance, which is 89%.
translated by 谷歌翻译
Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities. And yet, scale appears to be the winning recipe; after all, the largest models seem to have acquired the largest amount of commonsense capabilities. Or is it? In this paper, we investigate the possibility of a seemingly impossible match: can smaller language models with dismal commonsense capabilities (i.e., GPT-2), ever win over models that are orders of magnitude larger and better (i.e., GPT-3), if the smaller models are powered with novel commonsense distillation algorithms? The key intellectual question we ask here is whether it is possible, if at all, to design a learning algorithm that does not benefit from scale, yet leads to a competitive level of commonsense acquisition. In this work, we study the generative models of commonsense knowledge, focusing on the task of generating generics, statements of commonsense facts about everyday concepts, e.g., birds can fly. We introduce a novel commonsense distillation framework, I2D2, that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale models as the teacher model by two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities. Empirical results suggest that scale is not the only way, as novel algorithms can be a promising alternative. Moreover, our study leads to a new corpus of generics, Gen-A-Tomic, that is of the largest and highest quality available to date.
translated by 谷歌翻译
本文对过去二十年来对自然语言生成(NLG)的研究提供了全面的审查,特别是与数据到文本生成和文本到文本生成深度学习方法有关,以及NLG的新应用技术。该调查旨在(a)给出关于NLG核心任务的最新综合,以及该领域采用的建筑;(b)详细介绍各种NLG任务和数据集,并提请注意NLG评估中的挑战,专注于不同的评估方法及其关系;(c)强调一些未来的强调和相对近期的研究问题,因为NLG和其他人工智能领域的协同作用而增加,例如计算机视觉,文本和计算创造力。
translated by 谷歌翻译
在本文中,我们提出了Tetris,这是一个面向目标脚本完成的新任务。与以前的工作不同,它考虑了一个更现实,更通用的设置,其中输入不仅包括目标,还包括其他用户上下文,包括偏好和历史记录。为了使用基于知识的方法解决问题,我们介绍了任务概念图,这是一种自动从教学网站构建的知识库。不同于常识知识基础(例如ConceptNet),任务概念图架构架构介绍了专门用于完成任务的各种基于名词短语的节点。为了将这些图形集成到脚本学习中,我们设计了两种从知识库中获取概念的方法,以作为下游脚本完成的提示。在我们的基于Wikihow的数据集中,我们发现从任务概念图中合并概念会始终提高性能,并证明任务概念图的好处。此外,具有金色标准概念的模型迅速胜过基线,进一步证实了在目标脚本完成中对特定于任务知识的需求。数据集,存储库,模型和演示将公开使用,以促进对这项新任务的进一步研究。
translated by 谷歌翻译
大型预先训练的生成语言模型的出现为AI故事的常见框架通过采样模型来创建持续故事的序列。然而,单独的抽样对故事产生不足。特别是,很难指导语言模型来创建故事以达到特定的目标事件。我们提出了两种在深增强学习和奖励塑造的自动化技术,以控制计算机生成的故事的情节。首先利用近端策略优化来微调现有的基于变换器的语言模型,以生成文本持续,而且是寻求目标。第二种提取来自展开故事的知识图,该故事由策略网络使用,具有图注意选择由语言模型生成的候选继续。我们报告了与故事如何实现给定的目标事件以及与基线和消融相比的一致性和整体故事质量的人类参与者排名的自动化指标报告。
translated by 谷歌翻译
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a;Radford et al., 2018), BERT is designed to pretrain deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be finetuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial taskspecific architecture modifications.BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).
translated by 谷歌翻译