We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show large improvements over strong baselines on both automated and human evaluations. Human judges prefer stories generated by our approach to those from a strong non-hierarchical model by a factor of two to one.
translated by 谷歌翻译
Recent advances in deep learning research, such as transformers, have bolstered the ability for automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a self-attention mechanism to emulate the property of autoregression. This is inherently limiting for tasks such as controllable story generation where it may be necessary to condition on future plot events when writing a story. In this work, we propose Future Sight, a method for finetuning a pretrained generative transformer on the task of future conditioning. Transformer decoders are typically pretrained on the task of completing a context, one token at a time, by means of self-attention. Future Sight additionally enables a decoder to attend to an encoded future plot event. This motivates the decoder to expand on the context in a way that logically concludes with the provided future. During inference, the future plot event can be written by a human author to steer the narrative being generated in a certain direction. We evaluate the efficacy of our approach on a story generation task with human evaluators.
translated by 谷歌翻译
长篇小说(LSG)是自然语言处理中的垂灾目标之一。与大多数文本生成任务不同,LSG要求基于更短的文本输入输出丰富的内容的长话,并且通常存在信息稀疏性。在本文中,我们提出了\ emph {topnet}来缓解这个问题,通过利用神经主题建模的最新进步来获得高质量的骨架词来补充短输入。特别是,而不是直接生成故事,首先学会将简短的文本输入映射到低维主题分布(由主题模型预先分配)。基于此潜在主题分布,我们可以使用主题模型的重建解码器来对与故事的骨骼相互相关的单词序列。两个基准数据集的实验表明,我们的框架在骨架词选择中非常有效,在自动评估和人类评估中显着优于最先进的模型。
translated by 谷歌翻译
预训练的语言模型(PLM)无法生成长形式的叙事文本,因为它们不考虑全局结构。结果,生成的文本通常是不巧妙的,重复的或缺乏内容的。故事发电的最新工作以提示,关键字或语义框架的形式重新引入了明确的内容计划。经过大型平行语料库的培训,这些模型可以生成更合乎逻辑的事件序列,从而产生更满足的故事。但是,这些中间表示通常不使用自然语言,并且不需要微调就无法使用。我们建议使用现成的PLM生成故事情节,同时保持内容计划的好处,以产生凝聚力和满足的故事。我们提出的方法ScratchPlot首先提示PLM构成内容计划。然后,我们生成故事的身体并以内容计划结束。此外,我们通过使用其他PLM来对生成的(故事,结尾)对进行排名。我们用各种基线基准测试我们的方法,并在人类和自动评估中取得了卓越的结果。
translated by 谷歌翻译
本文对过去二十年来对自然语言生成(NLG)的研究提供了全面的审查,特别是与数据到文本生成和文本到文本生成深度学习方法有关,以及NLG的新应用技术。该调查旨在(a)给出关于NLG核心任务的最新综合,以及该领域采用的建筑;(b)详细介绍各种NLG任务和数据集,并提请注意NLG评估中的挑战,专注于不同的评估方法及其关系;(c)强调一些未来的强调和相对近期的研究问题,因为NLG和其他人工智能领域的协同作用而增加,例如计算机视觉,文本和计算创造力。
translated by 谷歌翻译
Controllable Text Generation (CTG) is emerging area in the field of natural language generation (NLG). It is regarded as crucial for the development of advanced text generation technologies that are more natural and better meet the specific constraints in practical applications. In recent years, methods using large-scale pre-trained language models (PLMs), in particular the widely used transformer-based PLMs, have become a new paradigm of NLG, allowing generation of more diverse and fluent text. However, due to the lower level of interpretability of deep neural networks, the controllability of these methods need to be guaranteed. To this end, controllable text generation using transformer-based PLMs has become a rapidly growing yet challenging new research hotspot. A diverse range of approaches have emerged in the recent 3-4 years, targeting different CTG tasks which may require different types of controlled constraints. In this paper, we present a systematic critical review on the common tasks, main approaches and evaluation methods in this area. Finally, we discuss the challenges that the field is facing, and put forward various promising future directions. To the best of our knowledge, this is the first survey paper to summarize CTG techniques from the perspective of PLMs. We hope it can help researchers in related fields to quickly track the academic frontier, providing them with a landscape of the area and a roadmap for future research.
translated by 谷歌翻译
We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach. The idea is to create a short, one-sentence news summary answering the question "What is the article about?". We collect a real-world, large scale dataset for this task by harvesting online articles from the British Broadcasting Corporation (BBC). We propose a novel abstractive model which is conditioned on the article's topics and based entirely on convolutional neural networks. We demonstrate experimentally that this architecture captures longrange dependencies in a document and recognizes pertinent content, outperforming an oracle extractive system and state-of-the-art abstractive approaches when evaluated automatically and by humans. 1
translated by 谷歌翻译
深度神经语言模型的最新进展与大规模数据集的能力相结合,加速了自然语言生成系统的发展,这些系统在多种任务和应用程序上下文中产生流利和连贯的文本(在各种成功程度上)。但是,为所需的用户控制这些模型的输出仍然是一个开放的挑战。这不仅对于自定义生成语言的内容和样式至关重要,而且对于他们在现实世界中的安全可靠部署至关重要。我们提出了一项关于受约束神经语言生成的新兴主题的广泛调查,在该主题中,我们通过区分条件和约束(后者是在输出文本上而不是输入的可检验条件),正式定义和分类自然语言生成问题,目前是可检验的)约束文本生成任务,并查看受限文本生成的现有方法和评估指标。我们的目的是强调这个新兴领域的最新进展和趋势,以告知最有希望的方向和局限性,以推动受约束神经语言生成研究的最新作品。
translated by 谷歌翻译
当前的语言模型达到了较低的困惑,但其产生的几代人仍然遭受有毒的反应,重复性和矛盾。标准语言建模设置无法解决这些问题。在本文中,我们介绍了一个新的体系结构{\ sc导演},由一个统一的生成器分类器组成,具有语言建模和每个输出令牌的分类头。培训是使用标准语言建模数据共同进行的,并以所需和不良序列标记的数据。与标准语言模型相比,该模型在多种设置中的实验表明,该模型具有竞争性的培训和解码速度,同时产生了较高的结果,从而减轻了已知的问题,同时保持发电质量。就准确性和效率而言,它还优于现有的模型指导方法。
translated by 谷歌翻译
连接视觉和语言在生成智能中起着重要作用。因此,已经致力于图像标题的大型研究工作,即用句法和语义有意义的句子描述图像。从2015年开始,该任务通常通过由Visual Encoder组成的管道和文本生成的语言模型来解决任务。在这些年来,两种组件通过对象区域,属性,介绍多模态连接,完全关注方法和伯特早期融合策略的利用而显着发展。但是,无论令人印象深刻的结果,图像标题的研究还没有达到结论性答案。这项工作旨在提供图像标题方法的全面概述,从视觉编码和文本生成到培训策略,数据集和评估度量。在这方面,我们量化地比较了许多相关的最先进的方法来确定架构和培训策略中最有影响力的技术创新。此外,讨论了问题的许多变体及其开放挑战。这项工作的最终目标是作为理解现有文献的工具,并突出显示计算机视觉和自然语言处理的研究领域的未来方向可以找到最佳的协同作用。
translated by 谷歌翻译
The pre-dominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens. We propose a novel simplified gating mechanism that outperforms Oord et al. (2016b) and investigate the impact of key architectural decisions. The proposed approach achieves state-of-the-art on the WikiText-103 benchmark, even though it features longterm dependencies, as well as competitive results on the Google Billion Words benchmark. Our model reduces the latency to score a sentence by an order of magnitude compared to a recurrent baseline. To our knowledge, this is the first time a non-recurrent approach is competitive with strong recurrent models on these large scale language tasks.
translated by 谷歌翻译
将文本插入段落中指定位置的任务(称为空白(FITB))对于各种应用程序与作家与自然语言生成(NLG)系统互动以制作文本的应用很有用。虽然先前的工作已经通过专门培训的模型来解决此问题,但更有用的模型是可以有效地执行_both_ fitb和延续的模型。在这项工作中,我们评估了使用单个模型完成这两个任务的可行性。我们表明,通过FITB式目标进行预训练的模型都可以完成这两个任务,而预先训练的持续训练的模型却没有。最后,我们展示了如何轻松地对FITB模型进行填充,以允许对一代的长度和单词选择进行细粒度的控制。
translated by 谷歌翻译
当前的语言模型可以产生高质量的文本。他们只是复制他们之前看到的文本,或者他们学习了普遍的语言抽象吗?要取笑这些可能性,我们介绍了乌鸦,这是一套评估生成文本的新颖性,专注于顺序结构(n-gram)和句法结构。我们将这些分析应用于四种神经语言模型(LSTM,变压器,变换器-XL和GPT-2)。对于本地结构 - 例如,单个依赖性 - 模型生成的文本比来自每个模型的测试集的人类生成文本的基线显着不那么新颖。对于大规模结构 - 例如,总句结构 - 模型生成的文本与人生成的基线一样新颖甚至更新颖,但模型仍然有时复制,在某些情况下,在训练集中重复超过1000字超过1,000字的通道。我们还表现了广泛的手动分析,表明GPT-2的新文本通常在形态学和语法中形成良好,但具有合理的语义问题(例如,是自相矛盾)。
translated by 谷歌翻译
In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora. We propose several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-toword structure, and emitting words that are rare or unseen at training time. Our work shows that many of our proposed models contribute to further improvement in performance. We also propose a new dataset consisting of multi-sentence summaries, and establish performance benchmarks for further research.
translated by 谷歌翻译
我们介绍了块状变压器,该变压器以序列的反复方式应用变压器层,并且相对于序列长度具有线性复杂性。我们的复发单元在训练过程中在代币的块而不是单个令牌上运行,并利用块内并行计算,以便有效利用加速器硬件。单元本身非常简单。它仅仅是一个变压器层:它使用自我注意事项和交叉注意力来有效计算大量状态向量和令牌上的复发函数。我们的设计部分受到LSTM单元的启发,它使用LSTM风格的大门,但它可以将典型的LSTM单元缩放为几个数量级。我们的复发实现在计算时间和参数计数中都具有相同的成本作为传统的变压器层,但是在很长的序列中,语言建模任务中的语言建模任务的困惑极大地改善了。我们的模型比远程变压器XL基线的表现宽大,同时运行的速度是两倍。我们证明了它在PG19(书籍),Arxiv论文和GitHub源代码上的有效性。我们的代码已发布为开​​源。
translated by 谷歌翻译
Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU-1 score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69. We also show BLEU-1 score improvements on Flickr30k, from 56 to 66, and on SBU, from 19 to 28. Lastly, on the newly released COCO dataset, we achieve a BLEU-4 of 27.7, which is the current state-of-the-art.
translated by 谷歌翻译
多文件摘要(MDS)是信息聚合的有效工具,它从与主题相关文档集群生成信息和简洁的摘要。我们的调查是,首先,系统地概述了最近的基于深度学习的MDS模型。我们提出了一种新的分类学,总结神经网络的设计策略,并进行全面的最先进的概要。我们突出了在现有文献中很少讨论的各种客观函数之间的差异。最后,我们提出了与这个新的和令人兴奋的领域有关的几个方向。
translated by 谷歌翻译
基于神经语言模型的自动化故事生成方法遭受了两个重要的限制。首先,基于语言模型的故事生成器通常不适用于给定的目标或结束。其次,当故事变长时,他们经常失去一致。我们提出了一种新的自动化故事生成方法,将问题视为生成的问答之一。我们所提出的故事生成系统从封装故事的最终事件的句子开始。然后系统迭代地(1)分析描述最新事件的文本,(2)生成关于“为什么”一个字符正在执行他们在事件中执行的事情的问题,然后(3)尝试生成另一个前面的回答这个问题的事件。
translated by 谷歌翻译
当前有效的微调方法(例如,适配器,前缀调整等)通过培训一小组神经语言模型的额外参数进行优化的条件文本生成,同时冻结其余效率。虽然在某些一代任务中显示出强大表现,但它们不会概括所有一代任务。在这项工作中,我们表明可以提高基于迅速的条件文本生成,简单而有效的方法模拟了人类书面文本的话语结构建模。我们介绍了两个关键设计选择:首先,我们表明人写文本的更高级别的话语结构可以用前缀参数上的\ Textit {分层阻塞}建模,使得能够跨越输入和输出文本的不同部分,并产生更长度的输出几代人。其次,我们通过在网络上的不同层的前缀参数上引入\ texit {注意稀疏性}来提出稀疏的前缀调整,并分别学习SoftMax函数上的稀疏变换。我们发现稀疏的注意力使前缀调整能够更好地控制输入内容(突出事实),从而更有效地调整前缀参数。在各种文本生成任务上的实验表明,前缀参数的结构化设计可以实现可比的结果,以微调所有参数,同时即使在低资源设置中也表现出所有生成任务的标准前缀调整。
translated by 谷歌翻译
The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. 1 Compared to recurrent models, computations over all elements can be fully parallelized during training to better exploit the GPU hardware and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.
translated by 谷歌翻译