Neural models with an encoder-decoder framework provide a feasible solution to Question Generation (QG). However, after analyzing the model vocabulary we find that current models (both RNN-based and pre-training based) have more than 23\% inflected forms. As a result, the encoder will generate separate embeddings for the inflected forms, leading to a waste of training data and parameters. Even worse, in decoding these models are vulnerable to irrelevant noise and they suffer from high computational costs. In this paper, we propose an approach to enhance the performance of QG by fusing word transformation. Firstly, we identify the inflected forms of words from the input of encoder, and replace them with the root words, letting the encoder pay more attention to the repetitive root words. Secondly, we propose to adapt QG as a combination of the following actions in the encode-decoder framework: generating a question word, copying a word from the source sequence or generating a word transformation type. Such extension can greatly decrease the size of predicted words in the decoder as well as noise. We apply our approach to a typical RNN-based model and \textsc{UniLM} to get the improved versions. We conduct extensive experiments on SQuAD and MS MARCO datasets. The experimental results show that the improved versions can significantly outperform the corresponding baselines in terms of BLEU, ROUGE-L and METEOR as well as time cost.
translated by 谷歌翻译
本文对过去二十年来对自然语言生成(NLG)的研究提供了全面的审查,特别是与数据到文本生成和文本到文本生成深度学习方法有关,以及NLG的新应用技术。该调查旨在(a)给出关于NLG核心任务的最新综合,以及该领域采用的建筑;(b)详细介绍各种NLG任务和数据集,并提请注意NLG评估中的挑战,专注于不同的评估方法及其关系;(c)强调一些未来的强调和相对近期的研究问题,因为NLG和其他人工智能领域的协同作用而增加,例如计算机视觉,文本和计算创造力。
translated by 谷歌翻译
如今,预先训练的语言模型对于问题产生(QG)任务取得了巨大成功,并明显超过传统的顺序到序列方法。但是,预训练的模型将输入段视为平坦序列,因此不了解输入段的文本结构。对于QG任务,我们将文本结构建模为答案位置和句法依赖性,并提出答案局部性建模和句法掩盖的注意,以解决这些局限性。特别是,我们以高斯偏见为局部建模,以使模型能够专注于答案的上下文,并提出一种掩盖注意机制,以使输入段落的句法结构在问题生成过程中访问。在小队数据集上进行的实验表明,我们提出的两个模块改善了强大的预训练模型ProPHETNET的性能,并将它们梳理在一起,可以通过最先进的预培训模型来实现非常有竞争力的结果。
translated by 谷歌翻译
This paper presents a new UNIfied pre-trained Language Model (UNILM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. The unified modeling is achieved by employing a shared Transformer network and utilizing specific self-attention masks to control what context the prediction conditions on. UNILM compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks. Moreover, UNILM achieves new state-ofthe-art results on five natural language generation datasets, including improving the CNN/DailyMail abstractive summarization ROUGE-L to 40.51 (2.04 absolute improvement), the Gigaword abstractive summarization ROUGE-L to 35.75 (0.86 absolute improvement), the CoQA generative question answering F1 score to 82.5 (37.1 absolute improvement), the SQuAD question generation BLEU-4 to 22.12 (3.75 absolute improvement), and the DSTC7 document-grounded dialog response generation NIST-4 to 2.67 (human performance is 2.65). The code and pre-trained models are available at https://github.com/microsoft/unilm. * Equal contribution. † Contact person.
translated by 谷歌翻译
In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora. We propose several novel models that address critical problems in summarization that are not adequately modeled by the basic architecture, such as modeling key-words, capturing the hierarchy of sentence-toword structure, and emitting words that are rare or unseen at training time. Our work shows that many of our proposed models contribute to further improvement in performance. We also propose a new dataset consisting of multi-sentence summaries, and establish performance benchmarks for further research.
translated by 谷歌翻译
复制机制允许序列到序列模型从输入中选择单词并将它们直接放入输出中,这在抽象总结中发现越来越多的使用。但是,由于汉语句子中没有明确的分隔符,所以最现有的中国抽象摘要模型只能执行字符副本,从而导致效率低下。为了解决这个问题,我们提出了一个词典约束的复制网络,在编码器和解码器中模拟多粒度。在源端,单词和字符使用变换器基编码器聚合到相同的输入存储器中。在目标方面,解码器可以在每个时间步骤复制字符或多字符字,并且解码过程由一个词增强的搜索算法引导,其促进并行计算并鼓励模型复制更多单词。此外,我们采用单词选择器来集成关键字信息。实验结果在中国社交媒体数据集显示我们的模型可以独立或使用单词选择器。这两种形式都可以胜过以前的基于角色的模型并实现竞争性表现。
translated by 谷歌翻译
尽管变形金刚在段落的生成中取得了重大成功,但它们将句子视为令牌的线性序列,并且经常忽略其层次结构信息。先前的工作表明,输入令牌分解粒度〜(例如,单词,短语或句子)的水平已产生实质性改进,这表明可以通过更细粒度的粒度建模来增强变形金刚。在这项工作中,我们提出了粒度生成(C-DNPG)的粒度连续分解。为了有效地将粒度纳入编码句子中,C-DNPG引入了一种粒度感知的注意力(GA-注意)机制,该机制扩展了多头自我注意力,以:1)自动渗透句子的粒度头,该机制自动渗透了句子的等级结构通过神经估计每个输入令牌的粒度水平; 2)两个新的注意力面膜,即粒度共振和粒度范围,以有效地将粒度编码为注意力。在两个基准测试的实验(包括Quora问题对和Twitter URL)上表明,C-DNPG的表现优于基线模型,而在许多指标方面,C-DNPG的基线模型优于基线模型。定性分析表明,C-DNPG确实具有有效性捕获细粒度的粒度水平。
translated by 谷歌翻译
谈话问题应答需要能够正确解释问题。然而,由于在日常谈话中难以理解共同参考和省略号的难度,目前的模型仍然不令人满意。尽管生成方法取得了显着的进展,但它们仍然被语义不完整陷入困境。本文提出了一种基于动作的方法来恢复问题的完整表达。具体地,我们首先在将相应的动作分配给每个候选跨度的同时定位问题中的共同引用或省略号的位置。然后,我们寻找与对话环境中的候选线索相关的匹配短语。最后,根据预测的操作,我们决定是否用匹配的信息替换共同参考或补充省略号。我们展示了我们对英语和中文发言权重写任务的方法的有效性,在RESTORATION-200K数据集中分别在3.9 \%和Rouge-L中提高了最先进的EM(完全匹配)。
translated by 谷歌翻译
The rapid advancement of AI technology has made text generation tools like GPT-3 and ChatGPT increasingly accessible, scalable, and effective. This can pose serious threat to the credibility of various forms of media if these technologies are used for plagiarism, including scientific literature and news sources. Despite the development of automated methods for paraphrase identification, detecting this type of plagiarism remains a challenge due to the disparate nature of the datasets on which these methods are trained. In this study, we review traditional and current approaches to paraphrase identification and propose a refined typology of paraphrases. We also investigate how this typology is represented in popular datasets and how under-representation of certain types of paraphrases impacts detection capabilities. Finally, we outline new directions for future research and datasets in the pursuit of more effective paraphrase detection using AI.
translated by 谷歌翻译
问题答案(QA)是自然语言处理中最具挑战性的最具挑战性的问题之一(NLP)。问答(QA)系统试图为给定问题产生答案。这些答案可以从非结构化或结构化文本生成。因此,QA被认为是可以用于评估文本了解系统的重要研究区域。大量的QA研究致力于英语语言,调查最先进的技术和实现最先进的结果。然而,由于阿拉伯QA中的研究努力和缺乏大型基准数据集,在阿拉伯语问答进展中的研究努力得到了很大速度的速度。最近许多预先接受的语言模型在许多阿拉伯语NLP问题中提供了高性能。在这项工作中,我们使用四个阅读理解数据集来评估阿拉伯QA的最先进的接种变压器模型,它是阿拉伯语 - 队,ArcD,AQAD和TYDIQA-GoldP数据集。我们微调并比较了Arabertv2基础模型,ArabertV0.2大型型号和ARAElectra模型的性能。在最后,我们提供了一个分析,了解和解释某些型号获得的低绩效结果。
translated by 谷歌翻译
上下文:堆栈溢出对于寻求编程问题答案的软件开发人员非常有帮助。先前的研究表明,越来越多的问题质量低,因此从潜在的答案者那里获得了更少的关注。 Gao等。提出了一个基于LSTM的模型(即BilstM-CC),以自动从代码片段中生成问题标题,以提高问题质量。但是,只有在问题主体中使用代码段无法为标题生成提供足够的信息,而LSTMS无法捕获令牌之间的远程依赖性。目的:本文提出了基于深度学习的新型模型CCBERT,旨在通过充分利用整个问题主体的双模式信息来增强问题标题生成的性能。方法:CCBERT遵循编码器范式范式,并使用Codebert将问题主体编码为隐藏的表示形式,堆叠的变压器解码器以生成预测的代币,以及附加的复制注意层来完善输出分布。编码器和解码器都执行多头自我注意操作,以更好地捕获远程依赖性。本文构建了一个数据集,该数据集包含大约200,000个高质量问题,该数据从Stack Overflow正式发布的数据中滤除,以验证CCBERT模型的有效性。结果:CCBERT优于数据集上的所有基线模型。对仅代码和低资源数据集进行的实验表明,CCBERT的优势性能较小。人类评估还显示了CCBERT关于可读性和相关标准的出色表现。
translated by 谷歌翻译
代码摘要可帮助开发人员理解程序并减少在软件维护过程中推断程序功能的时间。最近的努力诉诸深度学习技术,例如序列到序列模型,以生成准确的代码摘要,其中基于变压器的方法已实现了有希望的性能。但是,在此任务域中,有效地将代码结构信息集成到变压器中的情况不足。在本文中,我们提出了一种名为SG-Trans的新方法,将代码结构属性纳入变压器。具体而言,我们将局部符号信息(例如,代码令牌和语句)和全局句法结构(例如,数据流程图)注入变压器的自我发项模块中。为了进一步捕获代码的层次结构特征,局部信息和全局结构旨在分布在下层和变压器高层的注意力头中。广泛的评估表明,SG-trans的表现优于最先进的方法。与表现最佳的基线相比,SG-Trans在流星评分方面仍然可以提高1.4%和2.0%,这是一个广泛用于测量发电质量的度量,分别在两个基准数据集上。
translated by 谷歌翻译
Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document. Conventional methods normally apply an encoder-decoder architecture to generate the output keyphrases for an input document, where they are designed to focus on each current document so they inevitably omit crucial corpus-level information carried by other similar documents, i.e., the cross-document dependency and latent topics. In this paper, we propose CDKGen, a Transformer-based keyphrase generator, which expands the Transformer to global attention with cross-document attention networks to incorporate available documents as references so as to generate better keyphrases with the guidance of topic information. On top of the proposed Transformer + cross-document attention architecture, we also adopt a copy mechanism to enhance our model via selecting appropriate words from documents to deal with out-of-vocabulary words in keyphrases. Experiment results on five benchmark datasets illustrate the validity and effectiveness of our model, which achieves the state-of-the-art performance on all datasets. Further analyses confirm that the proposed model is able to generate keyphrases consistent with references while keeping sufficient diversity. The code of CDKGen is available at https://github.com/SVAIGBA/CDKGen.
translated by 谷歌翻译
Nowadays, time-stamped web documents related to a general news query floods spread throughout the Internet, and timeline summarization targets concisely summarizing the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process with sequential information remained and use it to simulate the evolutionary attention of the ground truth summary. The event-level attention can also be used to assist in extracting summary, where the extracted summary also comes in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline 17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
translated by 谷歌翻译
访问公共知识库中可用的大量信息可能对那些不熟悉的SPARQL查询语言的用户可能很复杂。SPARQL中自然语言提出的问题的自动翻译有可能克服这个问题。基于神经机翻译的现有系统非常有效,但在识别出识别出训练集的词汇(OOV)的单词中很容易失败。查询大型本体的时,这是一个严重的问题。在本文中,我们将命名实体链接,命名实体识别和神经计算机翻译相结合,以将自然语言问题的自动转换为SPARQL查询。我们凭经验证明,我们的方法比在纪念碑,QALD-9和LC-QUAD V1上运行实验,我们的方法比现有方法更有效,并且对OOV单词进行了更有效的,并且是现有的方法,这些方法是众所周知的DBPedia的相关数据集。
translated by 谷歌翻译
自动在自然语言中自动生成图像的描述称为图像字幕。这是一个积极的研究主题,位于人工智能,计算机视觉和自然语言处理中两个主要领域的交集。图像字幕是图像理解中的重要挑战之一,因为它不仅需要识别图像中的显着对象,还需要其属性及其相互作用的方式。然后,系统必须生成句法和语义上正确的标题,该标题描述了自然语言的图像内容。鉴于深度学习模型的重大进展及其有效编码大量图像并生成正确句子的能力,最近已经提出了几种基于神经的字幕方法,每种方法都试图达到更好的准确性和标题质量。本文介绍了一个基于编码器的图像字幕系统,其中编码器使用以RESNET-101作为骨干为骨干来提取图像中每个区域的空间和全局特征。此阶段之后是一个精致的模型,该模型使用注意力进行注意的机制来提取目标图像对象的视觉特征,然后确定其相互作用。解码器由一个基于注意力的复发模块和一个反思性注意模块组成,该模块会协作地将注意力应用于视觉和文本特征,以增强解码器对长期顺序依赖性建模的能力。在两个基准数据集(MSCOCO和FLICKR30K)上进行的广泛实验显示了提出的方法和生成的字幕的高质量。
translated by 谷歌翻译
We introduce an approach for the answer-aware question generation problem. Instead of only relying on the capability of strong pre-trained language models, we observe that the information of answers and questions can be found in some relevant sentences in the context. Based on that, we design a model which includes two modules: a selector and a generator. The selector forces the model to more focus on relevant sentences regarding an answer to provide implicit local information. The generator generates questions by implicitly combining local information from the selector and global information from the whole context encoded by the encoder. The model is trained jointly to take advantage of latent interactions between the two modules. Experimental results on two benchmark datasets show that our model is better than strong pre-trained models for the question generation task. The code is also available (shorturl.at/lV567).
translated by 谷歌翻译
审议是人类日常生活中的一种共同自然行为。例如,在撰写论文或文章时,我们通常会首先编写草稿,然后迭代地擦亮它们,直到满足为止。鉴于这种人类的认知过程,我们提出了Decom,这是自动评论生成的多通审议框架。 DECOM由多个审议模型和一个评估模型组成。给定代码段,我们首先从代码中提取关键字,然后从预定义的语料库中检索类似的代码片段。然后,我们将检索到的代码的评论视为初始草案,并将其用代码和关键字输入到DETOM中,以开始迭代审议过程。在每次审议时,审议模型都会抛光草案并产生新的评论。评估模型衡量了新生成的评论的质量,以确定是否结束迭代过程。终止迭代过程后,将选择最佳的评论作为目标评论。我们的方法在Java(87K)和Python(108K)的两个现实世界数据集上进行了评估,实验结果表明,我们的方法表现优于最先进的基准。人类评估研究还证实,DECOM产生的评论往往更可读性,信息性和有用。
translated by 谷歌翻译
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a;Radford et al., 2018), BERT is designed to pretrain deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be finetuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial taskspecific architecture modifications.BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).
translated by 谷歌翻译
具有复制机制的最近神经序列到序列模型在各种文本生成任务中取得了显着的进展。这些模型解决了词汇问题,并促进了稀有词的产生。然而,如先前的复制模型所观察到的,难以产生的,难以产生和缺乏抽象,难以识别。在本文中,我们提出了一种副本网络的新颖监督方法,该方法可帮助模型决定需要复制哪些单词并需要生成。具体而言,我们重新定义目标函数,它利用源序列和目标词汇表作为复制的指导。关于数据到文本生成和抽象总结任务的实验结果验证了我们的方法提高了复制质量,提高了抽象程度。
translated by 谷歌翻译