Incorporating persona information allows diverse and engaging responses in dialogue response generation. Unfortunately, prior works have mostly focused on self personas and have ignored the value of partner personas. Moreover, in practical applications, the availability of ground-truth partner personas is often not the case. This paper attempts to tackle these issues by offering a novel framework that leverages automatic partner persona generation to enhance the succeeding dialogue generation. We incorporate reinforcement learning with a dedicatedly designed critic network for reward judgement. Experimental results from both automatic and human evaluations demonstrate that: a) our framework is capable of generating relevant and informative partner personas, even when compared to the ground-truth partner personas; b) the generated partner personas enhance the succeeding response generation, thus surpassing our baselines and comparison models when partner personas are missing during the inference stage; c) our framework generates responses during inference that are more informative and engaging than our baseline conditioned on the ground-truth partner personas; and d) our dedicatedly designed critic network reinforces our framework effectively. Finally, our framework provides better explainability and reduces the demand for an external database of partner personas.
Personalizing dialogue agents is important for dialogue systems to generate more specific, consistent, and engaging responses. However, most current dialogue personalization approaches rely on explicit persona descriptions during inference, which severely restricts their application. In this paper, we propose a novel approach that learns to predict persona information based on the dialogue history, personalizing the dialogue agent without relying on any explicit persona description during inference. Experimental results on the PersonaChat dataset show that the proposed method can improve the consistency of generated responses when conditioned on the predicted profile of the dialogue agent (i.e., its "self persona"), and improve the engagingness of generated responses when conditioned on the predicted persona of the dialogue partner (i.e., "their persona"). We also find that the trained persona prediction model can be successfully transferred to other datasets and help generate more relevant responses.
Endowing a chatbot with a consistent personality plays a vital role in enabling the agent to deliver human-like interactions. However, existing personalized approaches typically generate responses according to static, predefined personas described in text, which can severely limit the interactivity between humans and the chatbot, especially when the agent needs to answer queries excluded from its predefined personas, the so-called out-of-predefined-persona problem (OOP for simplicity). To alleviate the problem, in this paper we propose a novel retrieval-to-prediction paradigm consisting of two subcomponents: (1) a Persona Retrieval Model (PRM), which retrieves personas from a global collection based on a Natural Language Inference (NLI) model so that the inferred personas are consistent with the predefined ones; and (2) a posterior transformer (PS-Transformer), which adopts a persona posterior distribution that further considers the actual personas used in the ground-truth response, maximally mitigating the gap between training and inference. Furthermore, we present a dataset named IT-ConvAI2 that is the first to highlight the OOP problem in personalized dialogue. Extensive experiments on both IT-ConvAI2 and ConvAI2 demonstrate that our proposed model yields significant improvements in both automatic metrics and human evaluations.
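To make the retrieval step concrete, below is a minimal sketch of NLI-based persona scoring: each candidate persona from the global collection is ranked by how strongly an off-the-shelf MNLI model judges it to be entailed by the predefined persona. The model choice (`roberta-large-mnli`) and the ranking heuristic are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of NLI-based persona retrieval (assumptions, not the paper's code).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"  # label ids: 0=contradiction, 1=neutral, 2=entailment
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

def entailment_score(premise: str, hypothesis: str) -> float:
    """Probability that `hypothesis` is entailed by (consistent with) `premise`."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    return probs[0, 2].item()  # entailment class for roberta-large-mnli

def retrieve_personas(predefined: str, collection: list[str], k: int = 3) -> list[str]:
    """Rank a global persona collection by consistency with the predefined persona."""
    ranked = sorted(collection, key=lambda p: entailment_score(predefined, p), reverse=True)
    return ranked[:k]
```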
Current work on personalized dialogue primarily contributes to the agent exhibiting a consistent personality and producing more informative responses. However, we find that the responses generated by most previous models tend to be self-centered, paying little attention to the user in the dialogue. Moreover, we argue that human-like conversation is essentially built on inferring information about the other party's persona. Motivated by this, we propose a novel personalized dialogue generator that detects an implicit user persona. Because it is difficult to collect a large number of detailed personas for each user, we attempt to model the user's potential persona and its representation from the dialogue history alone, without external knowledge. Perception and fader latent variables are conceived using conditional variational inference. The two latent variables simulate the process by which people become aware of each other's personas and produce corresponding expressions in conversation. Finally, posterior-discriminated regularization is proposed to enhance the training procedure. Empirical studies demonstrate that, compared with state-of-the-art methods, our approach is more attentive to the user's persona and achieves a considerable boost across all evaluations.
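As a rough illustration of the conditional-variational-inference machinery described above, the sketch below shows a Gaussian latent variable with the reparameterization trick and its KL regularizer. The names "perception" and "fader" follow the abstract, while the layer sizes and wiring are assumptions rather than the paper's architecture.

```python
# Illustrative sketch of a conditional Gaussian latent variable (not the paper's model).
import torch
import torch.nn as nn

class LatentVariable(nn.Module):
    """Gaussian latent variable using the reparameterization trick."""
    def __init__(self, hidden: int, latent: int):
        super().__init__()
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)

    def forward(self, h: torch.Tensor):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # KL divergence to a standard normal prior, averaged over the batch.
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return z, kl

# Two latents: one for the inferred user persona ("perception") and one
# shaping the response realization ("fader"); the decoder consumes both.
perception = LatentVariable(hidden=768, latent=64)
fader = LatentVariable(hidden=768, latent=64)
```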
While rich, open-domain textual data are generally available and may include interesting phenomena (humor, sarcasm, empathy, etc.), most are designed for language processing tasks and are usually in a non-conversational format. In this work, we take a step towards automatically generating conversational data using Generative Conversational Networks (GCN), aiming to benefit from the breadth of available language and knowledge data and to train open-domain social conversational agents. We evaluate our approach on conversations with and without knowledge grounding on the Topical Chat dataset, using automatic metrics and human evaluators. Our results show that for conversations without knowledge grounding, GCN can generalize from the seed data, producing novel conversations that are less relevant but more engaging, and that for knowledge-grounded conversations it can produce more knowledge-focused, fluent, and engaging conversations. Specifically, we show that for open-domain conversations using 10% of the seed data, our approach performs close to the baseline that uses 100% of the data, while for knowledge-grounded conversations it achieves this using only 1% of the data, on human ratings of engagingness, fluency, and relevancy.
Natural Language Generation (NLG) represents a large collection of tasks in the field of NLP. While many of these tasks have been tackled well by the cross-entropy (CE) loss, the task of dialog generation poses a few unique challenges for this loss function. First, CE loss assumes that for any given input, the only possible output is the one available as the ground truth in the training dataset. In general, this is not true for any task, as there can be multiple semantically equivalent sentences, each with a different surface form. This problem gets further exacerbated for the dialog generation task, as there can be multiple valid responses (for a given context) that not only have different surface forms but are also not semantically equivalent. Second, CE loss does not take the context into consideration while processing the response and, hence, treats all ground truths with equal importance irrespective of the context. But we may want our final agent to avoid certain classes of responses (e.g. bland, non-informative or biased responses) and give relatively higher weight to more context-specific responses. To circumvent these shortcomings of the CE loss, in this paper, we propose a novel loss function, CORAL, that directly optimizes recently proposed estimates of human preference for generated responses. Using CORAL, we can train dialog generation models without assuming the non-existence of responses other than the ground truth. Also, the CORAL loss is computed based on both the context and the response. Extensive comparisons on two benchmark datasets show that the proposed methods outperform strong state-of-the-art baseline models of different sizes.
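The contrast drawn above can be made concrete schematically: standard CE scores only the single ground-truth response, whereas a preference-reward-weighted objective scales the log-likelihood of any sampled response by a context-aware reward. The sketch below is in the spirit of CORAL, not the paper's exact formulation; the reward is assumed to be a precomputed scalar preference estimate for each (context, response) pair.

```python
# Schematic comparison of CE vs. a reward-weighted objective (a sketch, not CORAL itself).
import torch
import torch.nn.functional as F

def ce_loss(logits, target_ids):
    # Standard CE: only the single ground-truth response counts.
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))

def preference_weighted_loss(logits, sampled_ids, reward):
    # REINFORCE-style: scale the log-likelihood of a sampled response by a
    # context-aware preference reward, so non-ground-truth but preferred
    # responses are not penalized. logits: [B, T, V]; sampled_ids: [B, T]; reward: [B].
    logp = F.log_softmax(logits, dim=-1)
    token_logp = logp.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    return -(reward * token_logp.sum(dim=-1)).mean()
```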
A large body of literature has shown that prompt-based learning is an efficient way to use large pre-trained language models. Recent works have also demonstrated the possibility of steering a chatbot's output by plugging in an appropriate prompt. Gradient-based methods are often used to perturb the prompts; however, some language models are not even available to the public. In this work, we first explore combining prompting and reinforcement learning (RL) to steer a model's generation without accessing any of the model's parameters. Second, to reduce the training effort and enhance generalizability to unseen tasks, we apply multi-task learning to make the model learn to generalize better to new tasks. Experimental results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters. Furthermore, the model demonstrates a strong ability to quickly adapt to unseen tasks in fewer steps than the baseline model.
Recent advances in large-scale pre-training provide large models with the potential to learn knowledge from the raw text. It is thus natural to ask whether it is possible to leverage these large models as knowledge bases for downstream tasks. In this work, we answer the aforementioned question in unsupervised knowledge-grounded conversation. We explore various methods that best elicit knowledge from large models. Our human study indicates that, though hallucinations exist, large models possess the unique advantage of being able to output common sense and summarize facts that cannot be directly retrieved from the search engine. To better exploit such generated knowledge in dialogue generation, we treat the generated knowledge as a noisy knowledge source and propose the posterior-based reweighing as well as the noisy training strategy. Empirical results on two benchmarks show advantages over the state-of-the-art methods.
In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. Dialogue systems are increasingly designed to go beyond merely imitating conversation and to improve from such interactions over time. In this survey, we present a broad overview of methods developed to build dialogue systems over the years. Different use cases, ranging from task-based systems to open-domain chatbots, motivate and necessitate specific systems. Starting from simple rule-based systems, research has progressed towards increasingly complex architectures trained on massive corpora, such as deep learning systems. Motivated by the intuition of resembling human dialogue, progress has been made towards incorporating emotion into natural language generators, using reinforcement learning. While we see a trend of marginal improvements on some metrics, we find that limited justification exists for the metrics and that evaluation practices are not uniform. To conclude, we flag these issues and highlight possible research directions.
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before. In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions; and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, machine translation, and visual-language generation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.
Current neural network models of dialogue generation (chatbots) show great promise for generating answers for chatty agents, but they are short-sighted in that they predict utterances one at a time while disregarding their impact on future outcomes. Modelling a dialogue's future direction is critical for generating coherent, interesting dialogues, a need that has led to traditional NLP dialogue models that rely on reinforcement learning. In this article, we explain how to combine these objectives by using deep reinforcement learning to predict future rewards in chatbot dialogue. The model simulates conversations between two virtual agents, with policy gradient methods used to reward sequences that exhibit three useful conversational characteristics: informativity, coherence, and ease of answering (related to the forward-looking function). We assess our model on its diversity, length, and complexity with regard to humans. In dialogue simulation, evaluations demonstrate that the proposed model generates more interactive responses and encourages a more sustained, successful conversation. This work marks a preliminary step toward developing a neural conversational model based on the long-term success of dialogues.
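A toy version of the policy-gradient step described above might look as follows; the reward weights and the REINFORCE-style update are illustrative assumptions, with the three reward terms assumed to be computed elsewhere in the simulation loop.

```python
# Toy policy-gradient step for dialogue (illustrative assumptions throughout).
import torch

def combined_reward(informativity: float, coherence: float,
                    ease_of_answering: float, w=(0.25, 0.5, 0.25)) -> float:
    # Weighted mix of the three conversational rewards named in the abstract;
    # the weights here are made up for illustration.
    return w[0] * informativity + w[1] * coherence + w[2] * ease_of_answering

def reinforce_step(log_prob_of_response: torch.Tensor, reward: float,
                   optimizer: torch.optim.Optimizer):
    # Maximize expected reward: gradient ascent on reward-scaled log-likelihood.
    loss = -reward * log_prob_of_response
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```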
The lack of external knowledge makes it difficult for empathetic dialogue systems to perceive implicit emotions and to learn emotional interactions from the limited dialogue history. To address these problems, we propose to leverage external knowledge, including commonsense knowledge and emotion-lexicon knowledge, to explicitly understand and express emotions in empathetic dialogue generation. We first enrich the dialogue history by jointly interacting with external knowledge and construct an emotional context graph. Then we learn emotional context representations from the knowledge-enriched emotional context graph and distill emotional signals, which are the prerequisites for predicting the emotions expressed in responses. Finally, to generate the empathetic response, we propose an emotional cross-attention mechanism to learn the emotional dependencies from the emotional context graph. Extensive experiments conducted on a benchmark dataset verify the effectiveness of the proposed method. In addition, we find that the performance of our method can be further improved by integrating it with a pre-trained model that works orthogonally.
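The emotional cross-attention idea can be sketched generically: decoder states act as queries over the node embeddings of the emotional context graph, so generation is conditioned on emotional dependencies in the graph. Below is a standard cross-attention block under that assumption, not the paper's implementation.

```python
# Generic cross-attention from decoder states to graph-node embeddings (a sketch).
import torch
import torch.nn as nn

class EmotionCrossAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, decoder_states, graph_nodes, node_mask=None):
        # Queries come from the decoder; keys/values are emotional-context-graph
        # node embeddings, so each generated token attends to emotional signals.
        out, weights = self.attn(decoder_states, graph_nodes, graph_nodes,
                                 key_padding_mask=node_mask)
        return out, weights
```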
Personalized chatbots focus on endowing chatbots with a consistent personality so that they behave like real users and can further act as personal assistants. Previous studies have explored generating implicit user profiles from the user's dialogue history for building personalized chatbots. However, these studies use only the response generation loss to train the entire model, and are thus prone to the problem of data sparsity. Besides, they overemphasize the quality of the final generated response while ignoring the correlations and fusion between parts of the user's dialogue history, leading to rough data representations and performance degradation. To tackle these problems, we propose a self-supervised learning framework, MCP, for capturing better representations from users' dialogue history for personalized chatbots. Specifically, we apply contrastive sampling methods to leverage the supervised signals hidden in user dialogue history and generate pre-training samples to enhance the model. We design three pre-training tasks based on three types of contrastive pairs from user dialogue history, namely response pairs, sequence-augmentation pairs, and user pairs. We pre-train the utterance encoder and the history encoder with the contrastive objectives and use these pre-trained encoders to generate user profiles during personalized response generation. Experimental results on two real-world datasets show a significant improvement of our proposed model MCP over existing methods.
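A minimal sketch of such a contrastive pre-training objective is shown below, using an InfoNCE loss with in-batch negatives; the pair construction (response pairs, sequence-augmentation pairs, user pairs) is simplified here to a generic anchor/positive pair relative to the paper.

```python
# InfoNCE-style contrastive objective over anchor/positive encoder outputs (a sketch).
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor,
             temperature: float = 0.1) -> torch.Tensor:
    """anchor/positive: [batch, dim] encoder outputs; other rows serve as in-batch negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature           # [batch, batch] similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)     # diagonal entries are the positives
```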
The wave of pre-trained language models has been continuously improving the quality of machine-generated conversation; however, some generated responses still suffer from excessive repetition, sometimes repeating words from the input utterance, sometimes repeating words within the self-generated response, or both. Inappropriate repetition of words can significantly degrade the quality of generated text. Penalized sampling is a popular solution that reduces the sampling probability of already-present words during inference; however, it is highly sensitive to an inappropriately chosen static weight: setting it too high can yield strange and unrealistic sentences, while setting it too low fails to suppress repetition. To remedy the shortcomings of the above methods, we design a context-aware classifier that explicitly decides when to allow repetition and when to employ penalized sampling. Such a classifier can easily be integrated with existing decoding methods, reducing repetition where appropriate while preserving the diversity of the text. Experimental results demonstrate that our method can generate higher-quality and more authentic dialogues.
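For reference, the sketch below combines the repetition penalty as commonly implemented (e.g., in Hugging Face Transformers) with a boolean gate standing in for the context-aware classifier; the classifier itself is assumed and not shown.

```python
# Penalized sampling gated by a (stand-in) context-aware decision (a sketch).
import torch

def penalized_next_token(logits: torch.Tensor, prev_ids: torch.Tensor,
                         allow_repetition: bool, theta: float = 1.2) -> torch.Tensor:
    """logits: [vocab]; prev_ids: token ids already seen in the utterance or response."""
    logits = logits.clone()
    if not allow_repetition:
        seen = logits[prev_ids]
        # Common repetition-penalty rule: shrink positive logits, amplify negative ones.
        logits[prev_ids] = torch.where(seen > 0, seen / theta, seen * theta)
    return torch.distributions.Categorical(logits=logits).sample()
```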
Pre-trained language models (LMs) store knowledge in their parameters and can generate informative responses when used in conversational systems. However, LMs suffer from the problem of "hallucination:" they may generate plausible-looking statements that are irrelevant or factually incorrect. To address this problem, we propose a contrastive learning scheme, named MixCL. A novel mixed contrastive objective is proposed to explicitly optimize the implicit knowledge elicitation process of LMs, and thus reduce their hallucination in conversations. We also examine negative sampling strategies of retrieved hard negatives and model-generated negatives. We conduct experiments on Wizard-of-Wikipedia, a public, open-domain knowledge-grounded dialogue benchmark, and assess the effectiveness of MixCL. MixCL effectively reduces the hallucination of LMs in conversations and achieves the highest performance among LM-based dialogue agents in terms of relevancy and factuality. We show that MixCL achieves comparable performance to state-of-the-art KB-based approaches while enjoying notable advantages in terms of efficiency and scalability.
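A schematic of a contrastive term with explicit hard negatives (retrieved or model-generated) is given below; it conveys the spirit of a mixed contrastive objective but is not MixCL's exact loss.

```python
# Contrastive term with explicit hard negatives (a sketch, not MixCL's loss).
import torch
import torch.nn.functional as F

def contrastive_with_hard_negatives(anchor: torch.Tensor, positive: torch.Tensor,
                                    negatives: torch.Tensor,
                                    temperature: float = 0.1) -> torch.Tensor:
    """anchor: [dim]; positive: [dim]; negatives: [n_neg, dim] (retrieved or generated)."""
    a = F.normalize(anchor, dim=-1)
    cands = F.normalize(torch.cat([positive.unsqueeze(0), negatives]), dim=-1)
    logits = cands @ a / temperature                              # [1 + n_neg]
    target = torch.zeros(1, dtype=torch.long, device=a.device)    # index 0 = positive
    return F.cross_entropy(logits.unsqueeze(0), target)
```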
Personalized dialogue generation is vital for natural, human-like conversation. Typically, personalized dialogue generation models condition the generated response on the dialogue history and a representation of the interlocutor's persona/personality. Because obtaining persona/personality representations for every interlocutor is impractical, recent works have explored generating personalized dialogue by conditioning the model on dialogue examples corresponding to a given persona instead. However, in real-world implementations, a sufficient number of corresponding dialogue examples is also rarely available. Hence, in this paper, we propose a Dual Latent Variable Generator (DLVGen) capable of generating personalized dialogue in the absence of any persona/personality information or any corresponding dialogue examples. Unlike prior work, DLVGen models the latent distribution over potential responses as well as the latent distribution over the agent's potential persona. During inference, latent variables are sampled from both distributions and fed into the decoder. Empirical results show that DLVGen is able to generate diverse responses that accurately incorporate the agent's persona.
We have a Christmas gift for Harry Potter fans all over the world. In this paper, we present Harry Potter Dialogue (HPD), a dataset that helps train Harry Potter-like dialogue agents. Such a task is typically viewed as a variant of personalized dialogue agents, but they differ significantly in three respects: 1) Harry lived in a virtual world of wizards, thus, real-world commonsense may not apply to Harry's conversations; 2) Harry's behavior is strongly linked to background information in conversations: the scene, its attributes and its relationship to other speakers; and 3) Such backgrounds are dynamically altered as the storyline goes on. The HPD dataset, as the first dataset to facilitate the study of dialogue agent construction for characters within a story, provides rich contextual information about each dialogue session such as scenes, character attributes, and relations. More importantly, all the background information will change over the course of the story. In addition, HPD could support both dialogue generation and retrieval tasks. We evaluate baselines such as Dialog-GPT and BOB to determine the extent to which they can generate Harry Potter-like responses. The experimental results disappoint us in that although the generated responses are fluent, they still seem out of character for Harry. Besides, we validate the current most robust dialogue agent, ChatGPT, which also can't generate plausible Harry-Potter-like responses in some cases, either. Our results suggest that there is much scope for future research.
We present a large, tunable neural conversational response generation model, DIALOGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges extracted from Reddit comment chains over a period spanning from 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain a performance close to human both in terms of automatic and human evaluation in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.
Task-oriented dialogue systems aim to fulfill user goals through natural language interactions. They are ideally evaluated with human users, which, however, is infeasible at every iteration of the development phase. Simulated users could be an alternative, but their development is non-trivial. Researchers therefore resort to offline metrics computed on existing human-human corpora, which are more practical and easily reproducible. Unfortunately, such metrics are limited in reflecting the real performance of dialogue systems: BLEU, for instance, correlates poorly with human judgement, and existing corpus-based metrics such as success rate overlook dialogue-context mismatches. There is still a need for a reliable metric for task-oriented systems that generalizes well and correlates strongly with human judgement. In this paper, we propose using offline reinforcement learning for dialogue evaluation based on static corpora. Such an evaluator is typically called a critic and is utilized for policy optimization. We go one step further and show that an offline RL critic can be trained on a static corpus of any dialogue system as an external evaluator, allowing dialogue performance comparisons across various types of systems. This approach has the benefit of being model-agnostic while attaining a strong correlation with human judgement, which we confirm via an interactive user trial.
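As a toy illustration of the critic-as-evaluator idea, the sketch below scores a dialogue by averaging a trained critic's value estimates over its turns; the critic interface and the turn-encoding step are assumptions, not the paper's setup.

```python
# Toy use of a trained critic as a corpus-based dialogue evaluator (a sketch).
import torch

def evaluate_dialogue(critic: torch.nn.Module, turn_encodings: torch.Tensor) -> float:
    """turn_encodings: [n_turns, dim] encodings of (dialogue context, system act) pairs;
    `critic` is assumed to map each encoding to a scalar value estimate."""
    with torch.no_grad():
        values = critic(turn_encodings).squeeze(-1)  # [n_turns] value estimates
    return values.mean().item()                      # higher = better-rated dialogue
```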
Many open-domain dialogue models pre-trained on social media comments can generate coherent replies but have difficulty producing engaging responses when interacting with real users. This phenomenon may result mainly from the deficiency of annotated human-human conversations and the misalignment with human preferences. In this paper, we propose a novel and efficient approach, Diamante, to boost open-domain chatbots, in which two kinds of human feedback (explicit demonstrations and implicit preferences) are collected and leveraged. By asking annotators to select or amend model-generated candidate responses, Diamante efficiently collects human-demonstrated responses and constructs a Chinese chit-chat dataset. To enhance alignment with human preferences, Diamante leverages the implicit preferences expressed during the data collection process and introduces generation-evaluation joint training. Comprehensive experiments indicate that the Diamante dataset and the joint training paradigm can significantly boost the performance of pre-trained Chinese dialogue models.