We investigate response generation for multi-turn dialogue in generative chatbots. Existing generative models based on RNNs (Recurrent Neural Networks) usually employ the last hidden state to summarize a sequence, which leaves them unable to capture the subtle variability observed across dialogues or to distinguish between dialogues that are similar in composition. In this paper, we propose a Pseudo-Variational Gated Recurrent Unit (PVGRU) component that requires no posterior knowledge. It introduces a recurrent summarizing variable into the GRU, which aggregates the accumulated distribution variations of subsequences. PVGRU can perceive subtle semantic variability through summarizing variables that are optimized by the devised distribution consistency and reconstruction objectives. In addition, we build a Pseudo-Variational Hierarchical Dialogue (PVHD) model based on PVGRU. Experimental results demonstrate that PVGRU can broadly improve the diversity and relevance of responses on two benchmark datasets.
Translated by Google Translate
Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses. In this paper, we also find that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method. Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables. Thus, we diversify the generated responses while maintaining relevance and coherence. In addition, we propose the Conditional Hybrid Variational Transformer (CHVT) to construct and utilize HLV with transformers for dialogue generation. Through fine-grained symbolic-level semantic information and additive Gaussian mixing, we construct the distribution of continuous variables, prompting the generation of diverse expressions. Meanwhile, to maintain relevance and coherence, the discrete latent variable is optimized by self-separation training. Experimental results on two dialogue generation datasets (DailyDialog and OpenSubtitles) show that CHVT is superior to traditional transformer-based variational mechanisms w.r.t. diversity, relevance and coherence metrics. Moreover, we also demonstrate the benefit of applying HLV to fine-tuning two pre-trained dialogue models (PLATO and BART-base).
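Building dialogue generation systems in a zero-shot scenario remains a huge challenge, since typical zero-shot approaches to dialogue generation rely heavily on large-scale pre-trained language generation models such as GPT-3 and T5. Research on zero-shot dialogue generation without cumbersome language models is limited by the lack of corresponding parallel dialogue corpora. In this paper, we propose a simple but effective multilingual learning framework for zero-shot dialogue generation (dubbed MulZDG) that can effectively transfer knowledge from an English corpus with large-scale training samples to a non-English corpus with zero samples. In addition, MulZDG can be viewed as a multilingual data augmentation method that improves performance on the resource-rich language. First, we construct multilingual code-switching dialogue datasets by translating utterances randomly selected from monolingual English datasets. Then we employ MulZDG to train a unified multilingual dialogue model on the code-switching datasets. MulZDG can perform implicit semantic alignment between different languages. Experiments on the DailyDialog and DSTC7 datasets demonstrate that MulZDG not only achieves competitive performance in the zero-shot case compared with training on sufficient examples, but also greatly improves performance on the source language.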
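The code-switching step described in this abstract (replacing randomly selected utterances of a monolingual English dialogue with their translations) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `build_code_switched_dialogue`, the `switch_prob` parameter, and the toy lexicon standing in for a machine translation system are all assumptions made for the example.

```python
import random


def build_code_switched_dialogue(utterances, translate, switch_prob=0.5, seed=0):
    """Return a copy of the dialogue in which each utterance is replaced by
    its translation with probability `switch_prob` (hypothetical parameter)."""
    rng = random.Random(seed)  # local RNG so the result is reproducible
    mixed = []
    for utterance in utterances:
        if rng.random() < switch_prob:
            mixed.append(translate(utterance))
        else:
            mixed.append(utterance)
    return mixed


# Toy stand-in for a real machine translation system (assumption).
TOY_LEXICON = {
    "How are you?": "Wie geht es dir?",
    "I am fine, thanks.": "Mir geht es gut, danke.",
    "See you tomorrow.": "Bis morgen.",
}


def toy_translate(utterance):
    # Fall back to the original utterance when no translation is known.
    return TOY_LEXICON.get(utterance, utterance)


dialogue = list(TOY_LEXICON)
mixed_dialogue = build_code_switched_dialogue(dialogue, toy_translate)
```

A unified multilingual model would then be trained on such mixed dialogues; the choice of per-utterance sampling here is only one plausible reading of "randomly selected".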
Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems. To alleviate these problems, methods such as introducing external information, reconstructing the optimization function, and manipulating data samples have been proposed. However, they primarily focus on avoiding training with CDM, which inevitably weakens the model's ability to understand CDM in human conversations and limits further improvements in model performance. This paper proposes a Sentence Semantic Segmentation guided Conditional Variational Auto-Encoder (SegCVAE) method that can model and take advantage of CDM data. Specifically, to tackle the incoherence problem caused by one-to-many mappings, SegCVAE uses response-related prominent semantics to constrain the latent variable. To mitigate the lack of diversity caused by many-to-one mappings, SegCVAE segments multiple prominent semantics to enrich the latent variables. Three novel components, Internal Separation, External Guidance, and Semantic Norms, are proposed to realize SegCVAE. On dialogue generation tasks, both automatic and human evaluation results show that SegCVAE achieves new state-of-the-art performance.
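At present, end-to-end deep-learning-based open-domain dialogue systems remain black-box models, which makes it easy for data-driven models to generate irrelevant content. Specifically, because no prior knowledge guides training, latent variables become entangled with different semantics in the latent space. To address this problem, this paper proposes to harness the generative model with prior knowledge through a cognitive approach involving mesoscopic-scale feature disentanglement. In particular, the model integrates macro-level guided-category knowledge and micro-level open-domain dialogue data into training and injects the prior knowledge into the latent space, which enables the model to disentangle the latent variables at the mesoscopic scale. In addition, we propose a new metric for open-domain dialogue that can objectively evaluate the interpretability of the latent space distribution. Finally, we validate our model on different datasets and experimentally demonstrate that it generates higher-quality and more interpretable dialogues than other models.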
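Current work on personalized dialogue generation primarily helps the agent present a consistent personality and produce more informative responses. However, we find that the responses generated by most previous models tend to be self-centered, with little care for the user in the dialogue. Moreover, we consider that human-like conversation is essentially built on inferring information about the other party's persona. Motivated by this, we propose a novel personalized dialogue generator that detects an implicit user persona. Because it is hard to collect a large number of detailed personas for each user, we attempt to model the user's potential persona and its representation from the dialogue history, with no external knowledge. The perception and fader variables are conceived using conditional variational inference. The two latent variables simulate the process by which people become aware of each other's persona and produce a corresponding expression in conversation. Finally, posterior-discriminated regularization is presented to enhance the training procedure. Empirical studies demonstrate that, compared with state-of-the-art methods, our approach is more concerned with the user's persona and achieves a considerable boost across the evaluations.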
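We present a novel architecture for explainable modeling of task-oriented dialogues that uses discrete latent variables to represent dialogue actions. Our model is based on variational recurrent neural networks (VRNN) and requires no explicit annotation of semantic information. Unlike previous works, our approach models the system and user turns separately and performs database query modeling, which makes the model applicable to task-oriented dialogues while producing easily interpretable latent variables. We show that our model outperforms previous approaches in terms of perplexity and BLEU on three datasets, and we propose a way to measure dialogue success without the need for expert annotation. Finally, we propose a novel way to explain the semantics of the latent variables with respect to system actions.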
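The past several years have witnessed the superiority of the Variational Auto-Encoder (VAE) in various text generation tasks. However, due to the sequential nature of text, auto-regressive decoders tend to ignore the latent variables and reduce to simple language models, a problem known as KL vanishing, which deteriorates further when the VAE is combined with Transformer-based structures. To ameliorate this problem, we propose DELLA, a novel variational Transformer framework. DELLA learns a series of layer-wise latent variables, each inferred from those of the lower layers and tightly coupled with the hidden states through a low-rank tensor product. In this manner, DELLA forces these posterior latent variables to be fused deeply with the whole computation path and hence to incorporate more information. We theoretically demonstrate that our method can be regarded as entangling latent variables to avoid a decrease of posterior information through the layers, enabling DELLA to obtain higher non-zero KL values even without any annealing or thresholding tricks. Experiments on four unconditional and three conditional generation tasks show that DELLA alleviates KL vanishing better and improves both quality and diversity compared with several strong baselines.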
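To make the KL vanishing problem named in this abstract concrete, the sketch below computes the KL term of a VAE objective under the common assumption of a diagonal Gaussian posterior and a standard normal prior (the abstract does not state the parameterization, so this is an assumption). When the posterior collapses onto the prior, the term is exactly zero and the latent variable carries no information about the input. The function name `kl_to_standard_normal` is illustrative.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a diagonal Gaussian.

    Closed form: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


# Collapsed posterior: identical to the prior, so the KL term vanishes.
collapsed_kl = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Informative posterior: a shifted mean yields a strictly positive KL.
informative_kl = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
```

Annealing and thresholding tricks, which the abstract says DELLA does not need, are heuristics that keep this term away from zero during training.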
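This paper offers a comprehensive review of research on Natural Language Generation (NLG) over the past two decades, especially deep learning methods for data-to-text and text-to-text generation, as well as new applications of NLG technology. The survey aims to (a) give the latest synthesis of the NLG core tasks and the architectures adopted in the field; (b) detail various NLG tasks and datasets and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; and (c) highlight some future emphases and relatively recent research issues that arise from the increasing synergy between NLG and other areas of artificial intelligence, such as computer vision, text, and computational creativity.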
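Recent advances in pre-trained language models have significantly improved neural response generation. However, existing methods usually view the dialogue context as a linear sequence of tokens and learn to generate the next word through token-level self-attention. Such token-level encoding hinders the exploration of discourse-level coherence among utterances. This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models. DialogBERT employs a hierarchical Transformer architecture. To efficiently capture discourse-level coherence among utterances, we propose two training objectives, masked utterance regression and distributed utterance order ranking, in analogy to the original BERT training. Experiments on three multi-turn conversation datasets show that our approach markedly outperforms baselines such as BART and DialoGPT in terms of quantitative evaluation. Human evaluation suggests that DialogBERT generates more coherent, informative, and human-like responses than the baselines by significant margins.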
Long-range context modeling is crucial to both dialogue understanding and generation. The most popular method for dialogue context representation is to concatenate the last-$k$ previous utterances. However, this method may not be ideal for conversations containing long-range dependencies. In this work, we propose DialoGX, a novel encoder-decoder based framework for conversational response generation with a generalized and explainable context representation that can look beyond the last-$k$ utterances. Hence, the method adapts to conversations with long-range dependencies. The main idea of our approach is to identify and utilize the most relevant historical utterances instead of the last-$k$ utterances in chronological order. We study the effectiveness of our proposed method on both dialogue generation (open-domain) and understanding (DST) tasks. DialoGX achieves performance comparable to state-of-the-art models on the DailyDialog dataset. We also observe performance gains in existing DST models with our proposed context representation strategy on the MultiWOZ dataset. We justify our context representation through the lens of psycholinguistics and show that the relevance score of previous utterances agrees well with human cognition, which makes DialoGX explainable as well.
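The contrast between last-$k$ context and relevance-based context can be illustrated with a small sketch. The abstract does not say how DialoGX scores relevance, so the bag-of-words cosine similarity below is only a stand-in, and the names `select_context`, `bag_of_words`, and `cosine` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    # Lowercased word counts; punctuation is ignored.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_context(history, query, k):
    """Pick the k utterances most similar to `query`, then restore
    chronological order. Ties are broken in favour of recent utterances."""
    q = bag_of_words(query)
    scored = [(cosine(bag_of_words(u), q), i) for i, u in enumerate(history)]
    top = sorted(scored, key=lambda s: (-s[0], -s[1]))[:k]
    return [history[i] for i in sorted(i for _, i in top)]


history = [
    "I want to book a table at an Italian restaurant",
    "Sure, for how many people?",
    "By the way, is it going to rain today?",
    "No, it should stay sunny",
]
query = "Which Italian restaurant did you book?"

relevant_context = select_context(history, query, k=1)
last_k_context = history[-1:]
```

In this toy dialogue the last-1 window contains only the weather reply, whereas the relevance-based selection recovers the earlier booking request that the query depends on.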
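Although conditional variational autoencoder (CVAE) models can generate more diverse responses than traditional Seq2Seq models, the responses often have low relevance to the input words or are illogical with respect to the question. We carry out a causal analysis to study the reasons behind this and provide a method for finding mediators and mitigating the confounding bias in dialogues. Specifically, we propose to predict mediators in order to preserve relevant information and to incorporate the mediators automatically into the generation process. In addition, a dynamic topic graph guided conditional variational autoencoder (TGG-CVAE) model is used to complement the semantic space and reduce the confounding bias in responses. Extensive experiments demonstrate that the proposed model generates both relevant and informative responses and outperforms the state of the art in terms of automatic metrics and human evaluation.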
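The generation of personalized dialogue is vital to natural and human-like conversation. Typically, personalized dialogue generation models condition the generated response on the dialogue history and on a representation of the interlocutor's persona or personality. Because it is impractical to obtain persona or personality representations for every interlocutor, recent works have explored generating personalized dialogue by fine-tuning the model on dialogue examples that correspond to a given persona. However, in real-world implementations, a sufficient number of corresponding dialogue examples is also rarely available. Hence, in this paper, we propose a Dual Latent Variable Generator (DLVGen) capable of generating personalized dialogue in the absence of any persona or personality information and of any corresponding dialogue examples. Unlike prior work, DLVGen models the latent distribution over potential responses as well as the latent distribution over the agent's potential persona. During inference, latent variables are sampled from both distributions and fed into the decoder. Empirical results show that DLVGen generates diverse responses that accurately incorporate the agent's persona.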
As the functionality of dialogue systems evolves, hybrid dialogue systems that accomplish user-specific goals and participate in open-topic chitchat with users are attracting growing attention. Existing research learns both tasks concurrently utilizing a multi-task fusion technique but ignores the negative transfer phenomenon induced by the unique textual style differences. Therefore, contrastive learning based on the latent variable model is used to decouple the various textual genres in the latent space. We devise supervised and self-supervised positive and negative sample constructions for diverse datasets. In addition, to capitalize on the style information contained in the decoupled latent variables, we employ a style prefix that incorporates latent variables further to control the generation of responses with varying styles. We performed extensive experiments on three dialogue datasets, including a hybrid dialogue dataset and two task-oriented dialogue datasets. The experimental results demonstrate that our method can mitigate the negative style transfer issue and achieves state-of-the-art performance on multiple dialogue datasets.
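Open-domain dialogue systems aim to interact with humans through natural language text in an open-ended fashion. However, the widely successful neural networks may not work well for dialogue systems, as they tend to generate generic responses. In this work, we propose an Equal-size Hard Expectation-Maximization (EqHard-EM) algorithm to train a multi-decoder model for diverse dialogue generation. Our algorithm assigns each sample to a decoder in a hard manner and additionally imposes an equal-assignment constraint to ensure that all decoders are well trained. We provide a detailed theoretical analysis to justify our approach. Furthermore, experiments on two large-scale open-domain dialogue datasets verify that our EqHard-EM algorithm generates high-quality, diverse responses.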
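The hard assignment with an equal-size constraint described in this abstract can be illustrated as follows. The abstract does not say how the constrained assignment is solved, so the greedy procedure below is a simple heuristic chosen for brevity, not the authors' solver, and it is not guaranteed to minimize the total loss. The function names and the loss matrix are made up for the example.

```python
def plain_hard_assign(losses):
    """Unconstrained hard E-step: each sample goes to its lowest-loss decoder."""
    return [min(range(len(row)), key=row.__getitem__) for row in losses]


def equal_size_hard_assign(losses):
    """Greedy hard assignment in which every decoder receives the same
    number of samples. losses[i][j] is the loss of sample i under decoder j."""
    n_samples, n_decoders = len(losses), len(losses[0])
    if n_samples % n_decoders != 0:
        raise ValueError("number of samples must be divisible by number of decoders")
    capacity = [n_samples // n_decoders] * n_decoders
    assignment = [None] * n_samples
    # Visit (sample, decoder) pairs from lowest to highest loss and accept
    # a pair if the sample is still free and the decoder has room left.
    pairs = sorted(
        (losses[i][j], i, j) for i in range(n_samples) for j in range(n_decoders)
    )
    for _, i, j in pairs:
        if assignment[i] is None and capacity[j] > 0:
            assignment[i] = j
            capacity[j] -= 1
    return assignment


# Hypothetical per-sample losses for 4 samples and 2 decoders.
losses = [
    [0.10, 0.90],
    [0.20, 0.80],
    [0.30, 0.40],
    [0.35, 0.50],
]
unconstrained = plain_hard_assign(losses)    # decoder 1 gets no samples
balanced = equal_size_hard_assign(losses)    # each decoder gets 2 samples
```

In the toy matrix, decoder 0 is better on every sample, so the unconstrained assignment starves decoder 1 of training data; the equal-size constraint prevents that collapse.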
Personalized chatbots focus on endowing chatbots with a consistent personality so that they behave like real users and can further act as personal assistants. Previous studies have explored generating implicit user profiles from the user's dialogue history for building personalized chatbots. However, these studies use only the response generation loss to train the entire model, which makes them prone to the data sparsity problem. Besides, they overemphasize the quality of the final generated response while ignoring the correlations and fusions within the user's dialogue history, leading to coarse data representations and performance degradation. To tackle these problems, we propose MCP, a self-supervised learning framework for capturing better representations from users' dialogue history for personalized chatbots. Specifically, we apply contrastive sampling methods to leverage the supervised signals hidden in user dialogue history and generate pre-training samples for enhancing the model. We design three pre-training tasks based on three types of contrastive pairs from user dialogue history, namely response pairs, sequence augmentation pairs, and user pairs. We pre-train the utterance encoder and the history encoder on the contrastive objectives and use these pre-trained encoders to generate user profiles during personalized response generation. Experimental results on two real-world datasets show that our proposed model MCP improves significantly over existing methods.
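A contrastive objective over positive and negative pairs, as used for the pre-training tasks in this abstract, is often written as an InfoNCE-style loss. The abstract does not name the exact loss, so the sketch below is an assumption: it uses cosine similarity, a hypothetical `temperature` parameter, and plain Python lists in place of encoder outputs.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-probability of picking the
    positive among the positive and all negatives, using cosine similarity."""
    a = _normalize(anchor)
    sims = [_dot(a, _normalize(positive))]
    sims += [_dot(a, _normalize(n)) for n in negatives]
    logits = [s / temperature for s in sims]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# The loss is small when the positive is close to the anchor
# and large when a negative is closer than the positive.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Under this reading, response pairs, sequence augmentation pairs, and user pairs would differ only in how the anchor, positive, and negatives are drawn from the dialogue history.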
The interview is regarded as one of the most crucial steps in recruitment. To prepare fully for interviews with recruiters, job seekers usually practice with mock interviews among themselves. However, a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are unlikely to behave like a real interviewer. Due to the rapid growth of online recruitment in recent years, recruiters tend to hold online interviews, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but still low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs and with resume data, neither of which is low-resource. Evaluation results on a real-world job interview dialog dataset indicate that we achieve promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.
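Medical dialogue generation is an important yet challenging task. Most previous works rely on the attention mechanism and large-scale pre-trained language models. However, these methods often fail to acquire pivotal information from a long dialogue history and thus to yield an accurate and informative response, because medical entities are usually scattered across multiple utterances and have complex relationships with one another. To mitigate this problem, we propose a medical response generation model with Pivotal Information Recalling (MedPIR), which is built on two components: a knowledge-aware dialogue graph encoder and a recall-enhanced generator. The knowledge-aware dialogue graph encoder constructs a dialogue graph by exploiting the knowledge relationships between entities in the utterances and encodes it with a graph attention network. The recall-enhanced generator then strengthens the use of this pivotal information by generating a summary of the dialogue before producing the actual response. Experimental results on two large-scale medical dialogue datasets show that MedPIR outperforms strong baselines in BLEU scores and in the medical entity F1 measure.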
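The wave of pre-trained language models has continuously improved the quality of machine-generated conversations. However, some generated responses still suffer from excessive repetition, sometimes repeating words from the preceding utterance, sometimes repeating words within the self-generated response, or both. Inappropriate repetition of words can significantly degrade the quality of the generated text. Penalized sampling is one popular solution: it reduces the sampling probability of already-present words during inference. However, it is highly sensitive to an inappropriate static setting of the penalty. Setting it too high can yield strange and unrealistic sentences, while setting it too low does little to suppress repetition. To remedy these shortcomings, we design a context-aware classifier that explicitly decides when to allow repetition and when to employ penalized sampling. Such a classifier can be easily integrated with existing decoding methods, reducing repetition where appropriate while preserving the diversity of the text. Experimental results demonstrate that our method generates higher-quality and more authentic dialogues.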
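The interaction between penalized sampling and a gate that decides when to apply it can be sketched as below. The penalty rule (divide positive logits by the penalty, multiply negative ones) is one widely used formulation and is an assumption here, since the abstract does not give a formula. The boolean `allow_repetition` is a hypothetical stand-in for the output of the context-aware classifier, whose architecture the abstract does not describe.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def penalize(logits, seen_token_ids, penalty=1.2):
    """Lower the logits of tokens that already appeared in the context."""
    out = list(logits)
    for t in set(seen_token_ids):
        # Dividing a positive logit or multiplying a negative one by a
        # penalty > 1 both push the logit downwards.
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out


def next_token_probs(logits, seen_token_ids, allow_repetition, penalty=1.2):
    """Apply the repetition penalty only when the gate says repetition
    should be suppressed at this decoding step."""
    if allow_repetition:
        return softmax(logits)
    return softmax(penalize(logits, seen_token_ids, penalty))


logits = [2.0, 1.0, -1.0]   # toy vocabulary of three tokens
seen = [0]                  # token 0 already appeared in the dialogue

plain_probs = next_token_probs(logits, seen, allow_repetition=True)
penalized_probs = next_token_probs(logits, seen, allow_repetition=False)
```

With a fixed penalty the same down-weighting is applied at every step; gating it per step is what lets legitimate repetition (for example, echoing an entity name from the user's utterance) pass through unpenalized.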
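This work combines information about the dialogue history, encoded by a pre-trained model, with a meaning representation of the current system utterance to realize contextual language generation in task-oriented dialogues. We use the pre-trained multi-context ConveRT model for context representation in a model trained from scratch, and we leverage the immediately preceding user utterance for contextual generation in a model adapted from the pre-trained GPT-2. Both experiments with the MultiWOZ dataset show that contextual information encoded by a pre-trained model improves the performance of response generation in both automatic metrics and human evaluation. Our contextual generator produces a greater variety of responses that fit the ongoing dialogue better. An analysis of context size shows that a longer context does not automatically lead to better performance, but that the immediately preceding user utterance plays an essential role in contextual generation. In addition, we propose a re-ranker for the GPT-based generation model. The experiments show that the responses selected by the re-ranker yield a significant improvement on automatic metrics.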