Morality in dialogue systems has attracted increasing research attention recently. A moral dialogue system could better connect with users and enhance conversation engagement by gaining users' trust. In this paper, we propose a framework, MoralDial, to train and evaluate moral dialogue systems. In our framework, we first explore the communication mechanisms of morality and resolve expressed morality into four sub-modules, which outline the roadmap for building a moral dialogue system. Based on that, we design a simple yet effective method: constructing moral discussions from Rules of Thumb (RoTs) between simulated specific users and the dialogue system. The constructed discussions consist of expressing, explaining, and revising moral views across dialogue exchanges, which enables conversational models to learn morality in a natural manner. Furthermore, we propose a novel evaluation method within the framework. We evaluate multiple aspects of morality by judging the relation between dialogue responses and RoTs in discussions, where the multifaceted nature of morality is particularly considered. Automatic and manual experiments demonstrate that our framework is promising for training and evaluating moral dialogue systems.
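To make the discussion-construction step concrete, here is a minimal sketch of how a single RoT-grounded discussion might be assembled. The dataclass, helper name, and every turn are illustrative placeholders, not the paper's actual data format.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str  # "user" or "bot"
    text: str

def build_moral_discussion(rot: str, question: str, answer: str,
                           challenge: str, revision: str) -> list:
    """Arrange one RoT-grounded discussion: the bot expresses a moral view,
    explains it with the RoT, is challenged by the user, and revises."""
    return [
        Turn("user", question),                 # user raises a moral situation
        Turn("bot", answer),                    # bot expresses its moral view
        Turn("user", "Why do you think so?"),   # user asks for an explanation
        Turn("bot", f"Because {rot.lower()}"),  # bot explains with the RoT
        Turn("user", challenge),                # user challenges the view
        Turn("bot", revision),                  # bot revises or defends it
    ]

discussion = build_moral_discussion(
    rot="You should not read someone's diary without permission.",
    question="Is it okay to read my sister's diary?",
    answer="No, I don't think you should.",
    challenge="But what if I'm worried about her safety?",
    revision="In that case, talking to her directly would be more respectful.",
)
```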
Large pretrained language models can easily produce toxic or biased content, which is prohibitive for practical use. To detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourced workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations. However, what type of context is more likely to induce unsafe responses remains under-explored. In this paper, we identify that context toxicity and context category (e.g., \textit{profanity}, \textit{insult}, \textit{drugs}, etc.) are two important factors causing safety issues in response generation. Hence, we propose a method called \emph{reverse generation} to construct adversarial contexts conditioned on a given response, with the flexibility to control the category, toxicity level, and inductivity of the generated contexts. Via reverse generation, we augment the existing BAD dataset and construct a new dataset, BAD+, which contains more than 120K diverse and highly inductive contexts in 12 categories. We test three popular pretrained dialogue models (Blender, DialoGPT, and Plato2) and find that BAD+ can largely expose their safety problems. Furthermore, we show that BAD+ can greatly enhance the safety of generation, and we reveal the key factors behind safety improvement. Our code and dataset are available at \url{https://github.com/thu-coai/Reverse_Generation}.
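As a rough illustration of the reverse-generation idea, the sketch below conditions a seq2seq model on a response plus control codes and decodes a candidate context. The control-token format and the BART checkpoint (which would need fine-tuning on response-to-context pairs) are assumptions, not the released artifacts.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed to be fine-tuned on (control codes + response -> context) pairs.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

def reverse_generate(response: str, category: str, toxic: bool) -> str:
    """Generate a context likely to induce `response`, steered by control codes."""
    # In practice the control codes would be registered as special tokens.
    control = f"<cat={category}> <tox={'high' if toxic else 'low'}>"
    inputs = tokenizer(f"{control} {response}", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64,
                                do_sample=True, top_p=0.9)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

adversarial_context = reverse_generate("You deserve it!", category="insult", toxic=True)
```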
Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation that does not rely on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can easily be applied to a large volume and variety of dialogue data. We consider three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogues, and conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
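The keyword-extraction step can be sketched as follows: score each token of an utterance by a pretrained LM's per-token negative log-likelihood and keep the most uncertain ones. The choice of GPT-2 and of k here are illustrative assumptions.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def extract_keywords(utterance: str, k: int = 3) -> list:
    """Return the k tokens with the highest LM uncertainty (per-token NLL)."""
    enc = tokenizer(utterance, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits          # (1, seq_len, vocab)
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = enc.input_ids[0, 1:]            # each token given its left context
    nll = -log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    top = nll.topk(min(k, nll.numel())).indices
    return [tokenizer.decode([int(targets[i])]).strip() for i in top]

print(extract_keywords("I adopted a golden retriever from the shelter yesterday."))
```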
Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses, and in this paper we further find that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method. Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables, thereby diversifying the generated responses while maintaining relevance and coherence. In addition, we propose the Conditional Hybrid Variational Transformer (CHVT) to construct and utilize HLV with transformers for dialogue generation. Through fine-grained symbolic-level semantic information and additive Gaussian mixing, we construct the distribution of continuous variables, prompting the generation of diverse expressions. Meanwhile, to maintain relevance and coherence, the discrete latent variable is optimized by self-separation training. Experimental results on two dialogue generation datasets (DailyDialog and Opensubtitles) show that CHVT is superior to traditional transformer-based variational mechanisms w.r.t. diversity, relevance, and coherence metrics. Moreover, we also demonstrate the benefit of applying HLV to fine-tune two pre-trained dialogue models (PLATO and BART-base).
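A hybrid latent of this kind might look like the following torch sketch, where a Gumbel-softmax discrete latent pins down global semantics and a reparameterized Gaussian latent adds diversity; the dimensions and the fusion by concatenation are assumptions, not CHVT's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridLatent(nn.Module):
    def __init__(self, hidden: int, n_classes: int = 20, z_dim: int = 64):
        super().__init__()
        self.disc_logits = nn.Linear(hidden, n_classes)  # discrete posterior
        self.mu = nn.Linear(hidden, z_dim)               # Gaussian mean
        self.logvar = nn.Linear(hidden, z_dim)           # Gaussian log-variance

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Discrete latent: differentiable sample over semantic classes.
        c = F.gumbel_softmax(self.disc_logits(h), tau=1.0, hard=True)
        # Continuous latent: z = mu + sigma * eps (reparameterization trick).
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # The decoder is conditioned on both parts of the hybrid latent.
        return torch.cat([c, z], dim=-1)

hlv = HybridLatent(hidden=768)
latent = hlv(torch.randn(4, 768))  # (batch, n_classes + z_dim)
```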
Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems. To alleviate these problems, methods such as introducing external information, reconstructing the optimization function, and manipulating data samples have been proposed; however, they primarily focus on avoiding training with CDM, which inevitably weakens the model's ability to understand CDM in human conversations and limits further improvements in model performance. This paper proposes a Sentence Semantic \textbf{Seg}mentation guided \textbf{C}onditional \textbf{V}ariational \textbf{A}uto-\textbf{E}ncoder (SegCVAE) method, which can model and take advantage of CDM data. Specifically, to tackle the incoherence problem caused by one-to-many mappings, SegCVAE uses response-related prominent semantics to constrain the latent variable. To mitigate the non-diversity problem brought by many-to-one mappings, SegCVAE segments multiple prominent semantics to enrich the latent variables. Three novel components, Internal Separation, External Guidance, and Semantic Norms, are proposed to realize SegCVAE. On dialogue generation tasks, both automatic and human evaluation results show that SegCVAE achieves new state-of-the-art performance.
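For intuition, here is a generic CVAE skeleton in which the recognition network is conditioned on an embedding of response-side prominent semantics and the KL term constrains the latent accordingly; this is a loose sketch of the general mechanism, not SegCVAE's Internal Separation, External Guidance, or Semantic Norms components.

```python
import torch
import torch.nn as nn

class SemanticConstrainedCVAE(nn.Module):
    def __init__(self, hidden: int = 512, z_dim: int = 64):
        super().__init__()
        self.prior_net = nn.Linear(hidden, 2 * z_dim)      # p(z | context)
        self.recog_net = nn.Linear(2 * hidden, 2 * z_dim)  # q(z | context, semantics)

    def forward(self, ctx: torch.Tensor, sem: torch.Tensor):
        p_mu, p_logvar = self.prior_net(ctx).chunk(2, dim=-1)
        q_mu, q_logvar = self.recog_net(torch.cat([ctx, sem], -1)).chunk(2, dim=-1)
        z = q_mu + torch.exp(0.5 * q_logvar) * torch.randn_like(q_mu)  # reparameterize
        # KL(q || p) ties the latent to the semantics-aware posterior.
        kl = 0.5 * (p_logvar - q_logvar
                    + (q_logvar.exp() + (q_mu - p_mu) ** 2) / p_logvar.exp() - 1)
        return z, kl.sum(-1).mean()

cvae = SemanticConstrainedCVAE()
z, kl = cvae(torch.randn(4, 512), torch.randn(4, 512))
```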
Conversations always relate to certain topics. However, because of the input length limitation of pre-trained language models (PLMs), it is challenging to fuse dialogue history and topic information simultaneously in current dialogue generation models. To expand the information a PLM can use, we encode topic and dialogue history information with certain prompts using multiple channels of Fusion-in-Decoder (FID) and explore the influence of three different channel settings. In this paper, our experiments focus on a specific Chinese dataset named NaturalConv, where the dialogue revolves around recent news. We thoroughly compare different dialogue models under different FID channel settings. Empirical results show that by combining our proposed whole-passage channel with additional history channels, our method achieves competitive performance on NaturalConv, making it possible to encode various information from excessively long texts.
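The multi-channel FID setup can be sketched roughly as follows: each channel is prefixed with its own prompt, encoded independently, and the encoder states are concatenated before decoding. T5 and the prompt strings below stand in for the actual Chinese models and channel definitions.

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

channels = [
    "topic: <recent news article text>",
    "history: <concatenated dialogue history>",
]

# Encode each channel separately so long inputs stay within the length limit.
enc = tokenizer(channels, return_tensors="pt", padding=True, truncation=True)
hidden = model.encoder(input_ids=enc.input_ids,
                       attention_mask=enc.attention_mask).last_hidden_state

# Fuse in the decoder: flatten the channels into one long sequence of states.
fused = hidden.reshape(1, -1, hidden.size(-1))
fused_mask = enc.attention_mask.reshape(1, -1)
out = model.generate(encoder_outputs=BaseModelOutput(last_hidden_state=fused),
                     attention_mask=fused_mask, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```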
In this paper, we introduce PanGu-Bot, a Chinese pre-trained open-domain dialogue generation model built upon the large pre-trained language model (PLM) PanGu-alpha (Zeng et al., 2021). Different from other pre-trained dialogue models that are trained on massive dialogue data, we aim to build a powerful dialogue model with relatively little data and computation cost by inheriting the valuable linguistic abilities and knowledge of the PLM. To this end, we train PanGu-Bot from the large PLM PanGu-alpha, which has been shown to perform well on a variety of Chinese natural language tasks. We investigate different aspects of the responses generated by PanGu-Bot, including response quality, knowledge, and safety. We show that PanGu-Bot outperforms state-of-the-art Chinese dialogue systems (CDialGPT (Wang et al., 2020), EVA (Zhou et al., 2021), EVA2.0 (Gu et al., 2022)) w.r.t. the above three aspects. We also demonstrate that PanGu-Bot can be easily deployed to generate emotional responses without further training. Throughout our empirical analysis, we also point out that PanGu-Bot's response quality, knowledge correctness, and safety are still far from perfect, and further exploration is indispensable for building reliable and smart dialogue systems. Our model and code will be available at https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/PanGu-Bot.
Real human conversation data are complex, heterogeneous, and noisy, and building an open-domain dialogue system from such data remains a tough task. In fact, such conversation data still contain a wealth of information and knowledge, which, however, have not been fully explored. In this paper, we show that existing open-domain dialogue generation methods, which memorize context-response paired data with autoregressive or encoder-decoder models, underutilize the training data. Different from current approaches that use external knowledge, we explore a retrieval-generation training framework that can exploit the heterogeneous and noisy training data by treating it as "evidence". In particular, we use BERTScore for retrieval, which yields better quality of the evidence and the generation. Experiments on publicly available datasets show that our method can help models generate better responses, even though such training data are usually regarded as low-quality. This performance gain is comparable to, or even better than, the gain obtained by enlarging the training set. We also find that model performance is positively correlated with the relevance of the retrieved evidence. Moreover, our method performs well in zero-shot experiments, indicating that it may be more robust to real-world data.
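A minimal sketch of the retrieval step with the bert-score package: score each candidate training utterance against the current context and keep the top-scoring ones as evidence to condition generation on. The candidate pool and top-k choice are illustrative.

```python
from bert_score import score  # pip install bert-score

context = "I just moved to Seattle and I'm looking for hiking trails."
candidate_pool = [
    "Rattlesnake Ledge is a popular hike near Seattle.",
    "The stock market fell sharply today.",
    "Discovery Park has great trails along the water.",
]

# BERTScore F1 between the context and every candidate (higher = more related).
_, _, f1 = score(candidate_pool, [context] * len(candidate_pool), lang="en")
top2 = f1.topk(2).indices.tolist()
evidence = [candidate_pool[i] for i in top2]

# The retrieved evidence would then be prepended to the dialogue context
# before response generation.
print(evidence)
```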
In this paper, we identify two factors that inhibit PSNR-oriented models (POMs) from achieving high perceptual quality: 1) the center-oriented optimization (COO) problem and 2) the model's low-frequency tendency. First, POMs tend to generate an SR image whose position in the feature space is closest to the distribution center of all potential high-resolution (HR) images, causing such POMs to lose high-frequency details. Second, 90% of an image's area consists of low-frequency signals; in contrast, human perception relies on an image's high-frequency details. However, POMs apply the same computation to handle different frequency regions, making them inclined to restore the low-frequency regions. Based on these two factors, we propose a Detail Enhanced Contrastive Loss (DECLoss), which combines a high-frequency enhancement module and a spatial contrastive learning module to reduce the influence of the COO problem and the low-frequency tendency, thereby improving perceptual quality. Experimental results show the efficiency and effectiveness of DECLoss when applied to several regular SR models. For example, on EDSR, our proposed method is 3.60× faster than GAN-based methods, with only a subtle degradation in visual quality. Moreover, our final results show that an SR network equipped with our DECLoss generates more realistic and visually pleasing textures than state-of-the-art methods.
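The two ingredients can be caricatured in torch as a high-frequency extraction step (image minus a local blur) plus a patch-level contrastive loss that pulls SR high frequencies toward the HR ground truth and away from a blurred negative. The kernel sizes, temperature, and negative construction are assumptions, not the actual DECLoss.

```python
import torch
import torch.nn.functional as F

def high_freq(img: torch.Tensor, k: int = 5) -> torch.Tensor:
    """High-frequency residual: image minus a local box blur (a simple
    stand-in for a proper high-frequency enhancement module)."""
    pad = k // 2
    blur = F.avg_pool2d(F.pad(img, [pad] * 4, mode="reflect"), k, stride=1)
    return img - blur

def contrastive_hf_loss(sr: torch.Tensor, hr: torch.Tensor,
                        tau: float = 0.1) -> torch.Tensor:
    anchor = high_freq(sr).flatten(1)                      # SR high frequencies
    pos = high_freq(hr).flatten(1)                         # positive: HR details
    neg = high_freq(F.avg_pool2d(hr, 3, 1, 1)).flatten(1)  # negative: blurred HR
    sim_pos = F.cosine_similarity(anchor, pos) / tau
    sim_neg = F.cosine_similarity(anchor, neg) / tau
    # InfoNCE with one positive and one negative per image.
    return -(sim_pos - torch.logsumexp(torch.stack([sim_pos, sim_neg]), 0)).mean()

sr, hr = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
loss = contrastive_hf_loss(sr, hr)
```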
Enormous effort has been devoted to creating high-performance few-shot learners, i.e., models that perform well with little training data. Training large-scale pre-trained language models (PLMs) has incurred significant cost, and utilizing PLM-based few-shot learners is still challenging due to their enormous size. This work focuses on a crucial question: how to make effective use of these few-shot learners? We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers. The rationale is that crowdsourcing workers are in fact few-shot learners: they are shown a few illustrative examples to learn about a task and then start annotating. LMTurk employs few-shot learners built upon PLMs as workers. We show that the resulting annotations can be used to train models that solve the task well and that are small enough to be deployable in practical scenarios. Altogether, LMTurk is an important step towards making effective use of current PLM-based few-shot learners.
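A minimal sketch of the idea: prompt a PLM like a crowdworker (a few demonstrations, then the item to annotate) and collect its labels as silver supervision for a small model. The task, prompt format, and GPT-2 worker are illustrative assumptions.

```python
from transformers import pipeline

worker = pipeline("text-generation", model="gpt2")

demos = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute of it.", "negative"),
]

def annotate(text: str) -> str:
    """Show the PLM 'worker' a few demonstrations, then ask it for a label."""
    prompt = "".join(f"Review: {t}\nLabel: {l}\n" for t, l in demos)
    prompt += f"Review: {text}\nLabel:"
    out = worker(prompt, do_sample=False, max_new_tokens=3)[0]["generated_text"]
    completion = out[len(prompt):].strip().split()
    return completion[0] if completion else "unknown"

# The resulting (text, label) pairs would then train a small, deployable model.
silver_label = annotate("An instant classic, I loved it.")
```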