Morality in dialogue systems has recently attracted considerable research attention. A moral dialogue system could better connect with users and enhance conversation engagement by gaining users' trust. In this paper, we propose a framework, MoralDial, to train and evaluate moral dialogue systems. In our framework, we first explore the communication mechanisms of morality and resolve expressed morality into four sub-modules. The sub-modules indicate the roadmap for building a moral dialogue system. Based on that, we design a simple yet effective method: constructing moral discussions from Rules of Thumb (RoTs) between simulated specific users and the dialogue system. The constructed discussions consist of expressing, explaining, and revising moral views in dialogue exchanges, which lets conversational models learn morality in a natural manner. Furthermore, we propose a novel evaluation method in the framework. We evaluate multiple aspects of morality by judging the relation between dialogue responses and RoTs in discussions, where the multifaceted nature of morality is particularly considered. Automatic and manual experiments demonstrate that our framework is promising for training and evaluating moral dialogue systems.
translated by Google Translate
Code completion is a valuable topic in both academia and industry. Recently, large-scale mono-programming-lingual (MonoPL) pre-training models have been proposed to boost the performance of code completion. However, code completion for low-resource programming languages (PLs) is difficult under the data-driven paradigm, even though plenty of developers use low-resource PLs. Moreover, few studies have explored the effects of multi-programming-lingual (MultiPL) pre-training on code completion, especially its impact on low-resource programming languages. To this end, we propose MultiCoder to enhance low-resource code completion via MultiPL pre-training and MultiPL Mixture-of-Experts (MoE) layers. We further propose a novel PL-level MoE routing strategy (PL-MoE) for improving code completion on all PLs. Experimental results on CodeXGLUE and MultiCC demonstrate that 1) the proposed MultiCoder significantly outperforms the MonoPL baselines on low-resource programming languages, and 2) the PL-MoE module further boosts the performance on six programming languages. In addition, we analyze the effects of the proposed method in detail and explore its effectiveness in a variety of scenarios.
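The abstract does not spell out how PL-level routing works. One plausible reading, sketched below, is that every sample is sent to an expert chosen by its programming language rather than by a learned per-token gate. The language-to-expert table, the shared fallback expert, and the function name are all hypothetical and are not taken from MultiCoder.

```python
from collections import defaultdict

# Hypothetical mapping from programming language to expert index (an assumption).
PL_TO_EXPERT = {"python": 0, "java": 1, "go": 2}
# Hypothetical fallback expert for languages without a dedicated expert.
SHARED_EXPERT = 3


def route_by_language(batch):
    """Group sample indices by the expert selected for each sample's language.

    `batch` is a list of (language, tokens) pairs. All tokens of a sample go
    to the same expert, in contrast with token-level learned gating.
    """
    assignment = defaultdict(list)
    for idx, (lang, _tokens) in enumerate(batch):
        expert = PL_TO_EXPERT.get(lang.lower(), SHARED_EXPERT)
        assignment[expert].append(idx)
    return dict(assignment)


if __name__ == "__main__":
    demo = [("Python", ["def", "f"]), ("Lua", ["local", "x"]), ("java", ["class", "A"])]
    print(route_by_language(demo))
```

A fixed table like this makes routing deterministic and removes the need for a load-balancing loss, at the cost of not sharing experts between related languages; whether the paper makes the same trade-off is not stated in the abstract.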
Existing pre-training methods for extractive Question Answering (QA) generate cloze-like queries whose syntactic structure differs from that of natural questions, which could overfit pre-trained models to simple keyword matching. To address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA. Specifically, MCROSS introduces a momentum contrastive learning framework to align the answer probability between cloze-like and natural query-passage sample pairs. Hence, the pre-trained models can better transfer the knowledge learned from cloze-like samples to answering natural questions. Experimental results on three benchmark QA datasets show that our method achieves noticeable improvement over all baselines in both supervised and zero-shot scenarios.
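For readers unfamiliar with momentum contrastive learning, the sketch below shows its two generic ingredients in the MoCo style: a slowly updated "key" encoder and an InfoNCE loss. It assumes MCROSS follows this general pattern, which the abstract does not confirm. Parameters and embeddings are plain Python lists, and the function names and default values are illustrative only.

```python
import math


def momentum_update(key_params, query_params, m=0.999):
    """Move the key-encoder parameters a small step toward the query encoder."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss: negative log-softmax of the positive pair's similarity."""
    logits = [_dot(query, positive) / temperature]
    logits += [_dot(query, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    print(momentum_update([1.0, 0.0], [0.0, 1.0], m=0.9))
    print(info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], temperature=1.0))
```

The loss is small when the query is closer to its positive than to any negative. In an MCROSS-like setting the paired items would be cloze-like and natural versions of the same query-passage sample, but that pairing is an interpretation of the abstract rather than a documented detail.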
Previous studies have explored generating accurately lip-synced talking faces for arbitrary targets given audio conditions. However, most of them deform or generate the whole facial area, leading to non-realistic results. In this work, we delve into the formulation of altering only the mouth shapes of the target person. This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames. To this end, we propose the Audio-Visual Context-Aware Transformer (AV-CAT) framework, which produces accurate lip-sync with photo-realistic quality by predicting the masked mouth shapes. Our key insight is to exploit desired contextual information provided in audio and visual modalities thoroughly with delicately designed Transformers. Specifically, we propose a convolution-Transformer hybrid backbone and design an attention-based fusion strategy for filling the masked parts. It uniformly attends to the textural information on the unmasked regions and the reference frame. Then the semantic audio information is involved in enhancing the self-attention computation. Additionally, a refinement network with audio injection improves both image and lip-sync quality. Extensive experiments validate that our model can generate high-fidelity lip-synced results for arbitrary subjects.
Large pretrained language models can easily produce toxic or biased content, which is prohibitive for practical use. In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations. However, which types of context are more likely to induce unsafe responses is still under-explored. In this paper, we identify context toxicity and context category (e.g., profanity, insult, drugs, etc.) as two important factors that cause safety issues in response generation. Hence, we propose a method called reverse generation to construct adversarial contexts conditioned on a given response, with the flexibility to control the category, toxicity level, and inductivity of the generated contexts. Via reverse generation, we augment the existing BAD dataset and construct a new dataset, BAD+, which contains more than 120K diverse and highly inductive contexts in 12 categories. We test three popular pretrained dialogue models (Blender, DialoGPT, and Plato2) and find that BAD+ can largely expose their safety problems. Furthermore, we show that BAD+ can greatly enhance the safety of generation and reveal the key factors of safety improvement. Our code and dataset are available at https://github.com/thu-coai/Reverse_Generation.
Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well across different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation that does not rely on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialogue data. We consider three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogues. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
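The keyword-extraction step can be illustrated with a minimal sketch. It assumes that "most uncertain" means the tokens to which a language model assigned the lowest probability (highest surprisal); the paper may use a different uncertainty measure. The per-token probabilities are supplied by hand here, standing in for the output of a pre-trained language model, and the function name is hypothetical.

```python
import math


def extract_keywords(tokens, token_probs, k=2):
    """Return the k tokens with the highest surprisal, in their original order.

    `token_probs[i]` is the probability a language model assigned to
    `tokens[i]` given its context (assumed to be strictly positive).
    """
    if len(tokens) != len(token_probs):
        raise ValueError("tokens and token_probs must have the same length")
    surprisal = [-math.log(p) for p in token_probs]
    ranked = sorted(range(len(tokens)), key=lambda i: surprisal[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore the order in which tokens appear
    return [tokens[i] for i in chosen]


if __name__ == "__main__":
    utterance = ["i", "visited", "kyoto", "last", "spring"]
    probs = [0.9, 0.2, 0.01, 0.5, 0.05]  # hand-written stand-in for LM output
    print(extract_keywords(utterance, probs, k=2))
```

Under this reading, keywords taken from the response itself would form the "faithfully grounded" knowledge, and adding keywords from other utterances in the dialog would give the "selectively used" variant described above.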
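We present PanGu-Coder, a pretrained decoder-only language model that adopts the PanGu-Alpha architecture for text-to-code generation, i.e., the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focus on the downstream task of text-to-code generation and trains on loosely curated pairs of natural language program definitions and code functions. Finally, we discuss PanGu-Coder-FT, which is fine-tuned on a combination of competitive programming problems and code with continuous integration tests. We evaluate PanGu-Coder with a focus on whether it generates functionally correct programs, and demonstrate that it achieves equivalent or better performance than similarly sized models, such as Codex, while attending to a smaller context window and training on less data.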
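Adapting large pre-trained models (PTMs) through fine-tuning imposes prohibitive computational and storage burdens. Recent studies of parameter-efficient tuning (PET) find that optimizing only a small portion of parameters conditioned on PTMs can yield performance on par with conventional fine-tuning. Generally, PET methods carefully design parameter-efficient modules (PET modules) that can be applied to arbitrary fine-grained positions inside PTMs. However, the effectiveness of these fine-grained positions relies heavily on sophisticated manual designation, which usually produces sub-optimal results. In contrast to manual designation, we explore constructing PET modules in an automatic manner. We automatically Search for the Sparse Structure of Parameter-Efficient Tuning (S$^3$PET). Based on a unified framework of various PET methods, S$^3$PET conducts a differentiable PET structure search through bi-level optimization and proposes a shifted global sigmoid method to explicitly control the number of trainable parameters. Extensive experiments show that S$^3$PET surpasses manual and random structures with fewer trainable parameters. The searched structures preserve 99% of the fine-tuning performance with 0.01% trainable parameters. Moreover, the advantage of S$^3$PET is amplified under extremely low trainable-parameter budgets (0.0009% to 0.01%). The searched structures are transferable and explainable, providing suggestions and guidance for the future design of PET methods.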
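In this paper, we introduce PanGu-Bot, a Chinese pre-trained open-domain dialogue generation model based on the large pre-trained language model (PLM) PanGu-Alpha (Zeng et al., 2021). Unlike other pre-trained dialogue models that are trained on massive amounts of dialogue data, we aim to build a powerful dialogue model with relatively little data and computation cost by inheriting the valuable language capabilities and knowledge of the PLM. To this end, we train PanGu-Bot from the large PLM PanGu-Alpha, which has been shown to perform well on a variety of Chinese natural language tasks. We investigate different aspects of the responses generated by PanGu-Bot, including response quality, knowledge, and safety. We show that PanGu-Bot outperforms state-of-the-art Chinese dialogue systems (CDialGPT (Wang et al., 2020), EVA (Zhou et al., 2021), EVA2.0 (Gu et al., 2022)) with respect to the above three aspects. We also demonstrate that PanGu-Bot can be easily deployed to generate emotional responses without further training. Throughout our empirical analysis, we also point out that PanGu-Bot's response quality, knowledge correctness, and safety are still far from perfect, and that further exploration is indispensable for building reliable and intelligent dialogue systems. Our model and code will be available at https://github.com/huawei-noah/pretretaining-language-model/tree/master/master/pangu-bot.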
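Real human conversation data are complicated, heterogeneous, and noisy, and building open-domain dialogue systems from such data remains a challenging task. In fact, such dialogue data still contain a wealth of information and knowledge, but they have not been fully explored. In this paper, we show that existing open-domain dialogue generation methods, which memorize context-response paired data with autoregressive or encoder-decoder language models, underutilize the training data. Unlike current approaches that rely on external knowledge, we explore a retrieval-generation training framework that can take advantage of heterogeneous and noisy training data by treating them as "evidence". In particular, we use BERTScore for retrieval, which yields better quality for both the evidence and the generation. Experiments on publicly available datasets demonstrate that our method helps models generate better responses, even though such training data are usually regarded as low-quality. The performance gain is comparable to, or even better than, the gain obtained by enlarging the training set. We also find that model performance is positively correlated with the relevance of the retrieved evidence. Moreover, our method performs well in zero-shot experiments, which indicates that it may be more robust to real-world data.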
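To make the retrieval step concrete, the sketch below computes a BERTScore-style F1 by greedy cosine matching between two sets of token embeddings and uses it to rank candidate evidence. The embeddings are toy vectors written by hand; the real metric uses contextual embeddings from a BERT-like model and optional IDF weighting, both omitted here. The function names and the corpus layout are hypothetical and do not come from the paper.

```python
import math


def _cos(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bertscore_f1(cand, ref):
    """Greedy-matching F1 between candidate and reference token embeddings."""
    # Precision: each candidate token is matched to its closest reference token.
    precision = sum(max(_cos(c, r) for r in ref) for c in cand) / len(cand)
    # Recall: each reference token is matched to its closest candidate token.
    recall = sum(max(_cos(r, c) for c in cand) for r in ref) / len(ref)
    if precision + recall <= 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def retrieve(query_emb, corpus, top_k=1):
    """Return the names of the top_k corpus entries ranked by F1 with the query."""
    ranked = sorted(corpus, key=lambda name: bertscore_f1(query_emb, corpus[name]),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]
    corpus = {"a": [[1.0, 0.0], [0.0, 1.0]], "b": [[1.0, 1.0]]}
    print(retrieve(query, corpus))
```

In the framework described above, the retrieved entries would be passed to the generator as "evidence" alongside the dialogue context; how the paper formats or combines them is not specified in the abstract.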