Despite their widespread adoption, neural conversation models have yet to exhibit natural chat capabilities with humans. In this research, we treat user utterances as causes and generated responses as effects, recognizing that a change in a cause should produce a different effect. To explore this concept, we have compiled and expanded a new dataset called CausalDialogue through crowd-sourcing. This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure. Our analysis reveals that traditional loss functions can struggle to incorporate the DAG structure effectively, leading us to propose a causality-enhanced method called Exponential Maximum Average Treatment Effect (ExMATE) to strengthen the impact of causality at the utterance level when training neural conversation models. To evaluate the effectiveness of this approach, we have built a comprehensive benchmark on the CausalDialogue dataset using large-scale pre-trained language models, and have assessed the results with both human and automatic evaluation metrics for coherence, diversity, and agility. Our findings show that current techniques are still unable to address conversational DAGs effectively, and that the ExMATE method can improve the diversity and agility of conventional loss functions while maintaining coherence.
Translated by Google Translate
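Creating agents that can both respond appropriately to conversations and understand complex human linguistic tendencies and social cues has been a long-standing challenge in the NLP community. A recent pillar of research revolves around emotion recognition in conversation (ERC), a subfield of emotion recognition that focuses on conversations or dialogues containing two or more utterances. In this work, we explore an approach to ERC that exploits neural embeddings together with the complex structure of dialogues. We implement our approach in a framework called Probabilistic Soft Logic (PSL), a declarative templating language that uses first-order logical rules which, when combined with data, define a particular class of graphical models. Additionally, PSL provides functionality for incorporating results from neural models into PSL models. This allows our model to take advantage of advanced neural methods, such as sentence embeddings, as well as logical reasoning over the structure of a dialogue. We compare our method with state-of-the-art purely neural ERC systems and see an improvement of almost 20%. With these results, we provide an extensive qualitative and quantitative analysis on the DailyDialog conversation dataset.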
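To make the idea of soft first-order rules concrete, the following is a minimal sketch of how a single grounded rule can be scored under the Lukasiewicz relaxation that PSL is known for. The function name, the example rule, and the truth values are hypothetical illustrations and are not taken from the paper; the neural component is assumed to simply supply one of the truth values.

```python
def distance_to_satisfaction(body: list[float], head: float) -> float:
    """Penalty for a grounded rule (b1 & b2 & ...) -> head.

    All truth values are soft, in [0, 1]. The conjunction uses the
    Lukasiewicz t-norm; the result is 0.0 when the rule is satisfied.
    """
    for value in [*body, head]:
        if not 0.0 <= value <= 1.0:
            raise ValueError("truth values must lie in [0, 1]")
    # Lukasiewicz conjunction of the body atoms.
    body_truth = max(0.0, sum(body) - (len(body) - 1))
    # The rule is violated to the extent the body is truer than the head.
    return max(0.0, body_truth - head)


# Hypothetical rule: NeuralSaysJoy(u1) & Follows(u1, u2) -> Joy(u2)
# 0.9 stands in for a neural model's score, 1.0 for an observed dialogue link.
penalty = distance_to_satisfaction([0.9, 1.0], head=0.4)
```

A weighted sum of such penalties over all grounded rules is what the inference step would minimize; the weights and rules themselves are left out of this sketch.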
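We present Neural Probabilistic Soft Logic (NeuPSL), a novel neuro-symbolic (NeSy) framework that unites state-of-the-art symbolic reasoning with the low-level perception of deep neural networks. To explicitly model the boundary between neural and symbolic representations, we introduce NeSy Energy-Based Models, a general family of energy-based models that combine neural and symbolic reasoning. Using this framework, we show how to seamlessly integrate neural and symbolic parameter learning and inference. We perform an extensive empirical evaluation and show that NeuPSL outperforms existing methods on joint inference and has significantly lower variance in almost all settings.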
Position modeling plays a critical role in Transformers. In this paper, we focus on length extrapolation, i.e., training on short texts while evaluating on longer sequences. We define attention resolution as an indicator of extrapolation. We then propose two designs to improve this metric for Transformers. Specifically, we introduce a relative position embedding to explicitly maximize attention resolution. Moreover, we use blockwise causal attention during inference for better resolution. We evaluate different Transformer variants on language modeling. Experimental results show that our model achieves strong performance in both interpolation and extrapolation settings. The code will be available at https://aka.ms/LeX-Transformer.
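As a rough illustration of what blockwise causal attention could look like at inference time, the sketch below builds a boolean attention mask. It assumes one plausible reading of the idea: each query attends causally to keys in its own block and in the immediately preceding block. The function name, the block size, and the one-block lookback are assumptions made here, not details given in the abstract.

```python
def blockwise_causal_mask(seq_len: int, block_size: int) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key j.

    Assumed rule: j must not be in the future (causal), and j must lie in
    the same block as i or in the block directly before it.
    """
    if seq_len < 0 or block_size <= 0:
        raise ValueError("seq_len must be >= 0 and block_size must be > 0")
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i
            nearby = (i // block_size) - (j // block_size) <= 1
            row.append(causal and nearby)
        mask.append(row)
    return mask


# Sequence of 6 tokens split into blocks of 2 (sizes chosen for illustration).
mask = blockwise_causal_mask(seq_len=6, block_size=2)
```

With this rule, the number of keys any query sees is bounded by twice the block size, so the attention span at inference stays close to what the model saw in training even when the sequence is much longer.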
Business processes that involve AI-powered automation have been gaining importance and market share in recent years. These business processes combine the characteristics of classical business process management, goal-driven chatbots, conversational recommendation systems, and robotic process automation. In the new context, prescriptive process monitoring demands innovative approaches. Unfortunately, data logs from these new processes are still not available in the public domain. We describe the main challenges in this new domain and introduce a synthesized dataset that is based on an actual use case of intelligent process automation with chatbot orchestration. Using this dataset, we demonstrate crowd-wisdom and goal-driven approaches to prescriptive process monitoring.
Out-of-distribution (OOD) detection has attracted a large amount of attention from the machine learning research community in recent years due to its importance in deployed systems. Most previous studies focused on detecting OOD samples in the multi-class classification task. However, OOD detection in the multi-label classification task remains an underexplored domain. In this research, we propose YolOOD, a method that utilizes concepts from the object detection domain to perform OOD detection in the multi-label classification task. Object detection models have an inherent ability to distinguish between objects of interest (in-distribution) and irrelevant objects (e.g., OOD objects) in images that contain multiple objects from different categories. This ability allows us to convert a regular object detection model into an image classifier with inherent OOD detection capabilities with just minor changes. We compare our approach to state-of-the-art OOD detection methods and demonstrate YolOOD's ability to outperform these methods on a comprehensive suite of in-distribution and OOD benchmark datasets.
We present the UC$^3$RL algorithm for regret minimization in Stochastic Contextual MDPs (CMDPs). The algorithm operates under the minimal assumptions of a realizable function class and access to offline least squares and log loss regression oracles. Our algorithm is efficient (assuming efficient offline regression oracles) and enjoys an $\widetilde{O}(H^3 \sqrt{T |S| |A|(\log (|\mathcal{F}|/\delta) + \log (|\mathcal{P}|/ \delta) )})$ regret guarantee, with $T$ being the number of episodes, $S$ the state space, $A$ the action space, $H$ the horizon, and $\mathcal{P}$ and $\mathcal{F}$ finite function classes used to approximate the context-dependent dynamics and rewards, respectively. To the best of our knowledge, our algorithm is the first efficient and rate-optimal regret minimization algorithm for CMDPs that operates under the general offline function approximation setting.
We address the general task of structured commonsense reasoning: given a natural language input, the goal is to generate a graph such as an event graph or a reasoning graph. To employ large language models (LMs) for this task, existing approaches "serialize" the output graph as a flat list of nodes and edges. Although feasible, these serialized graphs strongly deviate from the natural language corpora that LMs were pre-trained on, hindering LMs from generating them correctly. In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all. We demonstrate our approach across three diverse structured commonsense reasoning tasks. In all these natural language tasks, we show that using our approach, a code generation LM (CODEX) outperforms natural language LMs that are fine-tuned on the target task (e.g., T5) and other strong LMs such as GPT-3 in the few-shot setting.
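The contrast between a flat serialization and a code-style rendering of the same graph can be sketched as follows. Both output formats, the helper names, the `Plan` and `Node` identifiers, and the example plan are hypothetical illustrations; the abstract does not specify the exact prompt format used in the paper.

```python
def to_flat_edges(edges: list[tuple[str, str]]) -> str:
    """Serialize a graph as a flat list of edges in one line of text."""
    return "; ".join(f"{src} -> {dst}" for src, dst in edges)


def to_python_code(goal: str, steps: list[str],
                   edges: list[tuple[str, str]]) -> str:
    """Render the same graph as Python source text for a code LM prompt.

    The emitted text refers to a Node class that is only named here, not
    defined; the string is meant to be read by a model, not executed.
    """
    lines = ["class Plan:", f"    goal = {goal!r}", "", "    def __init__(self):"]
    names = {}
    for idx, step in enumerate(steps):
        names[step] = f"step{idx}"
        lines.append(f"        self.step{idx} = Node({step!r})")
    for src, dst in edges:
        lines.append(f"        self.{names[src]}.children.append(self.{names[dst]})")
    if not steps:
        lines.append("        pass")
    return "\n".join(lines)


steps = ["boil water", "add tea leaves", "pour into cup"]
edges = [("boil water", "add tea leaves"), ("add tea leaves", "pour into cup")]
flat = to_flat_edges(edges)
code = to_python_code("make tea", steps, edges)
```

The second form looks like ordinary source code, which is the kind of text a code LM has seen during pre-training, whereas the flat edge list resembles neither natural language nor code.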
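In recent years, the transmission of images to remote servers for computer vision purposes has increased dramatically. In many applications, such as surveillance, images are mostly transmitted for automated analysis and are rarely seen by humans. In such cases, traditional compression is inefficient in terms of bit rate, likely because of its focus on human-oriented distortion metrics. It is therefore important to create specific image coding methods for joint use by humans and machines. One way to create the machine side of such a codec is to perform feature matching on some intermediate layer of a deep neural network that performs the machine task. In this work, we explore the effect of the layer choice used when training a learnable codec for humans and machines. We prove, using the data processing inequality, that matching features from deeper layers is preferable in the rate-distortion sense. Next, we confirm our findings empirically by re-training an existing model for scalable human-machine coding. In our experiments, we show the trade-off between the human and machine sides of such a scalable model and discuss the benefit of using deeper layers for training in this regard.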
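The role of the data processing inequality can be sketched as follows. This is only the textbook inequality applied to an assumed setup, not the proof from the paper: let $X$ be the input image, $F_1$ the features at a shallower layer, and $F_2 = g(F_1)$ the features at a deeper layer, where $g$ denotes the layers in between (all symbols are introduced here for illustration). Since $X \to F_1 \to F_2$ forms a Markov chain,

$$ I(X; F_2) \le I(X; F_1), $$

so the deeper features carry no more information about the input than the shallower ones, which is consistent with the claim that matching them is preferable in the rate-distortion sense.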
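We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) sentence- and word-level quality prediction; (ii) explainable QE; and (iii) critical error detection. For all tasks, we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance across several language pairs on downstream tasks, and that jointly training with sentence- and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations of sentence-level QE models. Overall, our submissions achieved the best results on all three tasks for almost all language pairs.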