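Recent machine reading comprehension datasets include both extractive and boolean questions, but current approaches do not offer integrated support for answering both question types. We present a multilingual machine reading comprehension system and front-end demo that handles boolean questions by providing a yes/no answer and highlighting the supporting evidence, and handles extractive questions by highlighting the answer in the passage. At the time of writing, our system, GAAMA 2.0, is ranked first on the TyDi QA leaderboard. We contrast two different implementations of our approach. The first consists of several independent transformer stacks, which allows each component to be deployed easily. The second is a single transformer stack that uses adapters to reduce the GPU memory footprint in resource-constrained environments.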
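We propose a transition-based system to transpile Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows part of the abstraction problem to be delegated to a strongly pre-trained semantic parser, while the transpilation itself is learned from a small amount of paired data. We build on recent work relating AMR and SPARQL constructs, but rather than applying a set of rules, we teach a BART model to use these relations selectively. Further, following recent semantic parsing work, we avoid explicitly encoding the AMR and instead encode the parser state in BART's attention mechanism. The resulting model is simple, provides supporting text for its decisions, and outperforms recent AMR-based KBQA approaches on LC-QuAD (F1 53.4) while matching them on QALD (F1 30.8), exploiting the same inductive biases.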
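Existing datasets that contain boolean questions, such as BoolQ and TyDi QA, provide the user with a yes/no response to the question. However, a one-word response is not sufficient for an explainable system. We promote explainability by releasing a new set of annotations marking the evidence in the existing TyDi QA and BoolQ datasets. We show that our annotations can be used to train a model that extracts improved evidence spans compared to models that rely on existing resources. We confirm our findings with a user study, which shows that our extracted evidence spans enhance the user experience. We also provide further insight into the challenges of answering boolean questions, such as passages containing conflicting yes and no answers, and varying degrees of relevance of the predicted evidence.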
We present a novel corpus for French dialect identification comprising 413,522 French text samples collected from public news websites in Belgium, Canada, France and Switzerland. To ensure an accurate estimation of the dialect identification performance of models, we designed the corpus to eliminate potential biases related to topic, writing style, and publication source. More precisely, the training, validation and test splits are collected from different news websites, using different search keywords (topics). This leads to a French cross-domain (FreCDo) dialect identification task. We conduct experiments with four competitive baselines: a fine-tuned CamemBERT model, an XGBoost model based on fine-tuned CamemBERT features, a Support Vector Machine (SVM) classifier based on fine-tuned CamemBERT features, and an SVM based on word n-grams. Aside from presenting quantitative results, we also analyze the most discriminative features learned by CamemBERT. Our corpus is available at https://github.com/MihaelaGaman/FreCDo.
Causal deep learning (CDL) is a new and important research area in the larger field of machine learning. With CDL, researchers aim to structure and encode causal knowledge in the extremely flexible representation space of deep learning models. Doing so will lead to more informed, robust, and general predictions and inference, which is important! However, CDL is still in its infancy. For example, it is not clear how we ought to compare different methods, as they differ widely in their output, in the way they encode causal knowledge, and even in how they represent this knowledge. This is a living paper that categorises methods in causal deep learning beyond Pearl's ladder of causation. We refine the rungs in Pearl's ladder, while also adding a separate dimension that categorises the parametric assumptions of both input and representation, arriving at the map of causal deep learning. Our map covers machine learning disciplines such as supervised learning, reinforcement learning, generative modelling and beyond. Our paradigm is a tool that helps researchers find benchmarks, compare methods and, most importantly, identify research gaps. With this work we aim to structure the avalanche of papers being published on causal deep learning. While papers on the topic are being published daily, our map remains fixed. We open-source our map for others to use as they see fit: perhaps to offer guidance in a related works section, or to better highlight the contribution of their paper.
It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences. Fair ML has largely focused on the protection of single attributes in the simpler setting where both attributes and target outcomes are binary. However, many real-world problems require the simultaneous protection of multiple sensitive attributes, which are often not simply binary, but continuous or categorical. To address this more challenging task, we introduce FairCOCCO, a fairness measure built on cross-covariance operators on reproducing kernel Hilbert spaces. This leads to two practical tools: first, the FairCOCCO Score, a normalised metric that can quantify fairness in settings with single or multiple sensitive attributes of arbitrary type; and second, a subsequent regularisation term that can be incorporated into arbitrary learning objectives to obtain fair predictors. These contributions address crucial gaps in the algorithmic fairness literature, and we empirically demonstrate consistent improvements against state-of-the-art techniques in balancing predictive power and fairness on real-world datasets.
While there have been a number of remarkable breakthroughs in machine learning (ML), much of the focus has been placed on model development. However, to truly realize the potential of machine learning in real-world settings, additional aspects must be considered across the ML pipeline. Data-centric AI is emerging as a unifying paradigm that could enable such reliable end-to-end pipelines. However, this remains a nascent area with no standardized framework to guide practitioners to the necessary data-centric considerations or to communicate the design of data-centric driven ML systems. To address this gap, we propose DC-Check, an actionable checklist-style framework to elicit data-centric considerations at different stages of the ML pipeline: Data, Training, Testing, and Deployment. This data-centric lens on development aims to promote thoughtfulness and transparency prior to system development. Additionally, we highlight specific data-centric AI challenges and research opportunities. DC-Check is aimed at both practitioners and researchers to guide day-to-day development. As such, to easily engage with and use DC-Check and associated resources, we provide a DC-Check companion website (https://www.vanderschaar-lab.com/dc-check/). The website will also serve as an updated resource as methods and tooling evolve over time.
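In many contexts, simpler models are preferable to more complex ones, and controlling model complexity is the goal of many methods in machine learning, such as regularization, hyperparameter tuning and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suited to deep neural networks. Here we develop the notion of geometric complexity, a measure of the variability of the model function computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics, such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization and the choice of parameter initialization, all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.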
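As a rough illustration of a discrete Dirichlet energy, the sketch below averages the squared Frobenius norm of a model's input Jacobian over a set of data points. The abstract does not spell out the exact formula, so this reading is an assumption, as are the finite-difference Jacobian, the function names and the two toy models; it is not the authors' implementation.

```python
import math


def jacobian_fd(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x (rows: outputs, cols: inputs)."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def geometric_complexity(f, data, eps=1e-5):
    """Assumed definition: mean squared Frobenius norm of the input Jacobian."""
    total = 0.0
    for x in data:
        jac = jacobian_fd(f, x, eps)
        total += sum(v * v for row in jac for v in row)
    return total / len(data)


def linear_model(x):
    # Hypothetical toy model f(x) = W x with W = [[1, 2], [3, 4]].
    # Its Jacobian is W at every point.
    return [1.0 * x[0] + 2.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]


def tanh_model(x):
    # Hypothetical toy nonlinear model that flattens out away from the origin.
    return [math.tanh(x[0] + x[1])]


points = [[0.0, 0.0], [1.0, -1.0], [0.5, 2.0]]
gc_linear = geometric_complexity(linear_model, points)
gc_tanh = geometric_complexity(tanh_model, points)
```

For the linear toy model the value is the squared Frobenius norm of W (1 + 4 + 9 + 16 = 30) regardless of the data, whereas for the tanh model it depends on where the data points sit relative to the saturated regions.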
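We study and introduce new gradient operators in the complex and bicomplex settings, inspired by the well-known Least Mean Square (LMS) algorithm, invented in 1960 for the Adaptive Linear Neuron (ADALINE). These gradient operators will be used to formulate new learning rules for Bicomplex Least Mean Square (BLMS) algorithms. This approach extends both the classical real and complex LMS algorithms.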
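For context, the classical LMS rule that this work extends can be sketched as follows. This covers only the real and complex cases, not the bicomplex rule proposed in the paper. It assumes the convention y = w·x (no conjugation in the filter output), for which the update is w ← w + μ·e·conj(x). The function names, step size and noise-free test signal are illustrative assumptions.

```python
import random


def lms_step(weights, x, desired, mu):
    """One LMS update: w <- w + mu * e * conj(x), with e = desired - w.x."""
    y = sum(w * xi for w, xi in zip(weights, x))
    error = desired - y
    # For real inputs conjugate() is a no-op, which gives the real LMS rule.
    new_weights = [w + mu * error * xi.conjugate() for w, xi in zip(weights, x)]
    return new_weights, error


def run_lms(true_weights, sample_input, steps=2000, mu=0.1):
    """Identify true_weights from noise-free samples of d = true_weights . x."""
    weights = [0.0] * len(true_weights)
    for _ in range(steps):
        x = sample_input()
        desired = sum(t * xi for t, xi in zip(true_weights, x))
        weights, _ = lms_step(weights, x, desired, mu)
    return weights


rng = random.Random(0)  # fixed seed so the run is reproducible

# Real-valued system identification.
real_estimate = run_lms(
    [2.0, -1.0],
    lambda: [rng.uniform(-1, 1) for _ in range(2)],
)

# Complex-valued system identification with the same update rule.
complex_estimate = run_lms(
    [1 + 2j, -0.5j],
    lambda: [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)],
)
```

In this noise-free setting both estimates approach the true weights. With noisy targets the weights would instead fluctuate around them, with a spread governed by the step size `mu`.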
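Concept-based explanations make it possible to understand the predictions of a deep neural network (DNN) through the lens of concepts specified by users. Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the DNN's latent space. When this holds, the concept can be represented by a concept activation vector (CAV) pointing in that direction. In this work, we propose to relax this assumption by allowing concept examples to be scattered across different clusters in the DNN's latent space. Each concept is then represented by a region of the DNN's latent space that includes these clusters, which we call a concept activation region (CAR). To formalize this idea, we introduce an extension of the CAV formalism based on the kernel trick and support vector classifiers. This CAR formalism yields global concept-based explanations and local concept-based feature importance. We prove that CAR explanations built with radial kernels are invariant under latent space isometries. In this way, CAR assigns the same explanations to latent spaces that have the same geometry. We further demonstrate that CARs offer (1) more accurate descriptions of how concepts are scattered in the DNN's latent space; (2) global explanations that are closer to human concept annotations; and (3) concept-based feature importance that meaningfully relates concepts to each other. Finally, we use CARs to show that DNNs can autonomously rediscover known scientific concepts, such as the prostate cancer grading system.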
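The invariance claim can be illustrated numerically. A radial kernel depends on two points only through their distance, so an isometry of the latent space leaves the Gram matrix unchanged, and a kernel support vector classifier uses the data only through that matrix. The sketch below checks this for a Gaussian kernel under a rotation followed by a translation. The 2-D latent points, the kernel width `gamma` and the particular isometry are illustrative assumptions, not values from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial (Gaussian) kernel: depends on x and y only through ||x - y||."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def isometry(point, angle=0.7, shift=(3.0, -1.5)):
    """A 2-D isometry: rotate by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * point[0] - s * point[1] + shift[0],
            s * point[0] + c * point[1] + shift[1]]


# Hypothetical 2-D latent representations of concept examples.
latents = [[0.0, 0.0], [1.0, 2.0], [-1.5, 0.5], [2.0, -1.0]]
moved = [isometry(h) for h in latents]

# Gram matrices before and after applying the isometry.
gram_before = [[rbf_kernel(a, b) for b in latents] for a in latents]
gram_after = [[rbf_kernel(a, b) for b in moved] for a in moved]

# Largest entrywise difference; zero up to floating-point rounding.
max_gap = max(
    abs(p - q)
    for row_p, row_q in zip(gram_before, gram_after)
    for p, q in zip(row_p, row_q)
)
```

A linear kernel would not pass the same check, because inner products change under translation.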