In this work, we focus on effectively leveraging and integrating information from both the concept level and the lexical level by projecting concepts and words into a lower-dimensional space while preserving the most critical semantics. Within the broader context of opinion-understanding systems, we study the use of fused embeddings in several core NLP tasks: named entity detection and classification, automatic speech recognition re-ranking, and targeted sentiment analysis.
translated by 谷歌翻译
With the rapid development of information technology, health-related data offer unprecedented potential for discoveries in medicine and healthcare, while also posing major challenges to machine learning techniques in terms of scale and complexity. These challenges include: structured data with diverse storage formats and value types arising from heterogeneous data sources; the uncertainty that pervades many aspects of medical diagnosis and treatment; the high dimensionality of the feature space; longitudinal medical-record data with irregular intervals between adjacent observations; and rich relationships among subjects who share similar genetic factors, locations, or socio-demographic backgrounds. This thesis aims to develop advanced statistical relational learning approaches that effectively exploit such health-related data to facilitate discoveries in medical research. It presents work on cost-sensitive statistical relational learning for mining structured imbalanced data, the first continuous-time probabilistic logic model for predicting sequential events from longitudinal structured data, and a hybrid probabilistic relational model for learning from heterogeneous structured data. It also demonstrates the outstanding performance of the proposed models, along with other state-of-the-art machine learning models, when applied to medical research problems and other large-scale real-world systems, revealing the great potential of statistical relational learning for exploring structured health-related data to advance medical research.
In this work we take a first step towards Learning from Natural Instructions (LNI), a framework for communicating human knowledge to computer systems using natural language. In this framework the process of learning is synonymous with language interpretation, the process in which natural language sentences are converted into a logical representation which can be understood by an automated agent. While the motivation behind this framework is clear, the practical aspects involved in constructing it are non-trivial: communicating effectively with computer systems has been one of the motivating forces behind artificial intelligence research since its inception. The rigid way in which computer systems naturally take instructions, via programming, and the flexible and ambiguous way in which humans naturally provide instructions, via natural language, have rendered this task extremely difficult. At the heart of this work stands the problem of semantic interpretation, viewed through the perspective of LNI. In this work we consider realistic settings in which LNI is applicable, by framing semantic interpretation as a machine learning problem and suggesting training protocols that help facilitate LNI and reduce the level of human effort involved. Most current works view semantic interpretation as a structured learning problem, in which a structured predictor is trained in a supervised manner using pairs of sentences and their corresponding meaning interpretations expressed as logical formulas. This type of annotation is extremely costly. Moreover, the training process is domain dependent, resulting in a task-specific interpreter. The LNI settings call for a more flexible process, as the communicated knowledge is not limited to a single task. The supervised settings described above require repeating the training process from scratch when approaching a new domain. Alleviating these problems is a necessary step towards effective LNI.
In this work we suggest an alternative to learning in the supervised settings. Our solution is situated in the LNI settings and exploits a simple observation: the communicated knowledge should result in an observable change of the agent's behavior. We move away from the traditional supervised learning settings by using a feedback signal derived by exploiting this observation: desirable changes in behavior are considered as positive feedback, and undesirable changes as negative feedback. Consider, for example, teaching an automated agent the rules of a game by providing it with natural language explanations of the game rules. Successful interpretation of these rules would result in the agent possessing the ability to make correct decisions in a game scenario. By observing the agent's behavior and using it as feedback we can reinforce or penalize the semantic interpretation model leading to the observed behavior. In contrast to the supervised settings, in which learning depends on annotated data in the form of pairs of sentences and their corresponding logical forms, our settings require only observations of the agent's behavior.
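The feedback loop described above can be sketched as a small score-based learner. This is an illustrative toy, not the thesis's actual interpretation model: the candidate interpretations, the `behavior_feedback` oracle, and the game move names are all hypothetical stand-ins for the semantic parser and the agent's environment.

```python
class FeedbackLearner:
    """Toy learner that scores candidate interpretations of an
    instruction and learns only from binary behavioral feedback."""

    def __init__(self, candidates):
        # One score per candidate logical interpretation.
        self.scores = {c: 0.0 for c in candidates}

    def predict(self):
        # Choose the currently highest-scoring interpretation.
        return max(self.scores, key=self.scores.get)

    def update(self, chosen, feedback):
        # Reinforce (+1) desirable behavior, penalize (-1) undesirable.
        self.scores[chosen] += feedback


def behavior_feedback(interpretation, correct):
    # Hypothetical oracle: watch the agent act under the chosen
    # interpretation and report whether its behavior was desirable.
    return +1 if interpretation == correct else -1


learner = FeedbackLearner(["move(pawn, 1)", "move(pawn, 2)"])
for _ in range(3):  # a few rounds of instruction-following
    chosen = learner.predict()
    learner.update(chosen, behavior_feedback(chosen, "move(pawn, 2)"))

print(learner.predict())  # converges to the behavior-confirmed reading
```

After the first wrong guess is penalized, the learner settles on the interpretation that produces desirable behavior, with no logical-form annotation involved.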
A recent ''third wave'' of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed an explosive growth of research into NN-based approaches to information retrieval (IR). A significant body of work has now been created. In this paper, we survey the current landscape of Neural IR research, paying special attention to the use of learned distributed representations of textual units. We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research. (Kezban Dilek Onal and Ye Zhang contributed equally; Maarten de Rijke and Matthew Lease contributed equally.)
Our interest in this paper is in constructing symbolic explanations for the predictions made by a deep neural network. We focus on deep relational machines (DRMs, first proposed by H. Lodhi). A DRM is a deep network whose input layer consists of Boolean-valued functions (features) defined in terms of relations provided as domain or background knowledge. Our DRMs differ from those proposed by Lodhi, which use an Inductive Logic Programming (ILP) engine to select features first; we instead select features randomly from a space of features satisfying some approximate constraints of logical relevance and non-redundancy. But why does a DRM predict what it does? One way of answering this question is in the manner of LIME, which constructs a readable proxy for a black-box predictor. The proxy is intended only to model the predictions of the black box in a local region of the instance space. But readability alone may not be enough: to be understandable, the local model must use relevant concepts in a meaningful way. We investigate a Bayes-like approach to identifying logical proxies for the local predictions of a DRM. We show: (a) DRMs with our randomized propositionalization method achieve state-of-the-art predictive performance; (b) first-order logic models can approximate the predictions of a DRM well within a small local region; and (c) expert-provided relevance information can act as a prior that distinguishes among logical explanations that perform equally well on prediction alone.
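A minimal sketch of the local-proxy idea, in the spirit of LIME rather than the paper's Bayes-like method: perturb a Boolean instance within a small Hamming ball, query the black box on each perturbation, and choose the candidate conjunction of features that best agrees with it locally. The `black_box` rule and the feature space are made up for illustration.

```python
from itertools import combinations

def black_box(x):
    # Stand-in for the DRM: a hidden Boolean rule over three features.
    return x[0] and x[2]

def local_neighborhood(x):
    # The instance itself plus all neighbors at Hamming distance 1.
    samples = [x]
    for i in range(len(x)):
        flipped = list(x)
        flipped[i] = 1 - flipped[i]
        samples.append(tuple(flipped))
    return samples

def best_local_surrogate(x, n_features=3, max_arity=2):
    # Candidate explanations: conjunctions of up to `max_arity` features.
    samples = local_neighborhood(x)
    best, best_agreement = None, -1
    for arity in range(1, max_arity + 1):
        for clause in combinations(range(n_features), arity):
            agreement = sum(
                int(all(s[i] for i in clause)) == black_box(s)
                for s in samples
            )
            if agreement > best_agreement:
                best, best_agreement = clause, agreement
    return best, best_agreement

clause, agreement = best_local_surrogate((1, 1, 1))
print(clause, agreement)  # the conjunction of features 0 and 2 wins
```

The surrogate recovers the conjunction `x[0] and x[2]`, which agrees with the black box on the entire local neighborhood even though no single feature does.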
Many databases contain imprecise references to real-world entities. For example, a social-network database records names of people. But different people can go by the same name and there may be different observed names referring to the same person. The goal of entity resolution is to determine the mapping from database references to discovered real-world entities. Traditional entity resolution approaches consider approximate matches between attributes of individual references, but this does not always work well. In many domains, such as social networks and academic circles, the underlying entities exhibit strong ties to each other, and as a result, their references often co-occur in the data. In this dissertation, I focus on the use of such co-occurrence relationships for jointly resolving entities. I refer to this problem as 'collective entity resolution'. First, I propose a relational clustering algorithm for iteratively discovering entities by clustering references taking into account the clusters of co-occurring references. Next, I propose a probabilistic generative model for collective resolution that finds hidden group structures among the entities and uses the latent groups as evidence for entity resolution. One of my contributions is an efficient unsupervised inference algorithm for this model using Gibbs Sampling techniques that discovers the most likely number of entities. Both of these approaches improve performance over attribute-only baselines in multiple real world and synthetic datasets. I also perform a theoretical analysis of how the structural properties of the data affect collective entity resolution and verify the predicted trends experimentally. In addition, I motivate the problem of query-time entity resolution. I propose an adaptive algorithm that uses collective resolution for answering queries by recursively exploring and resolving related references.
This enables resolution at query-time, while preserving the performance benefits of collective resolution. Finally, as an application of entity resolution in the domain of natural language processing, I study the sense disambiguation problem and propose models for collective sense disambiguation using multiple languages that outperform other unsupervised approaches.
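The core intuition of collective resolution, that co-occurrence neighborhoods disambiguate references sharing the same name, can be sketched with a greedy single-pass clusterer. This is an illustrative toy, not the dissertation's relational clustering algorithm: the reference names, thresholds, and similarity choices are all assumptions made for the example.

```python
def compatible(name_a, name_b):
    # Names are compatible when last names match and first initials agree.
    first_a, last_a = name_a.split()
    first_b, last_b = name_b.split()
    return last_a == last_b and first_a[0] == first_b[0]

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def collective_resolve(references, threshold=0.5):
    """Greedy sketch: a reference joins the first cluster whose seed has a
    compatible name AND sufficiently overlapping co-occurrence neighbors."""
    clusters = []  # each cluster: (seed reference id, list of member ids)
    for ref_id, (name, neighbors) in references.items():
        for seed, members in clusters:
            seed_name, seed_neighbors = references[seed]
            if compatible(name, seed_name) and \
               jaccard(neighbors, seed_neighbors) >= threshold:
                members.append(ref_id)
                break
        else:
            clusters.append((ref_id, [ref_id]))
    return [members for _, members in clusters]

refs = {
    "r1": ("J. Smith", {"A. Ansari", "C. Chen"}),
    "r2": ("John Smith", {"A. Ansari", "C. Chen", "D. Das"}),
    "r3": ("J. Smith", {"X. Xu", "Y. Yang"}),
}
print(collective_resolve(refs))
```

Attribute-only matching would lump the two identically named "J. Smith" references together; the relational evidence instead groups "J. Smith" with "John Smith" (shared co-authors) and keeps the other "J. Smith" separate.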
In recent years, the number of complex documents and texts has grown exponentially, requiring a deeper understanding of machine learning methods in order to classify text accurately in many applications. Many machine learning approaches have achieved surpassing results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extraction methods, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their applications to real-world problems are discussed.
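The pipeline the overview covers (feature extraction, then a classifier, then evaluation) can be sketched end-to-end with bag-of-words features and a nearest-centroid classifier. The training documents and labels below are made up for illustration; real systems would add the dimensionality-reduction stage the survey discusses.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    # Feature extraction: lowercase whitespace tokenization + term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm = sqrt(sum(v * v for v in a.values())) * \
           sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train_centroids(labeled_docs):
    # One summed bag-of-words vector per class.
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids

def classify(text, centroids):
    # Assign the class whose centroid is closest in cosine similarity.
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

train = [
    ("win a free prize now", "spam"),
    ("free money offer inside", "spam"),
    ("project meeting agenda attached", "ham"),
    ("notes from the weekly meeting", "ham"),
]
centroids = train_centroids(train)
print(classify("claim your free prize", centroids))
```

Evaluation would then compare such predictions against held-out labels, which is where the survey's evaluation methods come in.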
Over the past few years, neural networks have re-emerged as powerful machine learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models have started to be applied to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation-graph abstraction for automatic gradient computation.
Artificial intelligence (AI) has recently undergone a renaissance, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under very different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experience, a hallmark of human intelligence, remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between 'hand-engineering' and 'end-to-end' learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and the rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias, the graph network, which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, and demonstrate how to use them in practice.
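The graph-network pattern the abstract describes (edge update, node update, then global update) can be shown in miniature with scalar features and sum aggregation. This is a deliberately tiny illustration of the update pattern, not the API of the released library.

```python
def graph_network_step(nodes, edges, global_attr):
    """One pass of the edge -> node -> global update pattern with
    scalar features and sum aggregation."""
    # 1) Edge update: each edge's message combines its sender's feature
    #    with the current edge feature.
    messages = {(s, r): nodes[s] + e for (s, r), e in edges.items()}
    # 2) Node update: each node adds the sum of its incoming messages.
    new_nodes = {
        n: v + sum(m for (s, r), m in messages.items() if r == n)
        for n, v in nodes.items()
    }
    # 3) Global update: aggregate all updated node features.
    new_global = global_attr + sum(new_nodes.values())
    return new_nodes, messages, new_global

nodes = {"a": 1.0, "b": 2.0}
edges = {("a", "b"): 0.0}  # one directed edge a -> b
new_nodes, messages, g = graph_network_step(nodes, edges, 0.0)
print(new_nodes, g)
```

Stacking such blocks propagates information along relations, which is exactly the relational inductive bias the paper argues for.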
Embedding network data into a low-dimensional vector space has shown promising performance for many real-world applications, such as node classification and entity retrieval. However, most existing methods have focused only on leveraging network structure. For social networks, besides the network structure, there also exists rich information about social actors, such as user profiles in friendship networks and textual content in citation networks. This rich attribute information about social actors reveals the homophily effect, which exerts a huge impact on the formation of social networks. In this paper, we explore the rich evidence source of attributes in social networks to improve network embedding. We propose a generic Social Network Embedding framework (SNE), which learns representations for social actors (i.e., nodes) by preserving both the structural proximity and attribute proximity. While the structural proximity captures the global network structure, the attribute proximity accounts for the homophily effect. To justify our proposal, we conduct extensive experiments on four real-world social networks. Compared to the state-of-the-art network embedding approaches, SNE can learn more informative representations, achieving substantial gains on the tasks of link prediction and node classification. Specifically, SNE significantly outperforms node2vec with an 8.2% relative improvement on the link prediction task, and a 12.7% gain on the node classification task.
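The two proximities can be illustrated by a scoring function that mixes a structural term (similarity of structural embeddings) with an attribute term (similarity of attribute vectors). The vectors below are hand-set toys and the mixing weight `lam` is an assumption; this shows the spirit of combining the two signals, not SNE's actual neural architecture or training objective.

```python
from math import sqrt

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def proximity(node_u, node_v, lam=0.5):
    """Mix structural proximity (embedding similarity) with attribute
    proximity (profile similarity)."""
    struct = cosine(node_u["embedding"], node_v["embedding"])
    attr = cosine(node_u["attributes"], node_v["attributes"])
    return lam * struct + (1 - lam) * attr

# Hand-set toy vectors: u and v are far apart in structure
# but share the same attributes (homophily evidence).
u = {"embedding": [1.0, 0.0], "attributes": [1.0, 1.0, 0.0]}
v = {"embedding": [0.0, 1.0], "attributes": [1.0, 1.0, 0.0]}
w = {"embedding": [0.0, 1.0], "attributes": [0.0, 0.0, 1.0]}

print(proximity(u, v), proximity(u, w))
```

A structure-only method would score v and w identically with respect to u; the attribute term separates them, which is the extra signal SNE exploits.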
We propose the Neural Logic Machine (NLM), a neural-symbolic architecture for both inductive learning and logic reasoning. NLMs exploit the power of both neural networks, as function approximators, and logic programming, as a symbolic processor for objects with properties, relations, logic connectives, and quantifiers. After being trained on small-scale tasks (e.g., sorting short arrays), NLMs can recover lifted rules and generalize to large-scale tasks (e.g., sorting longer arrays). In our experiments, NLMs achieve perfect generalization in a number of tasks, from relational reasoning tasks on family trees and general graphs, to decision-making tasks including sorting arrays, finding shortest paths, and playing the blocks world. Most of these tasks are hard to accomplish for neural networks or inductive logic programming alone.
Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to obtain improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such a combination.
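The latent-feature family can be illustrated with a DistMult-style bilinear-diagonal score, one of the simplest tensor-factorization scorers: a triple's plausibility is the elementwise product of subject, relation, and object embeddings, summed. The entity names and two-dimensional embeddings below are hand-set for illustration; in practice the embeddings are learned from the knowledge graph.

```python
def distmult_score(subject_vec, relation_vec, object_vec):
    # DistMult-style score: sum_i e_s[i] * w_r[i] * e_o[i].
    return sum(s * r * o for s, r, o in
               zip(subject_vec, relation_vec, object_vec))

# Toy hand-set embeddings (in practice these are learned).
entities = {
    "alice": [1.0, 0.0],
    "bob":   [1.0, 0.0],
    "carol": [0.0, 1.0],
}
relations = {"colleague_of": [1.0, 0.0]}

true_score = distmult_score(entities["alice"],
                            relations["colleague_of"], entities["bob"])
corrupt_score = distmult_score(entities["alice"],
                               relations["colleague_of"], entities["carol"])
print(true_score, corrupt_score)
```

Ranking candidate edges by such scores is exactly the "predict new edges in the graph" task the review describes; observable-pattern models instead score edges from mined graph features such as shared neighbors.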
Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and in turn estimate quickly a very expressive model. Our experimental results in several different data sets show that this boosting method results in efficient learning of RDNs when compared to state-of-the-art statistical relational learning approaches.
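The functional-gradient idea behind this boosting approach can be sketched in a non-relational miniature: repeatedly fit a weak learner to the pointwise gradients (label minus current probability) of the log-likelihood and add it to the running model. A single-feature stump stands in for the relational regression trees; the data and learning setup are assumptions for illustration, not the paper's RDN learner.

```python
from math import exp

def sigmoid(f):
    return 1.0 / (1.0 + exp(-f))

def fit_stump(examples, F):
    """Weak learner: per value of the (single) feature, output the mean
    pointwise gradient y - p of the log-likelihood."""
    groups = {}
    for i, (x, y) in enumerate(examples):
        groups.setdefault(x, []).append(y - sigmoid(F[i]))
    means = {x: sum(g) / len(g) for x, g in groups.items()}
    return lambda x: means[x]

def boost(examples, rounds=10):
    F = [0.0] * len(examples)   # additive model, starts at zero
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(examples, F)
        stumps.append(stump)
        F = [f + stump(x) for f, (x, _) in zip(F, examples)]
    return lambda x: sigmoid(sum(s(x) for s in stumps))

# Toy data: the target simply copies the Boolean feature.
data = [(1, 1), (1, 1), (0, 0), (0, 0)]
predict = boost(data)
print(round(predict(1), 3), round(predict(0), 3))
```

Each round fits a new approximator to what the current model still gets wrong, so the summed ensemble becomes an expressive conditional distribution without ever learning a single monolithic tree.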
From self-driving vehicles and reversing robots to virtual assistants that book our next hair-salon appointment or pick the restaurant where we dine, machine learning systems are becoming increasingly ubiquitous. The main reason for this is that these methods possess extraordinary predictive power. However, most of these models remain black boxes, meaning that it is very challenging for humans to follow and understand their intricate inner workings. As a consequence, interpretability has suffered under this ever-increasing complexity of machine learning models. Especially in light of new regulations such as the General Data Protection Regulation (GDPR), the necessity of justifying and verifying the decisions made by these black boxes is indispensable. Driven by the needs of industry and practice, the research community has recognized this interpretability problem and has focused on developing a growing number of so-called explanation methods over the past few years. These methods explain the individual predictions made by black-box machine learning models and help to recover some of the lost interpretability. However, with this proliferation of explanation methods, it is often unclear which explanation method offers higher explanation quality, or is generally better suited for the situation at hand. Therefore, in this thesis, we propose an axiomatic framework that allows comparing the quality of different explanation methods. Through experimental validation, we find that the developed framework is useful for assessing the explanation quality of different explanation methods, and draws conclusions that are consistent with independent research.
We develop a general theoretical framework for statistical logical learning with kernels based on dynamic propositionalization, where structure learning corresponds to inferring a suitable kernel on logical objects, and parameter learning corresponds to function learning in the resulting reproducing kernel Hilbert space. In particular, we study the case where structure learning is performed by a simple FOIL-like algorithm, and propose alternative scoring functions for guiding the search process. We present an empirical evaluation on several data sets in the single-task as well as in the multi-task setting.
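The kernel induced by dynamic propositionalization can be shown in miniature: map each example to the Boolean vector of clauses it satisfies and take the dot product, so the kernel counts clauses both examples satisfy. The lambda "clauses" over small molecule dicts below are hypothetical stand-ins for FOIL-learned first-order clauses.

```python
def clause_kernel(clauses, x, z):
    """Kernel induced by a clause set: the number of clauses
    (Boolean features) that both examples satisfy."""
    return sum(int(c(x)) * int(c(z)) for c in clauses)

# Stand-ins for FOIL-learned clauses, here plain predicates
# over small "molecule" dicts.
clauses = [
    lambda m: m["atoms"] > 10,
    lambda m: m["rings"] >= 1,
    lambda m: m["charge"] < 0,
]

m1 = {"atoms": 12, "rings": 1, "charge": -1}
m2 = {"atoms": 14, "rings": 2, "charge": 0}
m3 = {"atoms": 5,  "rings": 0, "charge": 0}

print(clause_kernel(clauses, m1, m2), clause_kernel(clauses, m1, m3))
```

Structure learning then amounts to choosing which clauses enter this feature map (scored by the proposed functions), while parameter learning happens in the reproducing kernel Hilbert space the kernel defines.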
Supervised machine learning is the search for algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances. In other words, the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various supervised machine learning classification techniques. Of course, a single article cannot be a complete review of all supervised machine learning classification algorithms (also known as induction classification algorithms), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored. Povzetek: An overview of machine learning methods is given.
Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. Moreover, ML can and occasionally does use ASR as a large-scale, realistic application to rigorously test the effectiveness of a given technique, and to inspire new problems arising from the inherently sequential and dynamic nature of speech. On the other hand, even though ASR is available commercially for some applications, it is largely an unsolved problem: for almost all applications, the performance of ASR is not on par with human performance. New insight from modern ML methodology shows great promise to advance the state-of-the-art in ASR technology. This article provides readers with an overview of modern ML techniques as utilized in current ASR research and systems, and as relevant to future ASR research. The intent is to foster further cross-pollination between the ML and ASR communities than has occurred in the past. The article is organized according to the major ML paradigms that are either already popular or have potential for making significant contributions to ASR technology. The paradigms presented and elaborated in this overview include: generative and discriminative learning; supervised, unsupervised, semi-supervised, and active learning; adaptive and multi-task learning; and Bayesian learning. These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments in deep learning and learning with sparse representations, focusing on their direct relevance to advancing ASR technology.