To assist the drug discovery and development process, pharmaceutical companies often apply biomedical NER and linking techniques to internal and public corpora. Decades of research in BioNLP have produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open-source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open-source framework designed to support BioNLP for the pharmaceutical sector. Kazu is built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and wraps several other BioNLP technologies into one coherent system. The Kazu framework is open source: https://github.com/AstraZeneca/KAZU
Very few eXplainable AI (XAI) studies consider how users' understanding of explanations might change depending on whether they know more or less about the to-be-explained domain (i.e., whether they differ in their expertise). Yet expertise is a critical facet of most high-stakes human decision making (e.g., understanding how a trainee doctor differs from an experienced consultant). Accordingly, this paper reports a novel user study (N=96) on how people's expertise in a domain affects their understanding of post-hoc explanations by example for a deep-learning, black-box classifier. The results show that people's understanding of explanations for correct and incorrect classifications changes dramatically, on several dimensions (e.g., response times, perceptions of correctness and helpfulness), when the image-based domain considered is familiar (i.e., MNIST) as opposed to unfamiliar (i.e., Kannada MNIST). The wider implications of these new findings for XAI strategies are discussed.
We present the Group Propagation Vision Transformer (GPViT): a novel non-hierarchical (i.e., non-pyramidal) transformer model designed for general visual recognition with high-resolution features. High-resolution features (or tokens) are a natural fit for tasks that involve perceiving fine-grained details, such as detection and segmentation, but exchanging global information between these features is expensive in memory and computation because of the way self-attention scales. We provide a highly efficient alternative, the Group Propagation Block (GP Block), to exchange global information. In each GP Block, features are first grouped together by a fixed number of learnable group tokens; we then perform Group Propagation, where global information is exchanged between the grouped features; finally, global information in the updated grouped features is returned to the image features through a transformer decoder. We evaluate GPViT on a variety of visual recognition tasks including image classification, semantic segmentation, object detection, and instance segmentation. Our method achieves significant performance gains over previous works across all tasks, especially on tasks that require high-resolution outputs. For example, our GPViT-L3 outperforms Swin Transformer-B by 2.0 mIoU on ADE20K semantic segmentation with only half as many parameters. Code and pre-trained models are available at https://github.com/ChenhongyiYang/GPViT .
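The group, propagate, ungroup data flow described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`cross_attention`, `gp_block`), the absence of learned projections, and the use of a single linear mixing matrix `mix` as the propagation step are all assumptions made for illustration. The point it shows is that image features only ever attend to a small, fixed number of group tokens, so the cost grows with N x M rather than N x N.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head attention with no learned projections (assumption)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d))
    return weights @ keys_values

def gp_block(image_feats, group_tokens, mix):
    """Hypothetical GP Block: N image features, M group tokens, M << N."""
    # 1. Grouping: each group token gathers information from all image features.
    grouped = cross_attention(group_tokens, image_feats)      # (M, C)
    # 2. Propagation: exchange information among the M grouped features.
    #    A linear mix across groups stands in for the paper's operator.
    propagated = grouped + mix @ grouped                       # (M, C)
    # 3. Ungrouping: image features read the updated groups back.
    return image_feats + cross_attention(image_feats, propagated)  # (N, C)

rng = np.random.default_rng(0)
image_feats = rng.standard_normal((196, 32))   # N = 196 tokens, C = 32 channels
group_tokens = rng.standard_normal((8, 32))    # M = 8 group tokens
mix = rng.standard_normal((8, 8)) / 8.0
out = gp_block(image_feats, group_tokens, mix)
print(out.shape)
```

Both attention steps build score matrices of size M x N or N x M, never N x N, which is where the claimed savings over full self-attention on high-resolution features would come from.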
We study the classic facility location setting, where we are given $n$ clients and $m$ possible facility locations in some arbitrary metric space, and want to choose a location to build a facility. The same setting also arises in spatial social choice, where voters are the clients and the goal is to choose a candidate or outcome, with the distance from a voter to an outcome representing the cost of this outcome for the voter (e.g., based on their ideological differences). Unlike most previous work, we do not focus on a single objective to optimize (e.g., the total distance from clients to the facility, or the maximum distance), but instead attempt to optimize several different objectives simultaneously. More specifically, we consider the $l$-centrum family of objectives, which includes the total distance, max distance, and many others. We present tight bounds on how well any pair of such objectives (e.g., max and sum) can be simultaneously approximated compared to their optimum outcomes. In particular, we show that for any such pair of objectives, it is always possible to choose an outcome which simultaneously approximates both objectives within a factor of $1+\sqrt{2}$, and give a precise characterization of how this factor improves as the two objectives being optimized become more similar. For $q>2$ different centrum objectives, we show that it is always possible to approximate all $q$ of these objectives within a small constant, and that this constant approaches 3 as $q\rightarrow \infty$. Our results show that when optimizing only a few simultaneous objectives, it is always possible to choose an outcome that is significantly better than a 3-approximation for all of these objectives.
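To make the objectives concrete, here is a small Python sketch. It assumes the usual definition of the $l$-centrum cost as the sum of the $l$ largest client-to-facility distances (so $l=1$ is the max distance and $l=n$ is the total distance), and it uses points on a line as the metric space. The function names and the brute-force search are illustrative only and are not the paper's method; the sketch also assumes every optimum cost is positive.

```python
import math

def centrum_cost(distances, l):
    """Sum of the l largest distances (l=1: max distance, l=n: total distance)."""
    return sum(sorted(distances, reverse=True)[:l])

def best_simultaneous(clients, facilities, ls):
    """Brute force: pick the facility minimizing the worst approximation
    ratio over the l-centrum objectives listed in ls."""
    dists = [[abs(c - f) for c in clients] for f in facilities]
    costs = [[centrum_cost(d, l) for l in ls] for d in dists]
    # Optimum value of each objective taken on its own.
    optima = [min(row[k] for row in costs) for k in range(len(ls))]
    # Worst ratio to the optimum, per facility.
    worst = [max(row[k] / optima[k] for k in range(len(ls))) for row in costs]
    best = min(range(len(facilities)), key=worst.__getitem__)
    return best, worst[best]

clients = [0, 0, 0, 10]      # positions on a line
facilities = [0, 5, 10]
idx, ratio = best_simultaneous(clients, facilities, ls=[1, len(clients)])
print(idx, ratio)            # prints: 0 2.0
```

In this instance no facility is optimal for both max and sum: the facility at 5 is optimal for max distance but a 2-approximation for total distance, and the facility at 0 is the reverse. The best achievable simultaneous ratio is 2, which sits below the $1+\sqrt{2} \approx 2.414$ bound stated in the abstract.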
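Successful material selection is critical in designing and manufacturing products for design automation. Designers leverage their knowledge and experience to create high-quality designs by selecting the most appropriate materials through performance, manufacturability, and sustainability evaluation. Intelligent tools can help designers with varying expertise by providing recommendations learned from prior designs. To enable this, we introduce a graph representation learning framework that supports the material prediction of bodies in assemblies. We formulate the material selection task as a node-level prediction task over the assembly graph representation of CAD models and tackle it using Graph Neural Networks (GNNs). Evaluations over three experimental protocols performed on the Fusion 360 Gallery dataset indicate the feasibility of our approach, achieving a 0.75 top-3 micro-F1 score. The proposed framework can scale to large datasets and incorporate designers' knowledge into the learning process. These capabilities allow the framework to serve as a recommendation system for design automation and as a baseline for future work, narrowing the gap between human designers and intelligent design agents.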
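As LiDAR sensors become ubiquitous, the demand for LiDAR data compression algorithms has grown. Modern LiDARs produce gigabytes of scan data per hour and are often used in applications with limited compute, bandwidth, and storage resources. We present a fast, lossless compression algorithm for LiDAR range and attribute scan sequences, including multiple-return range, signal, reflectivity, and ambient infrared. Our algorithm, called "Jiffy", achieves substantial compression by exploiting spatiotemporal redundancy and sparsity. Speed is achieved by maximizing the use of single-instruction-multiple-data (SIMD) instructions. In autonomous driving, infrastructure monitoring, drone inspection, and handheld mapping benchmarks, the Jiffy algorithm consistently outcompresses competing lossless codecs while running at speeds above 65M/sec on a single core. In a typical autonomous driving use case, single-threaded Jiffy achieves 6x compression of centimeter-precision range scans at more than 500 scans per second. To ensure reproducibility and enable adoption, the software is freely available as an open-source library.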
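Systems biology seeks to create mathematical models of biological systems in order to reduce inherent biological complexity and provide predictions for applications such as therapeutic development. However, determining which mathematical model is correct, and how to arrive at the answer optimally, remains a challenge. We present an algorithm for automated biological model selection using mathematical models of systems biology and likelihood-free inference methods. Without a priori information, our algorithm shows improved performance in arriving at the correct model compared with the conventional heuristics used in experimental biology and with random search. The method shows promise for accelerating basic biological science and drug discovery.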
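The development of autonomous agents that can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides an overview of the group's ongoing research portfolio and discusses open problems for future directions.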
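Ad hoc teamwork (AHT) is the problem of creating an agent that must collaborate with previously unseen teammates without prior coordination. Many existing AHT methods can be categorised as type-based methods, which require a set of predefined teammates for training. Designing teammate types for training is a challenging problem that determines the generalisation performance of agents when dealing with teammate types unseen during training. In this work, we propose a method to discover diverse teammate types based on maximising best-response diversity metrics. We show that our proposed approach yields teammate types that require a wider range of best responses from the learner during collaboration, which potentially improves the robustness of the learner's performance in AHT compared to alternative methods.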
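We propose the novel few-shot teamwork (FST) problem, in which skilled agents trained in a team to complete one task are combined with skilled agents from different tasks, and together must learn to adapt to an unseen but related task. We discuss how the FST problem can be seen as addressing two separate problems: one of reducing the experience required to train a team of agents to complete a complex task, and one of collaborating with unfamiliar teammates to complete a new task. Progress towards solving FST could lead to progress in both multi-agent reinforcement learning and ad hoc teamwork.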