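Two of the most fundamental challenges in natural language understanding (NLU) at present are: (a) how to establish whether deep-learning-based models score highly on NLU benchmarks for the "right" reasons; and (b) understanding what those reasons would even be. We investigate the behavior of reading comprehension models with respect to two linguistic "skills": coreference resolution and comparison. We propose a definition of the reasoning steps expected from a system that would be "reading slowly", and compare it with the behavior of five models of the BERT family of various sizes, as observed through saliency scores and counterfactual explanations. We find that for comparison (but not coreference), systems based on larger encoders are more likely to rely on the "right" information, but even they struggle with generalization, suggesting that they still learn specific lexical patterns rather than the general principles of comparison.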
Translated by Google Translate
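Many efforts have been made to understand what grammatical knowledge (for example, the ability to identify a token's part of speech) is encoded in large pre-trained language models (LMs). This is done through "edge probing" (EP) tests: supervised classification tasks that predict a grammatical property of a span (whether it has a particular part of speech) using only the token representations from the LM encoder. However, most NLP applications fine-tune these LM encoders for specific tasks. Here, we ask: when an LM is fine-tuned, does its encoding of linguistic information change, as measured by EP tests? Specifically, we focus on the task of question answering (QA) and conduct experiments on multiple datasets. We find that EP test results do not change significantly, either when the fine-tuned model performs well or in adversarial settings where the model is forced to learn wrong correlations. From similar findings, some recent papers conclude that fine-tuning does not change the linguistic knowledge in encoders, but they offer no explanation. We find that the EP models themselves are prone to exploiting spurious correlations in the EP datasets. When this dataset bias is corrected, we do see an improvement in the EP test results.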
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
The demonstrated success of transfer learning has popularized approaches that involve pretraining models from massive data sources and subsequent finetuning towards a specific task. While such approaches have become the norm in fields such as natural language processing, implementation and evaluation of transfer learning approaches for chemistry are in the early stages. In this work, we demonstrate finetuning for downstream tasks on a graph neural network (GNN) trained over a molecular database containing 2.7 million water clusters. The use of Graphcore IPUs as an AI accelerator for training molecular GNNs reduces training time from a reported 2.7 days on 0.5M clusters to 1.2 hours on 2.7M clusters. Finetuning the pretrained model for downstream tasks of molecular dynamics and transfer to a different potential energy surface took only 8.3 hours and 28 minutes, respectively, on a single GPU.
Using only depth information, this paper introduces a novel and efficient local planning approach that improves both the computational efficiency and the planning performance of memoryless local planners. First, sampling is based on the depth data, which makes it possible to identify and eliminate a specific type of in-collision trajectory in the sampled motion primitive library. More specifically, all primitives whose endpoints are obscured are found by querying the depth values and are excluded from the sampled set, which can significantly reduce the computational workload of collision checking. Second, we propose a steering mechanism, also based on the depth information, that effectively prevents an autonomous vehicle from getting stuck when facing a large convex obstacle, providing a higher level of autonomy for the planning system. Our steering technique is theoretically proved to be complete in scenarios with convex obstacles. To evaluate the effectiveness of the proposed DEpth based both Sampling and Steering (DESS) methods, we implemented them in synthetic environments in which a simulated quadrotor flew through a cluttered region with multiple obstacles of different sizes. The results demonstrate that the proposed approach can considerably decrease computing time in local planners, so that more trajectories can be evaluated and a best path with much lower cost can be found. More importantly, the success rate, measured as the fraction of runs in which the robot successfully navigated to its destination in the different testing scenarios, is consistently higher than 99.6% on average.
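The depth-based pruning of obscured endpoints can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, endpoints already expressed in the camera frame with `z` pointing forward, and a hypothetical function name `prune_obscured_endpoints`. Dropping endpoints that fall outside the image is also an assumption of this sketch.

```python
from typing import List, Sequence, Tuple

# (x, y, z) in the camera frame, with z pointing forward (assumption).
Point = Tuple[float, float, float]


def prune_obscured_endpoints(
    endpoints: Sequence[Point],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    margin: float = 0.0,
) -> List[int]:
    """Return indices of primitives whose endpoint is visible in the depth image.

    An endpoint counts as obscured when it lies farther from the camera than
    the surface measured at the pixel it projects to.
    """
    height, width = len(depth), len(depth[0])
    kept: List[int] = []
    for i, (x, y, z) in enumerate(endpoints):
        if z <= 0.0:
            continue  # behind the camera: cannot be checked, dropped here
        # Pinhole projection of the endpoint onto the image plane.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # outside the field of view: dropped in this sketch
        if z + margin > depth[v][u]:
            continue  # endpoint is behind the measured surface: obscured
        kept.append(i)
    return kept
```

Only the primitives whose indices are returned would then go through full collision checking, which is where the reduction in workload described above would come from.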
Leveraging shared learning through Massively Multilingual Models, state-of-the-art machine translation models are often able to adapt to the paucity of data for low-resource languages. However, this performance comes at the cost of significantly bloated models which are not practically deployable. Knowledge Distillation is one popular technique for developing competitive, lightweight models. In this work, we first evaluate its use to compress MT models, focusing on languages with extremely limited training data. Through our analysis across 8 languages, we find that the performance of the distilled models varies with their dependence on priors, including the amount of synthetic data used for distillation, the student architecture, training hyperparameters, and the confidence of the teacher models, which makes distillation a brittle compression mechanism. To mitigate this, we explore the use of post-training quantization for the compression of these models. Here, we find that while distillation provides gains across some low-resource languages, quantization provides more consistent performance trends for the entire range of languages, especially the lowest-resource languages in our target set.
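To make the idea of post-training quantization concrete, here is a minimal sketch of symmetric per-tensor int8 weight quantization. The abstract does not state which quantization scheme was used, so the scheme, the function names `quantize_int8` and `dequantize`, and the choice of a single scale per tensor are all assumptions made for illustration.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.4, -1.0, 0.26, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Rounding error per weight is at most half a quantization step.
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(q, scale, worst)
```

No retraining is involved: the weights of an already trained model are rescaled and rounded, which is one reason the outcome depends less on training-time choices than distillation does.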
We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of-the-art models. We highlight commonalities between top approaches to the challenges and identify potential future directions for Embodied AI research.
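Many technologies have been adopted to simplify library management, but most of them focus on inventory management. There has been hardly any progress in automating the issuing and returning of library books. In universities and schools, hostel residents often forget to return issued books to the library on time. To address these problems and ensure the timely return of issued books, this work develops a book robot that resolves these complications. The robot can travel from point A to point B and scan and verify QR codes and barcodes. It will have a certain payload capacity for carrying books. QR code and barcode scanning will be enabled by a Pi camera, OpenCV, and a Raspberry Pi, making the book exchange secure. The rover operation of the robot will be controlled manually through the Blynk app. This paper focuses on how to reduce human intervention and automate the library management system with the help of a robot.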
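Because data are imbalanced and limited, semi-supervised medical image segmentation methods often fail to produce superior performance for some specific tail classes. Insufficient training on these particular classes may introduce more noise and thereby affect overall learning. To alleviate this shortcoming and identify the under-performing classes, we propose maintaining a confidence array that records class-wise performance during training. A fuzzy fusion of these confidence scores is proposed to adaptively prioritize the individual confidence metrics in every sample, in contrast to traditional ensemble approaches, in which a set of predefined fixed weights is assigned to all test cases. In addition, we introduce a robust class-wise sampling method and dynamic stabilization for a better training strategy. Our proposed method takes all under-performing classes into account with dynamic weighting and tries to remove most of the noise during training. Evaluated on two cardiac MRI datasets, ACDC and MMWHS, our proposed method shows effectiveness and generalizability and outperforms several state-of-the-art methods found in the literature.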
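A class-wise confidence array that drives sampling could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not say how the array is updated or how sampling weights are derived, so the exponential moving average, the `1 - confidence` weighting, and the class name `ClassConfidence` are hypothetical. The fuzzy fusion step is not shown.

```python
from typing import List


class ClassConfidence:
    """Tracks a per-class performance score in [0, 1] during training."""

    def __init__(self, num_classes: int, momentum: float = 0.9) -> None:
        self.values: List[float] = [0.0] * num_classes
        self.momentum = momentum
        self._seen: List[bool] = [False] * num_classes

    def update(self, class_id: int, score: float) -> None:
        """Fold a new per-class score (e.g. a Dice score) into the array."""
        if not self._seen[class_id]:
            # The first observation initializes the entry directly.
            self.values[class_id] = score
            self._seen[class_id] = True
        else:
            m = self.momentum
            self.values[class_id] = m * self.values[class_id] + (1.0 - m) * score

    def sampling_weights(self, eps: float = 1e-6) -> List[float]:
        """Give under-performing classes a larger share of the samples."""
        raw = [1.0 - v + eps for v in self.values]
        total = sum(raw)
        return [r / total for r in raw]
```

With such weights, a class whose recorded score stays low would be drawn more often in later iterations, which is one simple way to direct training effort toward under-performing classes.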
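Natural language inference (NLI) tasks often require reasoning through multiple steps to reach a conclusion. Although the need to generate such intermediate steps (rather than a summary explanation) has gained popular support, it remains unclear how to generate them without complete end-to-end supervision and how the generated steps can be put to further use. In this work, we train a sequence-to-sequence model to generate only the next step given an NLI premise and hypothesis pair (and the previous steps); we then enhance it with external knowledge and symbolic search to generate intermediate steps with only next-step supervision. We show the correctness of the generated steps through automated and human verification. Furthermore, we show that the generated steps can help improve end-to-end NLI task performance on multiple public NLI datasets using simple data augmentation strategies.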