Smart Paper Notes

SPICE, A Dataset of Drug-like Molecules and Peptides for Training Machine Learning Potentials

Peter Eastman , Pavan Kumar Behara , David L. Dotson , Raimondas Galvelis , John E. Herr , Josh T. Horton , Yuezhi Mao , John D. Chodera , Benjamin P. Pritchard , Yuanqing Wang

Category: Machine Learning

2022-09-21

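Machine learning potentials are an important tool for molecular simulation, but their development is held back by a shortage of high-quality datasets to train them on. We describe the SPICE dataset, a new quantum chemistry dataset for training potentials relevant to simulating drug-like small molecules interacting with proteins. It contains over 1.1 million conformations of small molecules, dimers, dipeptides, and solvated amino acids. It includes 15 elements, charged and uncharged molecules, and a wide range of covalent and non-covalent interactions. It provides both forces and energies calculated at the ωB97M-D3(BJ)/def2-TZVPPD level of theory, along with other useful quantities such as multipole moments and bond orders. We train a set of machine learning potentials on it and demonstrate that they can achieve chemical accuracy across a broad region of chemical space. It can serve as a valuable resource for the creation of transferable, ready-to-use potential functions for molecular simulations.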

translated by Google Translate

Settling the Reward Hypothesis

Michael Bowling , John D. Martin , David Abel , Will Dabney

Categories: Artificial Intelligence | Machine Learning

2022-12-20

The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)." We aim to fully settle this hypothesis. This will not conclude with a simple affirmation or refutation, but rather specify completely the implicit requirements on goals and purposes under which the hypothesis holds.
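
The quantity named in the hypothesis, the expected value of the cumulative sum of a scalar reward, can be made concrete with a minimal sketch. The function names and the optional `discount` parameter are assumptions for illustration; the abstract itself does not mention discounting, and the default of 1.0 gives the plain cumulative sum.

```python
from typing import Sequence


def cumulative_reward(rewards: Sequence[float], discount: float = 1.0) -> float:
    """Sum the scalar rewards received along one trajectory.

    `discount` is an illustrative assumption; 1.0 means a plain sum.
    """
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


def expected_return(trajectories: Sequence[Sequence[float]], discount: float = 1.0) -> float:
    """Monte Carlo estimate of the expected cumulative reward.

    Averages the cumulative reward over a set of sampled trajectories.
    """
    returns = [cumulative_reward(t, discount) for t in trajectories]
    return sum(returns) / len(returns)
```

Under the hypothesis, an agent's goal corresponds to choosing behavior that maximizes `expected_return`; the paper's contribution is specifying when that correspondence holds.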

SMSMix: Sense-Maintained Sentence Mixup for Word Sense Disambiguation

Hee Suk Yoon , Eunseop Yoon , John Harvill , Sunjae Yoon , Mark Hasegawa-Johnson , Chang D. Yoo

Categories: Natural Language Processing | Machine Learning

2022-12-14

Word Sense Disambiguation (WSD) is an NLP task aimed at determining the correct sense of a word in a sentence from discrete sense choices. Although current systems have attained unprecedented performances for such tasks, the nonuniform distribution of word senses during training generally results in systems performing poorly on rare senses. To this end, we consider data augmentation to increase the frequency of these least frequent senses (LFS) to reduce the distributional bias of senses during training. We propose Sense-Maintained Sentence Mixup (SMSMix), a novel word-level mixup method that maintains the sense of a target word. SMSMix smoothly blends two sentences using mask prediction while preserving the relevant span determined by saliency scores to maintain a specific word's sense. To the best of our knowledge, this is the first attempt to apply mixup in NLP while preserving the meaning of a specific word. With extensive experiments, we validate that our augmentation method can effectively give more information about rare senses during training with maintained target sense label.

Issues and Challenges in Applications of Artificial Intelligence to Nuclear Medicine -- The Bethesda Report (AI Summit 2022)

Arman Rahmim , Tyler J. Bradshaw , Irène Buvat , Joyita Dutta , Abhinav K. Jha , Paul E. Kinahan , Quanzheng Li , Chi Liu , Melissa D. McCradden , Babak Saboury

Category: Artificial Intelligence

2022-11-07

The SNMMI Artificial Intelligence (SNMMI-AI) Summit, organized by the SNMMI AI Task Force, took place in Bethesda, MD on March 21-22, 2022. It brought together various community members and stakeholders from academia, healthcare, industry, patient representatives, and government (NIH, FDA), and considered various key themes to envision and facilitate a bright future for routine, trustworthy use of AI in nuclear medicine. In what follows, essential issues, challenges, controversies and findings emphasized in the meeting are summarized.

An Algebraic Framework for Stock & Flow Diagrams and Dynamical Systems Using Category Theory

John C. Baez , Xiaoyan Li , Sophie Libkind , Nathaniel D. Osgood , Eric Redekopp

Category: Natural Language Processing

2022-11-01

Stock and flow diagrams are already an important tool in epidemiology, but category theory lets us go further and treat these diagrams as mathematical entities in their own right. In this chapter we use communicable disease models created with our software, StockFlow.jl, to explain the benefits of the categorical approach. We first explain the category of stock-flow diagrams, and note the clear separation between the syntax of these diagrams and their semantics, demonstrating three examples of semantics already implemented in the software: ODEs, causal loop diagrams, and system structure diagrams. We then turn to two methods for building large stock-flow diagrams from smaller ones in a modular fashion: composition and stratification. Finally, we introduce the open-source ModelCollab software for diagram-based collaborative modeling. The graphical user interface of this web-based software lets modelers take advantage of the ideas discussed here without any knowledge of their categorical foundations.
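
The separation between a diagram's syntax and its ODE semantics can be illustrated with a small sketch. This is written in plain Python and is not the StockFlow.jl API: the data layout, the function names, and the SIR parameter values `BETA` and `GAMMA` are all assumptions chosen for illustration. The diagram is just data (stocks plus flows between them), and the ODE semantics reads it as "rate of change of a stock = inflows minus outflows".

```python
from typing import Callable, Dict, List, Tuple

Stocks = Dict[str, float]
# A flow moves quantity from a source stock to a target stock
# at a rate that may depend on the current stock values.
Flow = Tuple[str, str, Callable[[Stocks], float]]

BETA, GAMMA = 0.3, 0.1  # illustrative infection and recovery rates (assumed)


def infection(s: Stocks) -> float:
    total = s["S"] + s["I"] + s["R"]
    return BETA * s["S"] * s["I"] / total


def recovery(s: Stocks) -> float:
    return GAMMA * s["I"]


# Syntax: an SIR diagram with two flows, S -> I and I -> R.
SIR_FLOWS: List[Flow] = [("S", "I", infection), ("I", "R", recovery)]


def derivatives(stocks: Stocks, flows: List[Flow]) -> Stocks:
    """ODE semantics: each stock changes by inflows minus outflows."""
    d = {name: 0.0 for name in stocks}
    for source, target, rate in flows:
        r = rate(stocks)
        d[source] -= r
        d[target] += r
    return d


def simulate(stocks: Stocks, flows: List[Flow], dt: float, steps: int) -> Stocks:
    """Integrate with explicit Euler steps (a simple, assumed solver)."""
    for _ in range(steps):
        d = derivatives(stocks, flows)
        stocks = {name: value + dt * d[name] for name, value in stocks.items()}
    return stocks
```

Because every flow subtracts from one stock exactly what it adds to another, the total population is conserved by construction, which is one property that becomes visible once the diagram is treated as a mathematical object.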

Domain-Specific Text Generation for Machine Translation

Yasmin Moslem , Rejwanul Haque , John D. Kelleher , Andy Way

Category: Natural Language Processing

2022-08-11

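Preserving domain knowledge from source to target is crucial in any translation workflow. In the translation industry it is common to receive highly specialized projects for which hardly any parallel in-domain data exists. In such scenarios, where there is insufficient in-domain data to fine-tune Machine Translation (MT) models, producing translations that are consistent with the relevant context is challenging. In this work, we propose a novel approach to domain adaptation that leverages state-of-the-art pretrained language models (LMs) for domain-specific data augmentation for MT, simulating the domain characteristics of either (a) a small bilingual dataset, or (b) the monolingual source text to be translated. Combining this idea with back-translation, we can generate large amounts of synthetic bilingual in-domain data for both use cases. For our investigation, we use the state-of-the-art Transformer architecture. We employ mixed fine-tuning to train models that significantly improve translation of in-domain texts. More specifically, in the two scenarios our proposed methods achieve improvements of approximately 5-6 BLEU and 2-3 BLEU, respectively, on the Arabic-to-English and English-to-Arabic language pairs. Furthermore, the outcome of human evaluation corroborates the automatic evaluation results.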

Deep Learning-Based Objective and Reproducible Osteosarcoma Chemotherapy Response Assessment and Outcome Prediction

David Joon Ho , Narasimhan P. Agaram , Marc-Henri Jean , Stephanie D. Suser , Cynthia Chu , Chad M. Vanderbilt , Paul A. Meyers , Leonard H. Wexler , John H. Healey , Thomas J. Fuchs

Category: Computer Vision

2022-08-09

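Osteosarcoma is the most common primary bone cancer, and its standard treatment includes pre-operative chemotherapy followed by resection. Chemotherapy response is used to predict prognosis and guide further management of patients. Necrosis is routinely assessed after chemotherapy from histology slides of resection specimens, where the necrosis ratio is defined as the ratio of necrotic tumor to overall tumor. Patients with a necrosis ratio >=90% are known to have better outcomes. Manual microscopic review of the necrosis ratio from multiple glass slides is semi-quantitative and can have intra- and inter-observer variability. We propose an objective and reproducible deep learning-based approach to estimate the necrosis ratio and predict outcome from scanned hematoxylin and eosin whole slide images (WSIs). We collected 103 osteosarcoma cases with 3134 WSIs to train our deep learning model, validate necrosis ratio assessment, and evaluate outcome prediction. We trained a Deep Multi-Magnification Network to segment multiple tissue subtypes, including viable tumor and necrotic tumor, at the pixel level and to calculate a case-level necrosis ratio from multiple WSIs. We show that the necrosis ratio estimated by our segmentation model correlates highly with the necrosis ratio from pathology reports manually assessed by experts, where the mean absolute differences for Grade IV (100%), III (>=90%), and II (>=50% and <90%) necrosis response are 4.4%, 4.5%, and 17.8%, respectively. We successfully stratified patients to predict overall survival (OS) with p = 10^-6 and progression-free survival (PFS) with p = 0.012. Our reproducible approach, free of observer variability, enabled us to tune cutoff thresholds specifically for our model and our set of cases, at 80% for OS and 60% for PFS. Our study indicates that deep learning can support pathologists as an objective tool to analyze osteosarcoma from histology for assessing treatment response and predicting patient outcome.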
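
The necrosis ratio defined in the abstract (necrotic tumor over overall tumor) and the grade boundaries it lists can be sketched as follows. Pooling pixel counts across all slides of a case is an assumption about the aggregation, since the abstract only says the case-level ratio is calculated from multiple WSIs; the function names and the "below II" label for ratios under 50% are also illustrative and not taken from the source.

```python
from typing import Iterable, Tuple


def case_necrosis_ratio(slides: Iterable[Tuple[int, int]]) -> float:
    """Case-level necrosis ratio from per-slide segmentation counts.

    Each slide is (viable_tumor_pixels, necrotic_tumor_pixels).
    Counts are pooled across slides (an assumed aggregation rule).
    """
    viable = 0
    necrotic = 0
    for viable_pixels, necrotic_pixels in slides:
        viable += viable_pixels
        necrotic += necrotic_pixels
    total = viable + necrotic
    if total == 0:
        raise ValueError("no tumor pixels were segmented for this case")
    return necrotic / total


def response_grade(ratio: float) -> str:
    """Map a necrosis ratio to the grades listed in the abstract."""
    if ratio >= 1.0:
        return "IV"   # 100% necrosis
    if ratio >= 0.9:
        return "III"  # >=90%
    if ratio >= 0.5:
        return "II"   # >=50% and <90%
    return "below II"  # not named in the abstract; placeholder label
```

For example, a case with two slides counted as (100 viable, 900 necrotic) and (0 viable, 1000 necrotic) pools to 1900 necrotic out of 2000 tumor pixels, which falls in the Grade III range.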

GLEAM: Greedy Learning for Large-Scale Accelerated MRI Reconstruction

Batu Ozturkler , Arda Sahiner , Tolga Ergen , Arjun D Desai , Christopher M Sandino , Shreyas Vasanawala , John M Pauly , Morteza Mardani , Mert Pilanci

Category: Computer Vision

2022-07-18

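Unrolled neural networks have recently achieved state-of-the-art accelerated MRI reconstruction. These networks unroll iterative optimization algorithms by alternating between physics-based consistency and neural-network-based regularization. However, they require several iterations of a large neural network to handle high-dimensional imaging tasks such as 3D MRI. This limits traditional training algorithms based on backpropagation, because computing gradients and storing intermediate activations demand prohibitively large memory and compute. To address this challenge, we propose Greedy LEarning for Accelerated MRI (GLEAM) reconstruction, an efficient training strategy for high-dimensional imaging settings. GLEAM splits the end-to-end network into decoupled network modules. Each module is optimized in a greedy manner with decoupled gradient updates, reducing the memory footprint during training. We show that the decoupled gradient updates can be performed in parallel on multiple graphics processing units (GPUs) to further reduce training time. We present experiments with 2D and 3D datasets, including multi-coil knee, brain, and dynamic cardiac cine MRI. We observe that: i) GLEAM generalizes as well as state-of-the-art memory-efficient baselines such as gradient checkpointing and invertible networks with the same memory footprint, but with 1.3x faster training; ii) for the same memory footprint, GLEAM yields a 1.1 dB PSNR gain in 2D and 1.8 dB in 3D over end-to-end baselines.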
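
The idea of greedy, decoupled module updates can be illustrated with a deliberately tiny sketch. This is not the GLEAM implementation: each "module" here is a single scalar weight, the local loss is a squared error against the target, and all names and hyperparameters are assumptions for illustration. What it shows is the structure: every module is updated from its own local loss, and its output is passed forward as a constant, so no gradient ever flows back through earlier modules.

```python
from typing import List, Sequence


def forward(weights: Sequence[float], x: float) -> float:
    """Run the input through all modules in sequence."""
    for w in weights:
        x = w * x
    return x


def greedy_train(
    weights: Sequence[float],
    x: float,
    y: float,
    lr: float = 0.05,
    steps: int = 200,
) -> List[float]:
    """Train each module on its own local loss (output - y)^2.

    Only the current module's input and output are needed to compute
    its gradient, which is the source of the memory saving.
    """
    weights = list(weights)
    for _ in range(steps):
        module_input = x
        for i, w in enumerate(weights):
            output = w * module_input
            # Gradient of the local loss with respect to this module's weight only.
            grad = 2.0 * (output - y) * module_input
            weights[i] = w - lr * grad
            # The output is handed to the next module as a constant.
            module_input = output
    return weights
```

With an end-to-end loss, the gradient for the first module would depend on every later module's activations; here each update is local, which is also why the updates could be placed on separate devices.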

An Approach to Ensure Fairness in News Articles

Shaina Raza , Deepak John Reji , Dora D. Liu , Syed Raza Bashir , Usman Naseem

Category: Natural Language Processing

2022-07-08

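Recommender systems, information retrieval, and other information access systems pose unique challenges for examining and applying concepts of fairness and bias mitigation in unstructured text. This paper introduces Dbias, a Python package to ensure fairness in news articles. Dbias is a trained Machine Learning (ML) pipeline that takes a text (e.g., a paragraph or news story) and detects whether the text is biased. It then detects the biased words in the text, masks them, and recommends a set of sentences with new words that are bias-free or at least less biased. We incorporate elements of data science best practices to ensure that this pipeline is reproducible and usable. We show in experiments that this pipeline can be effective for mitigating biases and outperforms common neural network architectures in ensuring fairness in news articles.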

Interpretable Hidden Markov Model-Based Deep Reinforcement Learning Hierarchical Framework for Predictive Maintenance of Turbofan Engines

Ammar N. Abbas , Georgios Chasparis , John D. Kelleher

Categories: Machine Learning | Artificial Intelligence

2022-06-27

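An open research question in deep reinforcement learning is how to focus policy learning on the key decisions within a sparse domain. This paper emphasizes combining the advantages of hidden Markov models and reinforcement learning to move towards interpretable maintenance decisions. We propose a novel hierarchical modeling methodology that, at a high level, detects and interprets the root cause of a failure as well as the health degradation of the turbofan engine, while at a low level it provides the optimal replacement policy. It outperforms the baseline performance of deep reinforcement learning methods applied directly to the raw data or used with a hidden Markov model without such a specialized hierarchy. It also provides performance comparable to prior work, with the additional benefit of interpretability.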
