Smart Paper Notes

Using Twitter Data to Understand Public Perceptions of Approved versus Off-label Use for COVID-19-related Medications

Yining Hua , Hang Jiang , Shixu Lin , Jie Yang , Joseph M. Plasek , David W. Bates , Li Zhou

Categories: Natural Language Processing | Machine Learning

2022-06-29

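Understanding public discourse on the emergency use of unproven therapeutics is essential for monitoring safe use and combating misinformation. We developed a natural language processing (NLP)-based pipeline to understand public perceptions of, and stances on, COVID-19-related drugs. This retrospective study included 609,189 US-based tweets posted between January 29, 2020 and November 30, 2021 about four drugs that drew wide public attention during the COVID-19 pandemic: 1) hydroxychloroquine and ivermectin, drug therapies supported only by anecdotal evidence; and 2) molnupiravir and remdesivir, FDA-approved treatment options for eligible patients. Time-trend analysis was used to understand popularity and related events. Content and demographic analyses were conducted to explore the potential rationales behind people's stances on each drug. The time-trend analysis showed that hydroxychloroquine and ivermectin were discussed far more than molnupiravir and remdesivir, particularly during COVID-19 surges. Hydroxychloroquine and ivermectin were highly politicized and associated with conspiracy theories, hearsay, celebrity effects, and similar factors. The distribution of stances differed significantly between the two major US political parties (p < 0.001); Republicans were more likely than Democrats to support hydroxychloroquine (+55%) and ivermectin (+30%). People with healthcare backgrounds tended to oppose hydroxychloroquine more than the general population did (+7%). In contrast, the general population was more likely to support ivermectin (+14%). We make all data, code, and models available at https://github.com/ningkko/covid-drug.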

Translated by Google Translate

Conservation Tools: The Next Generation of Engineering--Biology Collaborations

Andrew Schulz , Cassie Shriver , Suzanne Stathatos , Benjamin Seleb , Emily Weigel , Young-Hui Chang , M. Saad Bhamla , David Hu , Joseph R. Mendelson III , .

Categories: Machine Learning

2023-01-03

The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.

Demonstration of machine-learning-enhanced Bayesian quantum state estimation

Sanjaya Lohani , Joseph M. Lukens , Atiyya A. Davis , Amirali Khannejad , Sangita Regmi , Daniel E. Jones , Ryan T. Glasser , Thomas A. Searles , Brian T. Kirby

Categories: Machine Learning

2022-12-15

Machine learning (ML) has found broad applicability in quantum information science in topics as diverse as experimental design, state classification, and even studies on quantum foundations. Here, we experimentally realize an approach for defining custom prior distributions that are automatically tuned using ML for use with Bayesian quantum state estimation methods. Previously, researchers have looked to Bayesian quantum state tomography due to its unique advantages like natural uncertainty quantification, the return of reliable estimates under any measurement condition, and minimal mean-squared error. However, practical challenges related to long computation times and conceptual issues concerning how to incorporate prior knowledge most suitably can overshadow these benefits. Using both simulated and experimental measurement results, we demonstrate that ML-defined prior distributions reduce net convergence times and provide a natural way to incorporate both implicit and explicit information directly into the prior distribution. These results constitute a promising path toward practical implementations of Bayesian quantum state tomography.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Scope of Pre-trained Language Models for Detecting Conflicting Health Information

Joseph Gatto , Madhusudan Basak , Sarah M. Preum

Categories: Natural Language Processing

2022-09-22

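A growing number of people now rely on online platforms to meet their health information needs. Identifying inconsistent or conflicting textual health information has therefore become a critical task. Health advice data poses a unique challenge: information that is accurate in the context of one diagnosis may conflict with advice given in the context of another. For example, people with both diabetes and hypertension often receive conflicting dietary advice. This motivates the need for technologies that can provide contextualized, user-specific health advice. A crucial step toward contextualized advice is the ability to compare health advice statements and detect whether and how they conflict. This is the task of health conflict detection (HCD). Given two pieces of health advice, the goal of HCD is to detect the conflict and classify its type. The task is challenging because (i) automatically identifying and classifying conflicts requires a deeper understanding of the semantics of the text, and (ii) the amount of available data is very limited. In this study, we are the first to explore HCD in the context of pre-trained language models. We find that DeBERTa-v3 achieves a mean F1 score of 0.68 across all experiments. We also investigate the challenges posed by different conflict types and how synthetic data improves a model's understanding of conflict-specific semantics. Finally, we highlight the difficulty of collecting real health conflicts and propose a human-in-the-loop synthetic data augmentation approach to expand existing HCD datasets. Our HCD training dataset is more than 2x larger than the existing HCD dataset and is publicly available on GitHub.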

Operationalizing Machine Learning: An Interview Study

Shreya Shankar , Rolando Garcia , Joseph M. Hellerstein , Aditya G. Parameswaran

Categories: Machine Learning

2022-09-16

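Organizations rely on machine learning engineers (MLEs) to operationalize ML, that is, to deploy and maintain ML pipelines in production. The process of operationalizing ML, or MLOps, consists of a continual loop of (i) data collection and labeling, (ii) experimentation to improve ML performance, (iii) evaluation throughout a multi-stage deployment process, and (iv) monitoring for performance drops in production. Considered together, these responsibilities seem staggering: how does anyone do MLOps, what challenges remain unaddressed, and what are the implications for tool builders? We conducted semi-structured ethnographic interviews with 18 MLEs working across many applications, including chatbots, autonomous vehicles, and finance. Our interviews expose three variables that govern the success of a production ML deployment: velocity, validation, and versioning. We summarize common practices for successful ML experimentation, deployment, and sustaining production performance. Finally, we discuss interviewees' pain points and anti-patterns, with implications for tool design.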

Adaptive Complexity Model Predictive Control

Joseph Norby , Ardalan Tajbakhsh , Yanhao Yang , Aaron M. Johnson

Categories: Robotics

2022-09-06

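This work introduces a formulation of model predictive control (MPC) that adapts the complexity of the model to the task while maintaining feasibility and stability guarantees. Existing MPC implementations often handle computational complexity by shortening the prediction horizon or simplifying the model, both of which can lead to instability. Inspired by related approaches in behavioral economics, motion planning, and biomechanics, our method solves the MPC problem with a simple model of the dynamics and constraints over regions of the horizon where such a model is feasible, and with a complex model where it is not. The approach interleaves planning and execution to iteratively identify these regions, which can be safely simplified if they satisfy an exact template/anchor relationship. We show that the method does not compromise the stability and feasibility properties of the system, and we measure its performance in simulation experiments on a quadruped executing agile behaviors. We find that, compared with fixed-complexity implementations, this adaptive method enables more agile motion and expands the range of executable tasks.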

Fast Kernel Density Estimation with Density Matrices and Random Fourier Features

Joseph A. Gallego M. , Juan F. Osorio , Fabio A. González

Categories: Machine Learning

2022-08-02

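Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. It is a memory-based method, meaning that it uses the entire training data set for prediction, which makes it unsuitable for most current big data applications. Several strategies, such as tree-based or hashing-based estimators, have been proposed to improve the efficiency of kernel density estimation. The novel density matrix kernel density estimation method (DMKDE) uses density matrices, a quantum mechanical formalism, and random Fourier features, an explicit kernel approximation, to produce density estimates. The method has its roots in KDE and can be regarded as an approximation of it without the memory-based restriction. In this paper, we systematically evaluate the novel DMKDE algorithm and compare it with other state-of-the-art fast procedures for approximating kernel density estimation on different synthetic data sets. Our experimental results show that DMKDE is on par with its competitors at computing density estimates and shows advantages on high-dimensional data. We make all the code available as an open-source software repository.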
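
The contrast between memory-based KDE and an explicit kernel approximation can be illustrated with a minimal sketch. It is not the authors' DMKDE implementation: for simplicity it compresses the training set into a mean feature vector rather than a density matrix, and all names (`make_rff`, `fit_mean_embedding`, `approx_kde`) and parameter values are assumptions made for illustration. It uses random Fourier features for the Gaussian kernel exp(-gamma * ||x - y||^2).

```python
import math
import random


def make_rff(dim, n_features, gamma, rng):
    """Return a map phi with phi(x).phi(y) ~ exp(-gamma * ||x - y||^2)."""
    std = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma * I)
    ws = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]
    return phi


def exact_kde(x, data, gamma):
    """Memory-based KDE: visits every training point at prediction time."""
    norm = (gamma / math.pi) ** (len(x) / 2.0)  # Gaussian kernel normalizer
    total = sum(math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, p)))
                for p in data)
    return norm * total / len(data)


def fit_mean_embedding(data, phi):
    """Compress the training set into one vector of length n_features."""
    feats = [phi(p) for p in data]
    return [sum(col) / len(feats) for col in zip(*feats)]


def approx_kde(x, mean_vec, phi, gamma):
    """Approximate KDE: cost depends on n_features, not on the data size."""
    norm = (gamma / math.pi) ** (len(x) / 2.0)
    return norm * sum(a * b for a, b in zip(phi(x), mean_vec))


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0)] for _ in range(200)]  # hypothetical 1-D sample
    phi = make_rff(dim=1, n_features=2000, gamma=2.0, rng=rng)
    mu = fit_mean_embedding(data, phi)
    for q in ([0.0], [1.0], [3.0]):
        print(q, exact_kde(q, data, 2.0), approx_kde(q, mu, phi, 2.0))
```

After fitting, the training data can be discarded: a query only needs `phi` and the fixed-length vector, which is the property the abstract describes as removing the memory-based restriction.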

Quantum Adaptive Fourier Features for Neural Density Estimation

Joseph A. Gallego M. , Fabio A. González

Categories: Machine Learning | Machine Learning (Statistics)

2022-08-01

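Density estimation is a fundamental task in statistics and machine learning applications. Kernel density estimation is a powerful tool for nonparametric density estimation in low dimensions; however, its performance is poor in higher dimensions. Moreover, its prediction complexity scales linearly with the number of training data points. This paper presents a method for neural density estimation that can be seen as a type of kernel density estimation, but without the high prediction computational complexity. The method is based on density matrices, a formalism used in quantum mechanics, and adaptive Fourier features. The method can be trained without optimization, but it can also be integrated with deep learning architectures and trained using gradient descent. It can therefore be seen as a form of neural density estimation. The method was evaluated on different synthetic and real data sets, and its performance was compared against state-of-the-art neural density estimation methods, obtaining competitive results.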
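
One way to read "trained without optimization" is that the density matrix is simply the average of outer products of feature vectors. The sketch below follows that reading as an assumption; it is not the authors' implementation. It uses fixed random Fourier features instead of adaptive ones, handles 1-D inputs only, and returns unnormalized scores. The function names are hypothetical.

```python
import math
import random


def random_features(n_features, gamma, rng):
    """Fixed (non-adaptive) random Fourier features for 1-D inputs."""
    std = math.sqrt(2.0 * gamma)
    ws = [rng.gauss(0.0, std) for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(w * x + b) for w, b in zip(ws, bs)]
    return phi


def fit_density_matrix(points, phi):
    """rho = (1/N) * sum_i phi(x_i) phi(x_i)^T, built by plain averaging."""
    feats = [phi(x) for x in points]
    d = len(feats[0])
    rho = [[0.0] * d for _ in range(d)]
    for f in feats:
        for i in range(d):
            fi, row = f[i], rho[i]
            for j in range(d):
                row[j] += fi * f[j]
    n = len(feats)
    return [[v / n for v in row] for row in rho]


def score(x, rho, phi):
    """Unnormalized density score phi(x)^T rho phi(x).

    The cost depends only on the feature dimension, not on the
    number of training points.
    """
    f = phi(x)
    d = len(f)
    return sum(f[i] * sum(rho[i][j] * f[j] for j in range(d)) for i in range(d))


if __name__ == "__main__":
    rng = random.Random(0)
    phi = random_features(32, 1.0, rng)
    points = [rng.gauss(0.0, 1.0) for _ in range(50)]  # hypothetical sample
    rho = fit_density_matrix(points, phi)
    print(score(0.0, rho, phi), score(4.0, rho, phi))
```

The quadratic form equals the mean of squared feature inner products, (1/N) * sum_i (phi(x) . phi(x_i))^2, so it behaves like a kernel density estimate while storing only a fixed-size matrix.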

An autonomous robot for pruning modern, planar fruit trees

Alexander You , Nidhi Parayil , Josyula Gopala Krishna , Uddhav Bhattarai , Ranjan Sapkota , Dawood Ahmed , Matthew Whiting , Manoj Karkee , Cindy M. Grimm , Joseph R. Davidson

Categories: Robotics

2022-06-14

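Dormant pruning of fruit trees is an important task for maintaining tree health and ensuring high-quality fruit. Because of decreasing labor availability, pruning is a prime candidate for robotic automation. However, pruning is also a uniquely difficult problem for robots, requiring perception, pruning-point determination, and manipulation under variable lighting conditions and in complex, highly unstructured environments. In this paper, we present a system for pruning sweet cherry trees (in a planar tree architecture called an upright fruiting offshoot configuration) that integrates various subsystems from our previous work on perception and manipulation. The resulting system is capable of operating completely autonomously and requires minimal control of the environment. We validate the system's performance through field trials in a sweet cherry orchard, ultimately achieving a cutting success rate of 58%. Although it is not fully robust and its throughput needs improvement, our system is the first to operate on fruit trees and represents a useful base platform that can be improved in the future.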
