Smart Paper Notes

ProbNum: Probabilistic Numerics in Python

Jonathan Wenger, Nicholas Krämer, Marvin Pförtner, Jonathan Schmidt, Nathanael Bosch, Nina Effenberger, Johannes Zenn, Alexandra Gessner, Toni Karvonen, François-Xavier Briol

Category: Machine Learning

2021-12-03

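Probabilistic numerical methods (PNMs) solve numerical problems via probabilistic inference. They have been developed for linear algebra, optimization, integration, and differential equation simulation. PNMs naturally incorporate prior information about a problem and quantify uncertainty due to finite computational resources as well as stochastic input. In this paper, we present ProbNum: a Python library providing state-of-the-art probabilistic numerical solvers. ProbNum enables custom composition of PNMs for specific problem classes via a modular design, as well as wrappers for off-the-shelf use. Tutorials, documentation, developer guides, and benchmarks are available online at www.probnum.org.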

Translated by Google Translate

POT: Python Optimal Transport.

Category:

Optimal transport has recently been reintroduced to the machine learning community thanks in part to novel efficient optimization procedures allowing for medium to large scale applications. We propose a Python toolbox that implements several key optimal transport ideas for the machine learning community. The toolbox contains implementations of a number of founding works of OT for machine learning, such as the Sinkhorn algorithm and Wasserstein barycenters, but also provides generic solvers that can be used for conducting novel fundamental research. This toolbox, named POT for Python Optimal Transport, is open source with an MIT license.
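As a rough illustration of the Sinkhorn algorithm mentioned in this abstract, the sketch below implements entropic-regularized optimal transport directly in NumPy. It does not use the POT API; the function name `sinkhorn`, the regularization value, and the toy histograms are assumptions made for this example only.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.5, n_iters=200):
    """Approximate an entropic-regularized transport plan by Sinkhorn scaling.

    a, b : source and target histograms (non-negative, each summing to 1)
    cost : pairwise cost matrix of shape (len(a), len(b))
    reg  : entropic regularization strength (illustrative default)
    """
    K = np.exp(-cost / reg)        # Gibbs kernel derived from the cost
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)          # rescale so column sums match b
        u = a / (K @ v)            # rescale so row sums match a
    return u[:, None] * K * v[None, :]

# Toy data: two histograms on a 1-D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
cost = (x[:, None] - x[None, :]) ** 2
plan = sinkhorn(a, b, cost)
```

The returned `plan` is a matrix whose row sums reproduce `a` and whose column sums approach `b` as the iterations converge; a smaller `reg` gives a plan closer to unregularized OT but typically needs more iterations.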


NetKet 3: Machine Learning Toolbox for Many-Body Quantum Systems

Filippo Vicentini, Damian Hofmann, Attila Szabó, Dian Wu, Christopher Roth, Clemens Giuliani, Gabriel Pescia, Jannes Nys, Vladimir Vargas-Calderon, Nikita Astrakhantsev

Category: Machine Learning

2021-12-20

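We introduce version 3 of NetKet, the machine learning toolbox for many-body quantum physics. NetKet is built around neural-network quantum states and provides efficient algorithms for their evaluation and optimization. This new version is built on top of JAX, a differentiable programming and accelerated linear algebra framework for the Python programming language. The most significant new feature is the possibility to define arbitrary neural network ansätze in pure Python code using the concise notation of machine-learning frameworks, which allows for just-in-time compilation as well as the implicit generation of gradients thanks to automatic differentiation. NetKet 3 also comes with support for GPU and TPU accelerators, advanced support for discrete symmetry groups, chunking to scale up to thousands of degrees of freedom, drivers for quantum dynamics applications, and improved modularity, allowing users to use only parts of the toolbox as a foundation for their own code.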


Array Programming with NumPy

Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith

Category:

2020-06-18

Array programming provides a powerful, compact, expressive syntax for accessing, manipulating, and operating on data in vectors, matrices, and higher-dimensional arrays [1]. NumPy is the primary array programming library for the Python language [2,3,4,5]. It plays an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, material science, engineering, finance, and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [6] and the first imaging of a black hole [7]. Here we show how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring, and analyzing scientific data. NumPy is the foundation upon which the entire scientific Python universe is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Because of its central position in the ecosystem, NumPy increasingly plays the role of an interoperability layer between these new array computation libraries.
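To make the "few fundamental array concepts" a little more concrete, here is a small sketch of reductions, broadcasting, and boolean-mask indexing in NumPy. The array contents and variable names are invented for illustration and are not taken from the paper.

```python
import numpy as np

# A 3 x 4 array: think of it as 3 samples with 4 features each.
data = np.arange(12, dtype=float).reshape(3, 4)

# Reduction: average over the sample axis, giving one mean per feature.
col_means = data.mean(axis=0)

# Broadcasting: the (4,) vector of means is stretched across all 3 rows.
centered = data - col_means

# Boolean-mask indexing: keep the rows whose first feature exceeds 2.
mask = data[:, 0] > 2
selected = data[mask]
```

None of these steps needs an explicit Python loop; the loops run inside NumPy's compiled routines, which is a large part of why the array style is both compact and fast.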


Universal Differential Equations for Scientific Machine Learning

Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan, Alan Edelman

Category: Machine Learning | Machine Learning (Statistics)

2020-01-13

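In the context of science, the well-known adage "a picture is worth a thousand words" might well be "a model is worth a thousand datasets." In this manuscript we introduce the SciML software ecosystem as a tool for mixing the information of physical laws and scientific models with data-driven machine learning approaches. We describe a mathematical object, which we denote universal differential equations (UDEs), as the unifying framework connecting the ecosystem. We show how a wide variety of applications, from automatically discovering biological mechanisms to solving high-dimensional Hamilton-Jacobi-Bellman equations, can be phrased and efficiently handled through the UDE formalism and its tooling. We demonstrate the generality of the software tooling to handle stochasticity, delays, and implicit constraints. This funnels the wide variety of SciML applications into a core set of training mechanisms which are highly optimized, stabilized for stiff equations, and compatible with distributed parallelism and GPU accelerators.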


Baihe: SysML Framework for AI-driven Databases

Andreas Pfadler, Rong Zhu, Wei Chen, Botong Huang, Tianjing Zeng, Bolin Ding, Jingren Zhou

Category: Artificial Intelligence

2021-12-29

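We present Baihe, a SysML framework for AI-driven databases. Using Baihe, an existing relational database system may be retrofitted to use learned components for query optimization or other common tasks, such as learned index structures. To ensure the practicality and real-world applicability of Baihe, its high-level architecture is based on the following requirements: separation from the core system, minimal third-party dependencies, robustness, stability and fault tolerance, as well as configurability. Based on the high-level architecture, we then describe a concrete implementation of Baihe for PostgreSQL and present example use cases for learned query optimizers. To serve practitioners as well as researchers in the DB and AI4DB communities, Baihe for PostgreSQL will be released under an open-source license.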


j-Wave: An open-source differentiable wave simulator

Antonio Stanziola, Simon R. Arridge, Ben T. Cox, Bradley E. Treeby

Category: Machine Learning

2022-06-30

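We present an open-source differentiable acoustic simulator, j-Wave, which can solve time-varying and time-harmonic acoustic problems. It supports automatic differentiation, a program transformation technique that has many applications, especially in machine learning and scientific computing. j-Wave is composed of modular components that can be easily customized and reused. At the same time, it is compatible with some of the most popular machine learning libraries, such as JAX and TensorFlow. The accuracy of the simulation results for known configurations is evaluated against the widely used k-Wave toolbox and a cohort of acoustic simulation software. j-Wave is available from https://github.com/ucl-bug/jwave.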


NeuralFMU: Presenting a workflow for integrating hybrid NeuralODEs into real world applications

Tobias Thummerer, Johannes Stoljar, Lars Mikelsons

Category: Machine Learning

2022-09-08

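The term NeuralODE describes the structural combination of an artificial neural network (ANN) and a numerical solver for ordinary differential equations (ODEs), where the former acts as the right-hand side of the ODE to be solved. This concept was further extended by a black-box model in the form of a Functional Mock-up Unit (FMU) to obtain a subclass of NeuralODEs, named NeuralFMUs. The resulting structure features the advantages of first-principle and data-driven modeling approaches in a single simulation model: a higher prediction accuracy compared to conventional first-principle models (FPMs), and a lower training effort compared to purely data-driven models. We present an intuitive workflow to set up and use NeuralFMUs, enabling the encapsulation and reuse of existing conventional models exported from common modeling tools. Moreover, we exemplify this concept by deploying a NeuralFMU for a simulation based on a vehicle longitudinal dynamics model (VLDM), which is a typical use case in the automotive industry. Related challenges that are often neglected in scientific use cases, such as real measurements (e.g. noise), an unknown system state, or high-frequency discontinuities, are handled in this contribution. With the aim of building a hybrid model with a higher prediction quality than the original FPM, we briefly highlight two open-source libraries: FMI.jl, for integrating FMUs into the Julia programming environment, and an extension to this library called FMIFlux.jl, which allows FMUs to be integrated into a neural network topology to finally obtain a NeuralFMU.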
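The NeuralODE structure described in this abstract (a neural network used as the right-hand side of an ODE, combined with a numerical solver) can be sketched in a few lines. This is a minimal illustration only: it is written in Python rather than with FMI.jl or FMIFlux.jl, it uses an explicit Euler solver as a stand-in for a production solver, and the names `mlp_rhs`, `euler_solve`, and the untrained weights are hypothetical.

```python
import math

def mlp_rhs(x, params):
    """One-hidden-layer network acting as the ODE right-hand side dx/dt = f(x)."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def euler_solve(rhs, x0, t_end, n_steps, params):
    """Explicit Euler integration of dx/dt = rhs(x, params) from t = 0 to t_end."""
    dt = t_end / n_steps
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * rhs(xs[-1], params))
    return xs

# Hypothetical, untrained weights: two hidden units, scalar state.
params = ([0.5, -0.3], [0.0, 0.1], [1.0, -1.0], 0.0)
trajectory = euler_solve(mlp_rhs, x0=1.0, t_end=1.0, n_steps=100, params=params)
```

Training such a model would mean adjusting `params` so that `trajectory` matches measured data; in a NeuralFMU, part of the right-hand side would additionally come from an exported FMU rather than from the network alone.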


NeuralUQ: A comprehensive library for uncertainty quantification in neural differential equations and operators

Zongren Zou, Xuhui Meng, Apostolos F Psaros, George Em Karniadakis

Category: Machine Learning

2022-08-25

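Uncertainty quantification (UQ) in machine learning is currently drawing increasing research interest, driven by the rapid deployment of deep neural networks across different fields, such as computer vision and natural language processing, and by the need for reliable tools in risk-sensitive applications. Recently, various machine learning models have also been developed to tackle problems in the field of scientific computing, with applications to computational science and engineering (CSE). Physics-informed neural networks and deep operator networks are two such models, used for solving partial differential equations and learning operator mappings, respectively. In this regard, a comprehensive study of UQ methods tailored specifically for scientific machine learning (SciML) models has been provided in [45]. Nevertheless, despite their theoretical merits, implementing these methods is not straightforward, especially in large-scale CSE applications, which hinders their broad adoption in both research and industry settings. In this paper, we present an open-source Python library (https://github.com/crunch-uq4mi), termed NeuralUQ and accompanied by an educational tutorial, for employing UQ methods for SciML in a convenient and structured manner. The library, designed for both educational and research purposes, supports multiple modern UQ methods and SciML models. It is based on a succinct workflow and facilitates flexible employment and easy extension by users. We first present a tutorial of NeuralUQ and subsequently demonstrate its applicability and efficiency in four diverse examples, involving dynamical systems and high-dimensional parametric and time-dependent PDEs.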


Automatic differentiation in machine learning: a survey

Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, Jeffrey Mark Siskind

Category:

2015-02-20

Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.
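One of the main AD techniques, forward-mode differentiation, can be illustrated with dual numbers: each value carries its derivative along with it, and every arithmetic operation updates both. The sketch below is a minimal, assumption-laden illustration (the `Dual` class, `derivative` helper, and test function are invented here) and supports only addition, multiplication, and sine.

```python
import math
from dataclasses import dataclass

@dataclass
class Dual:
    """A value together with its derivative with respect to the input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))

def sin(x):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).dot

def f(x):
    return x * x * 3 + sin(x)   # f(x) = 3x^2 + sin(x)

slope = derivative(f, 2.0)      # analytically 6*2 + cos(2)
```

Unlike symbolic differentiation, no expression for f' is ever built, and unlike finite differences, there is no step-size truncation error: the derivative is propagated exactly (up to floating-point rounding) through the program as it runs.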


A Probabilistic State Space Model for Joint Inference from Differential Equations and Data

Jonathan Schmidt, Nicholas Krämer, Philipp Hennig

Category: Machine Learning (Statistics) | Machine Learning

2021-03-18

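Mechanistic models with differential equations are a key component of scientific applications of machine learning. Inference in such models is usually computationally demanding, because it involves repeatedly solving the differential equation. The main problem here is that the numerical solver is hard to combine with standard inference techniques. Recent work in probabilistic numerics has developed a new class of solvers for ordinary differential equations (ODEs) that phrase the solution process directly in terms of Bayesian filtering. We show here that this allows such methods to be combined very directly, with conceptual and numerical ease, with latent force models in the ODE itself. It then becomes possible to perform approximate Bayesian inference on the latent force as well as the ODE solution in a single, linear-complexity pass of an extended Kalman filter / smoother, that is, at the cost of computing a single ODE solution. We demonstrate the expressiveness and performance of the algorithm by training, among others, a non-parametric SIRD model.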


Physics-informed machine learning

Category:

Despite great progress in simulating multiphysics problems using the numerical discretization of partial differential equations (PDEs), one still cannot seamlessly incorporate noisy data into existing algorithms, mesh generation remains complex, and high-dimensional problems governed by parameterized PDEs cannot be tackled. Moreover, solving inverse problems with hidden physics is often prohibitively expensive and requires different formulations and elaborate computer codes. Machine learning has emerged as a promising alternative, but training deep neural networks requires big data, not always available for scientific problems. Instead, such networks can be trained from additional information obtained by enforcing the physical laws (for example, at random points in the continuous space-time domain). Such physics-informed learning integrates (noisy) data and mathematical models, and implements them through neural networks or other kernel-based regression networks. Moreover, it may be possible to design specialized network architectures that automatically satisfy some of the physical invariants for better accuracy, faster training and improved generalization. Here, we review some of the prevailing trends in embedding physics into machine learning, present some of the current capabilities and limitations and discuss diverse applications of physics-informed learning both for forward and inverse problems, including discovering hidden physics and tackling high-dimensional problems.


Physics-Informed Gaussian Process Regression Generalizes Linear PDE Solvers

Marvin Pförtner, Ingo Steinwart, Philipp Hennig, Jonathan Wenger

Category: Machine Learning | Machine Learning (Statistics)

2022-12-23

Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for initialization. Such solvers are often embedded into larger scientific models or analyses with a downstream application such that error quantification plays a key role. However, by entirely ignoring parameter and measurement uncertainty, classical PDE solvers may fail to produce consistent estimates of their inherent approximation error. In this work, we approach this problem in a principled fashion by interpreting solving linear PDEs as physics-informed Gaussian process (GP) regression. Our framework is based on a key generalization of a widely-applied theorem for conditioning GPs on a finite number of direct observations to observations made via an arbitrary bounded linear operator. Crucially, this probabilistic viewpoint makes it possible to (1) quantify the inherent discretization error; (2) propagate uncertainty about the model parameters to the solution; and (3) condition on noisy measurements. Demonstrating the strength of this formulation, we prove that it strictly generalizes methods of weighted residuals, a central class of PDE solvers including collocation, finite volume, pseudospectral, and (generalized) Galerkin methods such as finite element and spectral methods. This class can thus be directly equipped with a structured error estimate and the capability to incorporate uncertain model parameters and observations. In summary, our results enable the seamless integration of mechanistic models as modular building blocks into probabilistic models.


Data-Centric Engineering: integrating simulation, machine learning and statistics. Challenges and Opportunities

Indranil Pan, Lachlan Mason, Omar Matar

Category: Machine Learning

2021-11-07

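Recent advances in machine learning, coupled with low-cost computation and the availability of cheap streaming sensors, data storage, and cloud technologies, have led to widespread multi-disciplinary research activity, with significant interest and investment from commercial stakeholders. Mechanistic models based on physical equations and purely data-driven statistical approaches represent two ends of the modelling spectrum. New hybrid, data-centric engineering approaches, which leverage the best of both worlds and integrate simulations and data, are emerging as a powerful tool with a transformative impact on the physical disciplines. We review the key research trends and application scenarios in the emerging field of integrating simulations, machine learning, and statistics. We highlight the opportunities that such an integrated vision can unlock and outline the key challenges holding back its realisation. We also discuss the bottlenecks in the translational aspects of the field and the long-term upskilling requirements of the existing workforce and future university graduates.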


Reactive Message Passing for Scalable Bayesian Inference

Dmitry Bagaev, Bert de Vries

Category: Machine Learning | Artificial Intelligence

2021-12-25

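We introduce Reactive Message Passing (RMP) as a framework for executing schedule-free, robust, and scalable message passing-based inference in a factor graph representation of a probabilistic model. RMP is based on the reactive programming style, which only describes how nodes in a factor graph react to changes in connected nodes. The absence of a fixed message passing schedule improves the robustness, scalability, and execution time of the inference procedure. We also present ReactiveMP.jl, a Julia package for realizing RMP through minimization of a constrained Bethe free energy. Through user-defined specification of local form and factorization constraints on the variational posterior distribution, ReactiveMP.jl executes hybrid message passing algorithms including belief propagation, variational message passing, expectation propagation, and expectation maximization update rules. Experimental results demonstrate the improved performance of ReactiveMP-based RMP in comparison to other Julia packages for Bayesian inference across a range of probabilistic models. In particular, we show that the RMP framework is capable of performing Bayesian inference for large-scale probabilistic state space models with hundreds of thousands of random variables on a standard laptop computer.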


API design for machine learning software: experiences from the scikit-learn project

Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler

Category:

2013-09-01

scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
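The idea of one interface shared by all learning and processing units, and the composition it enables, can be sketched without importing scikit-learn at all. The toy classes below (`MeanCenterer`, `MajorityClassifier`, `TinyPipeline`) are hypothetical and exist only for this illustration; they follow the `fit` / `transform` / `predict` naming convention, with `fit` returning `self` and learned state stored in attributes ending in an underscore.

```python
class MeanCenterer:
    """Toy transformer: subtracts per-column means learned in fit()."""
    def fit(self, X, y=None):
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self                      # returning self allows chaining

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]

class MajorityClassifier:
    """Toy predictor: always predicts the most frequent training label."""
    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]

class TinyPipeline:
    """Composes one transformer and one predictor behind the same interface."""
    def __init__(self, transformer, predictor):
        self.transformer = transformer
        self.predictor = predictor

    def fit(self, X, y):
        Xt = self.transformer.fit(X).transform(X)
        self.predictor.fit(Xt, y)
        return self

    def predict(self, X):
        return self.predictor.predict(self.transformer.transform(X))

X_train = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y_train = ["a", "b", "a"]
model = TinyPipeline(MeanCenterer(), MajorityClassifier()).fit(X_train, y_train)
predictions = model.predict([[0.0, 0.0], [9.0, 9.0]])
```

Because the pipeline itself exposes `fit` and `predict`, it can be used anywhere a single predictor could be, which is the composition and reusability benefit the abstract refers to.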


Scientific Machine Learning through Physics-Informed Neural Networks: Where we are and What's next

Salvatore Cuomo, Vincenzo Schiano di Cola, Fabio Giampaolo, Gianluigi Rozza, Maziar Raissi, Francesco Piccialli

Category: Machine Learning | Artificial Intelligence

2022-01-14

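Physics-informed neural networks (PINNs) are neural networks (NNs) that encode model equations, such as partial differential equations (PDEs), as a component of the neural network itself. PINNs are nowadays used to solve PDEs, fractional equations, integral-differential equations, and stochastic PDEs. This novel methodology has arisen as a multi-task learning framework in which a NN must fit observed data while reducing a PDE residual. This article provides a comprehensive review of the literature on PINNs; the primary goal of the study is to characterize these networks and their related advantages and disadvantages. The review also attempts to incorporate publications on a broader range of collocation-based physics-informed neural networks, starting from the vanilla PINN and covering many other variants, such as physics-constrained neural networks (PCNN), variational hp-VPINN, and conservative PINN (CPINN). The study indicates that most research has focused on customizing the PINN through different activation functions, gradient optimization techniques, neural network structures, and loss function structures. Despite the wide range of applications for which PINNs have been used, and their demonstrated ability to be more feasible in some contexts than classical numerical techniques such as the finite element method (FEM), advancements are still possible, most notably on theoretical issues that remain unresolved.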


Flashlight: Enabling Innovation in Tools for Machine Learning

Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov

Category: Machine Learning | Artificial Intelligence

2022-01-29

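As the computational requirements of machine learning systems and the size and complexity of machine learning frameworks increase, essential framework innovation has become challenging. While computational needs have driven recent compiler, networking, and hardware advancements, utilization of those advancements by machine learning tools is occurring at a slower pace. This is in part due to the difficulties involved in prototyping new computational paradigms with existing frameworks. Large frameworks prioritize machine learning researchers and practitioners as end users and pay comparatively little attention to systems researchers who can push frameworks forward; we argue that both are equally important stakeholders. We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems by prioritizing open, modular, customizable internals and state-of-the-art, research-ready models and training setups. Flashlight allows systems researchers to rapidly prototype and experiment with novel ideas in machine learning computation and has low overhead, competing with and often outperforming other popular machine learning frameworks. We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together. Flashlight is available at https://github.com/flashlight/flashlight.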


IOHanalyzer: Detailed Performance Analyses for Iterative Optimization Heuristics

Hao Wang, Diederick Vermetten, Furong Ye, Carola Doerr, Thomas Bäck

Category: Neural and Evolutionary Computing

2020-07-08

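Benchmarking and performance analysis play an important role in understanding the behaviour of iterative optimization heuristics (IOHs) such as local search algorithms, genetic and evolutionary algorithms, Bayesian optimization algorithms, etc. This task, however, involves manual setup, execution, and analysis of experiments on an individual basis, which is laborious and can be mitigated by a generic and well-designed platform. For this purpose, we propose IOHanalyzer, a new user-friendly tool for the analysis, comparison, and visualization of performance data of IOHs. Implemented in R and C++, IOHanalyzer is fully open source. It is available on CRAN and GitHub. IOHanalyzer provides detailed statistics about fixed-target running times and about fixed-budget performance of the benchmarked algorithms on real-valued, single-objective optimization tasks. Performance aggregation over several benchmark problems is possible, for example in the form of empirical cumulative distribution functions. Key advantages of IOHanalyzer over other performance analysis packages are its highly interactive design, which allows users to specify the performance measures, ranges, and granularity that are most useful for their experiments, and the ability to analyze not only performance traces but also the evolution of dynamic state parameters. IOHanalyzer can directly process performance data from the main benchmarking platforms, including the COCO platform, Nevergrad, the SOS platform, and IOHexperimenter. An R programming interface is provided for users who prefer finer control over the implemented functionalities.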
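To illustrate the kind of statistic mentioned here, the sketch below computes an empirical cumulative distribution function (ECDF) of fixed-target running times: for each evaluation budget, the fraction of runs that reached the target within that budget. It is written in plain Python rather than against IOHanalyzer's R interface, and the function name `ecdf` and the run data are made up for this example.

```python
import math

def ecdf(running_times, budgets):
    """Fraction of runs that hit the target within each budget.

    running_times : evaluations each run needed to reach the target;
                    math.inf marks a run that never reached it
    budgets       : evaluation budgets at which to evaluate the ECDF
    """
    n = len(running_times)
    return [sum(rt <= b for rt in running_times) / n for b in budgets]

# Hypothetical data: four runs of one algorithm on one problem and target.
runs = [10, 20, 30, math.inf]
budgets = [5, 10, 25, 100]
success_fractions = ecdf(runs, budgets)
```

Aggregating over several problems or targets amounts to pooling the running times from all (problem, target) pairs before calling the same function, which yields a single curve summarizing performance across the benchmark.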


Open Geometry Prover Community Project

Nuno Baeta, Pedro Quaresma

Category: Artificial Intelligence

2022-01-03

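Mathematical proof is undoubtedly the cornerstone of mathematics. The emergence in recent years of computing and reasoning tools, in particular automated geometry theorem provers, has enriched our experience with mathematics. To avoid disparate efforts, the Open Geometry Prover Community Project aims to integrate the different efforts in the development of automated theorem provers for geometry under a common "umbrella". In this article, the steps necessary for such an integration are specified, and the current implementation of some of those steps is described.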
