Smart Paper Notes

Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

Mizu Nishikawa-Toomey , Tristan Deleu , Jithendaraa Subramanian , Yoshua Bengio , Laurent Charlin

Categories: Machine Learning | Machine Learning (Statistics)

2022-11-04

Bayesian causal structure learning aims to learn a posterior distribution over directed acyclic graphs (DAGs), and the mechanisms that define the relationship between parent and child variables. By taking a Bayesian approach, it is possible to reason about the uncertainty of the causal model. The notion of modelling the uncertainty over models is particularly crucial for causal structure learning since the model could be unidentifiable when given only a finite amount of observational data. In this paper, we introduce a novel method to jointly learn the structure and mechanisms of the causal model using Variational Bayes, which we call Variational Bayes-DAG-GFlowNet (VBG). We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model. Our results on simulated data suggest that VBG is competitive against several baselines in modelling the posterior over DAGs and mechanisms, while offering several advantages over existing methods, including the guarantee to sample acyclic graphs, and the flexibility to generalize to non-linear causal mechanisms.
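For readers unfamiliar with the linear-Gaussian mechanisms mentioned above, the following is a minimal sketch of how one sample is generated from such a model once a DAG and its edge weights are fixed. It is not the VBG implementation: the function name `sample_linear_gaussian`, the `weights` layout, and the shared `noise_std` are assumptions made only for illustration.

```python
import random


def sample_linear_gaussian(weights, order, noise_std=1.0, rng=None):
    """Draw one joint sample from a linear-Gaussian structural model.

    weights[child][parent] holds the (hypothetical) edge coefficient, and
    `order` must be a topological order of the DAG so that every parent is
    sampled before its children.
    """
    rng = rng or random.Random()
    values = {}
    for node in order:
        parents = weights.get(node, {})
        # Each variable is a weighted sum of its parents plus Gaussian noise.
        mean = sum(w * values[parent] for parent, w in parents.items())
        values[node] = mean + rng.gauss(0.0, noise_std)
    return values


# Chain a -> b -> c with illustrative coefficients.
example_weights = {"b": {"a": 2.0}, "c": {"b": -0.5}}
sample = sample_linear_gaussian(example_weights, ["a", "b", "c"], rng=random.Random(0))
```

A Bayesian treatment such as the one described in the abstract would place a distribution over both the graph and the coefficients rather than fixing them as done here.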

Bayesian Structure Learning with Generative Flow Networks

Tristan Deleu , António Góis , Chris Emezue , Mansi Rankawat , Simon Lacoste-Julien , Stefan Bauer , Yoshua Bengio

Categories: Machine Learning | Machine Learning (Statistics)

2022-02-28

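In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks from data. Defining such a distribution is very challenging because of the combinatorially large sample space, and approximations based on MCMC are often required. Recently, a novel class of probabilistic models, called Generative Flow Networks (GFlowNets), has been introduced as a general framework for generative modeling of discrete and composite objects, such as graphs. In this work, we propose to use a GFlowNet as an alternative to MCMC for approximating the posterior distribution over the structure of Bayesian networks, given a dataset of observations. Generating a sample DAG from this approximate distribution is viewed as a sequential decision problem, where the graph is constructed one edge at a time, based on learned transition probabilities. Through evaluation on both simulated and real data, we show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs, and that it compares favorably against other methods based on MCMC or variational inference.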
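The edge-by-edge construction described above can be sketched as follows. This is only an illustration of the sequential process and of how cycle-closing edges are excluded at every step; the uniform choice among valid edges and the fixed `stop_prob` are placeholder assumptions standing in for the learned transition probabilities, and none of the names come from the paper's code.

```python
import random


def creates_cycle(adj, src, dst):
    """Return True if adding src -> dst would close a directed cycle,
    i.e. if src is already reachable from dst."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return False


def valid_edges(adj):
    """List every edge that is absent and can be added without creating a cycle."""
    n = len(adj)
    return [(u, v) for u in range(n) for v in range(n)
            if u != v and v not in adj[u] and not creates_cycle(adj, u, v)]


def sample_dag(num_nodes, stop_prob=0.2, rng=None):
    """Build a DAG one edge at a time, starting from the empty graph.

    At each step the process either stops or adds one valid edge. A trained
    model would supply these probabilities; here they are uniform placeholders.
    """
    rng = rng or random.Random()
    adj = [set() for _ in range(num_nodes)]  # adj[u] = children of u
    while True:
        candidates = valid_edges(adj)
        if not candidates or rng.random() < stop_prob:
            return adj
        u, v = rng.choice(candidates)
        adj[u].add(v)


graph = sample_dag(5, rng=random.Random(0))
```

Because invalid actions are masked out before each choice, every returned graph is acyclic by construction, whatever the transition probabilities are.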

BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery

Chris Cundy , Aditya Grover , Stefano Ermon

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-12-06

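A structural equation model (SEM) is an effective framework for reasoning about causal relationships represented by a directed acyclic graph (DAG). Recent advances have enabled effective maximum-likelihood point estimation of DAGs from observational data. However, a point estimate may not accurately capture the uncertainty in inferring the underlying graph in practical scenarios where the true DAG is non-identifiable and/or the observed dataset is limited. We propose Bayesian Causal Discovery Nets (BCD Nets), a variational inference framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM. Developing a full Bayesian posterior over DAGs is challenging because of the discrete and combinatorial nature of graphs. We analyse key design choices for scalable variational inference (VI) over DAGs, such as 1) the parametrization of DAGs via an expressive variational family, 2) a continuous relaxation that enables low-variance stochastic optimization, and 3) suitable priors over the latent variables. We provide a series of experiments on real and synthetic data showing that BCD Nets outperform maximum-likelihood methods on standard causal discovery metrics, such as structural Hamming distance, in low-data regimes.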
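The structural Hamming distance named above counts how many node pairs differ between two graphs. The sketch below assumes one common convention, in which a missing, an extra, and a reversed edge each cost 1 (some evaluations count a reversal as 2); the function name and the edge-set representation are illustrative choices, not taken from the paper.

```python
def structural_hamming_distance(true_edges, pred_edges):
    """Count node pairs whose edge status differs between two directed graphs.

    Both arguments are iterables of (parent, child) pairs without self-loops.
    Assumed convention: missing, extra, and reversed edges each add 1.
    """
    true_edges, pred_edges = set(true_edges), set(pred_edges)
    # Compare the graphs pair by pair so a reversed edge is counted once.
    pairs = {frozenset(edge) for edge in true_edges | pred_edges}
    distance = 0
    for pair in pairs:
        u, v = tuple(pair)
        true_state = ((u, v) in true_edges, (v, u) in true_edges)
        pred_state = ((u, v) in pred_edges, (v, u) in pred_edges)
        if true_state != pred_state:
            distance += 1
    return distance


# One reversed edge (0, 1) and one extra edge (0, 2): distance 2.
shd = structural_hamming_distance({(0, 1), (1, 2)}, {(1, 0), (1, 2), (0, 2)})
```

When evaluating a posterior over graphs rather than a point estimate, this distance is typically averaged over sampled graphs.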

Tractable Uncertainty for Structure Learning

Benjie Wang , Matthew Wicker , Marta Kwiatkowska

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-04-29

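Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior approximations, our representation can capture a much richer space of DAGs, while also being able to reason tractably about the uncertainty through a range of useful inference queries. We empirically show how probabilistic circuits can be used as an augmented representation for structure learning methods, leading to improvements in both the quality of inferred structures and posterior uncertainty. Experimental results on conditional query answering further demonstrate the practical utility of the representational capacity of TRUST.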

Latent Variable Models for Bayesian Causal Discovery

Jithendaraa Subramanian , Yashas Annadani , Ivaxi Sheth , Stefan Bauer , Derek Nowrouzezahrai , Samira Ebrahimi Kahou

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-07-12

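Learning predictors that do not rely on spurious correlations involves building causal representations. However, learning such representations is very challenging. We therefore formulate the problem of learning a causal representation from high-dimensional data and study causal recovery with synthetic data. This work introduces a latent variable decoder model, Decoder BCD, for Bayesian causal discovery and performs experiments in mildly supervised and unsupervised settings. We present a series of synthetic experiments to characterize important factors for causal discovery, and show that using known intervention targets as labels helps in unsupervised Bayesian inference over the structure and parameters of linear Gaussian additive noise latent structural causal models.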

DiBS: Differentiable Bayesian Structure Learning

Lars Lorch , Jonas Rothfuss , Bernhard Schölkopf , Andreas Krause

Categories: Machine Learning | Machine Learning (Statistics)

2021-05-25

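Bayesian structure learning allows inferring Bayesian network structure from data while reasoning about the epistemic uncertainty, a key element towards enabling active causal discovery and designing interventions in real-world systems. In this work, we propose a general, fully differentiable framework for Bayesian structure learning (DiBS) that operates in the continuous space of a latent probabilistic graph representation. Contrary to existing work, DiBS is agnostic to the form of the local conditional distributions and allows for joint posterior inference of both the graph structure and the conditional distribution parameters. This makes our formulation directly applicable to posterior inference of complex Bayesian network models, e.g., with nonlinear dependencies encoded by neural networks. Using DiBS, we devise an efficient, general-purpose variational inference method for approximating distributions over structural models. In evaluations on simulated and real-world data, our method significantly outperforms related approaches to joint posterior inference.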

MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

Erdun Gao , Ignavier Ng , Mingming Gong , Li Shen , Wei Huang , Tongliang Liu , Kun Zhang , Howard Bondell

Categories: Machine Learning | Machine Learning (Statistics)

2022-05-27

State-of-the-art causal discovery methods usually assume that the observational data is complete. However, the missing data problem is pervasive in many practical scenarios such as clinical trials, economics, and biology. One straightforward way to address the missing data problem is first to impute the data using off-the-shelf imputation methods and then apply existing causal discovery methods. However, such a two-step method may suffer from suboptimality, as the imputation algorithm may introduce bias for modeling the underlying data distribution. In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations. Focusing mainly on the assumptions of ignorable missingness and the identifiable additive noise models (ANMs), MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization (EM) framework. In the E-step, in cases where computing the posterior distributions of parameters in closed-form is not feasible, Monte Carlo EM is leveraged to approximate the likelihood. In the M-step, MissDAG leverages the density transformation to model the noise distributions with simpler and specific formulations by virtue of the ANMs and uses a likelihood-based causal discovery algorithm with directed acyclic graph constraint. We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.

Deep End-to-end Causal Inference

Tomas Geffner , Javier Antoran , Adam Foster , Wenbo Gong , Chao Ma , Emre Kiciman , Amit Sharma , Angus Lamb , Martin Kukla , Nick Pawlowski

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-04

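Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment, and policy making. However, research on causal discovery has evolved separately from inference methods, preventing a straightforward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based non-linear additive noise model that takes in observational data and can perform both causal discovery and inference, including conditional average treatment effect (CATE) estimation. We provide a theoretical guarantee that DECI can recover the ground-truth causal graph under standard causal discovery assumptions. Motivated by application impact, we extend this model to heterogeneous, mixed-type data with missing values, allowing for both continuous and discrete treatment decisions. Our results show the competitive performance of DECI when compared to relevant baselines for both causal discovery and (C)ATE estimation in over a thousand experiments on both synthetic datasets and causal machine learning benchmarks, across data types and levels of missingness.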

Interventions, Where and How? Experimental Design for Causal Models at Scale

Panagiotis Tigas , Yashas Annadani , Andrew Jesson , Bernhard Schölkopf , Yarin Gal , Stefan Bauer

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-03-03

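Causal discovery from observational and interventional data is challenging because of limited data and non-identifiability: factors that introduce uncertainty in estimating the underlying structural causal model (SCM). Selecting experiments (interventions) based on the uncertainty arising from both factors can expedite the identification of the SCM. Existing methods in experimental design for causal discovery from limited data either rely on linear assumptions for the SCM or select only the intervention target. This work incorporates recent advances in Bayesian causal discovery into the Bayesian optimal experimental design framework, allowing for active causal discovery of large, nonlinear SCMs while selecting both the interventional target and the value. We demonstrate the performance of the proposed method on synthetic graphs (Erdős-Rényi, scale-free) for both linear and nonlinear SCMs, as well as on an in silico single-cell gene regulatory network dataset.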

Causal Structure Learning: a Combinatorial Perspective

Chandler Squires , Caroline Uhler

Categories: Machine Learning

2022-06-02

In this review, we discuss approaches for learning causal structure from data, also called causal discovery. In particular, we focus on approaches for learning directed acyclic graphs (DAGs) and various generalizations which allow for some variables to be unobserved in the available data. We devote special attention to two fundamental combinatorial aspects of causal structure learning. First, we discuss the structure of the search space over causal graphs. Second, we discuss the structure of equivalence classes over causal graphs, i.e., sets of graphs which represent what can be learned from observational data alone, and how these equivalence classes can be refined by adding interventional data.

Active Learning for Optimal Intervention Design in Causal Models

Jiaqi Zhang , Louis Cammarata , Chandler Squires , Themistoklis P. Sapsis , Caroline Uhler

Categories: Machine Learning

2022-09-10

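An important problem across disciplines is the discovery of interventions that produce a desired outcome. When the space of possible interventions is large, making an exhaustive search infeasible, experimental design strategies are needed. In this context, encoding the causal relationships between the variables, and thus the effect of interventions on the system, is critical for identifying desirable interventions efficiently. We develop an iterative causal method to identify optimal interventions, as measured by the discrepancy between the post-interventional mean of the distribution and a desired target mean. We formulate an active learning strategy that uses the samples obtained so far from different interventions to update the belief about the underlying causal model, and to identify the samples that are most informative about optimal interventions and thus should be acquired in the next batch. The approach employs a Bayesian update for the causal model and prioritizes interventions using a carefully designed, causally informed acquisition function. This acquisition function is evaluated in closed form, allowing for efficient optimization. The resulting algorithms are theoretically grounded, with information-theoretic bounds and provable consistency results. We illustrate the method on both synthetic data and real-world biological data, namely gene expression data from Perturb-CITE-seq experiments, to identify optimal perturbations that induce a specific cell state transition; the proposed causal approach is observed to achieve better sample efficiency than several baselines. In both cases we observe that the causally informed acquisition function notably outperforms existing criteria, allowing for optimal intervention design with significantly fewer experiments.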

Causal Entropy Optimization

Nicola Branchini , Virginia Aglietti , Neil Dhir , Theodoros Damoulas

Categories: Machine Learning

2022-08-23

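We study the problem of globally optimizing the causal effect on a target variable in a setting where interventions can be performed. This problem arises in many areas of science, including biology, operations research, and healthcare. We propose Causal Entropy Optimization (CEO), a framework that generalizes Causal Bayesian Optimization (CBO) to account for all sources of uncertainty, including the one arising from the causal graph structure. CEO incorporates the causal structure uncertainty both in the surrogate models for the causal effects and in the mechanism used to select interventions via an information-theoretic acquisition function. The resulting algorithm automatically trades off structure learning and causal effect optimization, while naturally accounting for observation noise. For various synthetic and real-world structural causal models, CEO achieves faster convergence to the global optimum than CBO while also learning the graph. Furthermore, our joint approach to structure learning and causal optimization improves upon sequential, structure-learning-first approaches.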

GFlowNet Foundations

Yoshua Bengio , Tristan Deleu , Edward J. Hu , Salem Lahlou , Mo Tiwari , Emmanuel Bengio

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2021-11-17

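Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function. In this paper, we show a number of additional theoretical properties of GFlowNets. They can be used to estimate joint probability distributions and the corresponding marginal distributions where some variables are unspecified and, of particular interest, can represent distributions over composite objects like sets and graphs. GFlowNets amortize the work typically done by computationally expensive MCMC methods in a single but trained generative pass. They could also be used to estimate partition functions and free energies, conditional probabilities of supersets (supergraphs) given a subset (subgraph), as well as marginal distributions over all supersets (supergraphs) of a given set (graph). We introduce variations enabling the estimation of entropy and mutual information, sampling from a Pareto frontier, connections to reward-maximizing policies, and extensions to stochastic environments, continuous actions, and modular energy functions.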

Large-Scale Differentiable Causal Discovery of Factor Graphs

Romain Lopez , Jan-Christian Hütter , Jonathan K. Pritchard , Aviv Regev

Categories: Machine Learning (Statistics) | Machine Learning

2022-06-15

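A common theme in causal inference is learning causal relationships between observed variables, also known as causal discovery. This is usually a daunting task, given the large number of candidate causal graphs and the combinatorial nature of the search space. Perhaps for this reason, most research has so far focused on relatively small causal graphs, with up to hundreds of nodes. However, recent advances in fields like biology enable generating experimental data sets with thousands of interventions followed by rich profiling of thousands of variables, raising the opportunity and urgent need for large causal graph models. Here, we introduce the notion of factor directed acyclic graphs (f-DAGs) as a way to restrict the search space to non-linear low-rank causal interaction models. Combining this novel structural assumption with recent advances that bridge the gap between causal discovery and continuous optimization, we achieve causal discovery on thousands of variables. Additionally, as a model for the impact of statistical noise on this estimation procedure, we study a model of edge perturbations of the f-DAG skeleton based on random graphs and quantify the effect of such perturbations on the f-DAG rank. This theoretical analysis suggests that the set of candidate f-DAGs is much smaller than the whole DAG space, and that it is therefore more statistically robust in the high-dimensional regime where the underlying skeleton is hard to assess. We propose Differentiable Causal Discovery of Factor Graphs (DCD-FG), a scalable implementation of f-DAG constrained causal discovery for high-dimensional interventional data. DCD-FG uses a Gaussian non-linear low-rank structural equation model and shows significant improvements over state-of-the-art methods both in simulations and on a recent large-scale single-cell RNA sequencing data set with genetic interventions.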

Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions

Juan L. Gamella , Armeen Taeb , Christina Heinze-Deml , Peter Bühlmann

Categories: Machine Learning (Statistics)

2022-11-27

We consider the problem of recovering the causal structure underlying observations from different experimental conditions when the targets of the interventions in each experiment are unknown. We assume a linear structural causal model with additive Gaussian noise and consider interventions that perturb their targets while maintaining the causal relationships in the system. Different models may entail the same distributions, offering competing causal explanations for the given observations. We fully characterize this equivalence class and offer identifiability results, which we use to derive a greedy algorithm called GnIES to recover the equivalence class of the data-generating model without knowledge of the intervention targets. In addition, we develop a novel procedure to generate semi-synthetic data sets with known causal ground truth but distributions closely resembling those of a real data set of choice. We leverage this procedure and evaluate the performance of GnIES on synthetic, real, and semi-synthetic data sets. Despite the strong Gaussian distributional assumption, GnIES is robust to an array of model violations and competitive in recovering the causal structure in small- to large-sample settings. We provide, in the Python packages "gnies" and "sempler", implementations of GnIES and our semi-synthetic data generation procedure.

Amortized Inference for Causal Structure Learning

Lars Lorch , Scott Sussex , Jonas Rothfuss , Andreas Krause , Bernhard Schölkopf

Categories: Machine Learning | Machine Learning (Statistics)

2022-05-25

Inferring causal structure poses a combinatorial search problem that typically involves evaluating structures with a score or independence test. The resulting search is costly, and designing suitable scores or tests that capture prior knowledge is difficult. In this work, we propose to amortize causal structure learning. Rather than searching over structures, we train a variational inference model to directly predict the causal structure from observational or interventional data. This allows our inference model to acquire domain-specific inductive biases for causal discovery solely from data generated by a simulator, bypassing both the hand-engineering of suitable score functions and the search over graphs. The architecture of our inference model emulates permutation invariances that are crucial for statistical efficiency in structure learning, which facilitates generalization to significantly larger problem instances than seen during training. On synthetic data and semisynthetic gene expression data, our models exhibit robust generalization capabilities when subject to substantial distribution shifts and significantly outperform existing algorithms, especially in the challenging genomics domain. Our code and models are publicly available at: https://github.com/larslorch/avici.

Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

Wenhao Ding , Haohong Lin , Bo Li , Ding Zhao

Categories: Machine Learning | Artificial Intelligence | Robotics

2022-07-19

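As a pivotal component in attaining generalizable solutions in human intelligence, reasoning offers great potential for helping reinforcement learning (RL) agents generalize to varied goals, by summarizing part-to-whole arguments and discovering cause-and-effect relations. However, how to discover and represent causal relations remains a large gap that hinders the development of causal RL. In this paper, we augment Goal-Conditioned RL (GCRL) with a Causal Graph (CG), a structure built upon the relations between objects and events. We formulate the GCRL problem, in a novel way, as variational likelihood maximization with the CG as a latent variable. To optimize the derived objective, we propose a framework with theoretical performance guarantees that alternates between two steps: using interventional data to estimate the posterior of the CG, and using the CG to learn generalizable models and interpretable policies. Because of the lack of public benchmarks that verify generalization capability under reasoning, we design nine tasks and then empirically show the effectiveness of the proposed method against five baselines on these tasks. Further theoretical analysis shows that our performance improvement is attributable to the virtuous cycle of causal discovery, transition modeling, and policy training, which aligns with the experimental evidence in extensive ablation studies.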

Near-Optimal Multi-Perturbation Experimental Design for Causal Structure Learning

Scott Sussex , Andreas Krause , Caroline Uhler

Categories: Machine Learning

2021-05-28

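Causal structure learning is a key problem in many domains. Causal structures can be learnt by performing experiments on the system of interest. We address the largely unexplored problem of designing a batch of experiments that each simultaneously intervene on multiple variables. While potentially more informative than the commonly considered single-variable interventions, selecting such interventions is algorithmically much more challenging, because of the doubly exponential combinatorial search space over sets of composite interventions. In this paper, we develop efficient algorithms for optimizing different objective functions that quantify the informativeness of a budget-constrained batch of experiments. By establishing novel submodularity properties of these objectives, we provide approximation guarantees for our algorithms. Empirically, our algorithms outperform both random interventions and algorithms that select only single-variable interventions.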

Variational Flow Graphical Model

Shaogang Ren , Belhal Karimi , Dingcheng Li , Ping Li

Categories: Machine Learning (Statistics) | Artificial Intelligence | Machine Learning

2022-07-06

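This paper introduces a novel approach to building flow-based models with hierarchical structures. The proposed framework is named the Variational Flow Graphical (VFG) model. VFGs learn the representation of high-dimensional data via a message-passing scheme by integrating flow-based functions through variational inference. By leveraging the expressive power of neural networks, VFGs produce a representation of the data using a lower dimension, thus overcoming a drawback of many flow-based models, which usually require a high-dimensional latent space involving many trivial variables. Aggregation nodes are introduced in the VFG model to integrate forward-backward hierarchical information via the message-passing scheme. Maximizing the evidence lower bound (ELBO) of the data likelihood aligns the forward and backward messages in each aggregation node, achieving a consistent node state. Algorithms have been developed to learn the model parameters through gradient updates on the ELBO objective. The consistency of the aggregation nodes makes VFGs applicable to tractable inference on graphical structures. Besides representation learning and numerical inference, VFGs provide a new approach to distribution modeling on datasets with graphical latent structures. Additionally, a theoretical study shows that VFGs are universal approximators by leveraging the implicitly invertible flow-based structures. With flexible graphical structures and strong expressive power, VFGs could potentially be used to improve probabilistic inference. In the experiments, VFGs achieve improved ELBO and likelihood values on multiple datasets.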

Efficient Sampling and Structure Learning of Bayesian Networks

Jack Kuipers , Polina Suter , Giusi Moffa

Categories: Machine Learning (Statistics) | Machine Learning

2018-03-21

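Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high-dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, which is encoded as a directed acyclic graph (DAG), is highly challenging, mainly because of the vast number of possible networks in combination with the acyclicity constraint. Efforts have focused on two fronts: constraint-based methods that perform conditional independence tests to exclude edges, and score-and-search approaches that explore the DAG space with greedy or MCMC schemes. Here we synthesise these two fields in a novel hybrid method that reduces the complexity of MCMC approaches to that of a constraint-based method. Individual steps in the MCMC scheme require only simple table lookups, so that very long chains can be obtained efficiently. Furthermore, the scheme includes an iterative procedure to correct for errors from the conditional independence tests. The algorithm offers markedly superior performance to alternatives, particularly because DAGs can also be sampled from the posterior distribution, enabling full Bayesian model averaging for much larger Bayesian networks.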
