Assessing treatment effects in observational studies is a multifaceted problem that involves not only heterogeneous mechanisms of how the treatment or cause is exposed to subjects, known as the propensity, but also differential causal effects across sub-populations. We introduce a concept termed the facilitating score to account for both the confounding and interacting impacts of covariates on the treatment effect. Several approaches for estimating the facilitating score are discussed. In particular, we put forward a machine learning method, called causal inference tree (CIT), that provides a piecewise constant approximation of the facilitating score. With interpretable rules, CIT splits data in such a way that both the propensity and the treatment effect become more homogeneous within each resultant partition. Causal inference at different levels can be made on the basis of CIT. Together with an aggregated grouping procedure, CIT stratifies the data into strata within which causal effects can be conveniently assessed. In addition, a feasible way of predicting individual causal effects (ICE) is made available by aggregating ensemble CIT models. Both the stratified results and the estimated ICE provide an assessment of the heterogeneity of causal effects and can be integrated for estimating the average causal effect (ACE). Mean square consistency of CIT is also established. We evaluate the performance of the proposed methods with simulations and illustrate their use with the NSW data in Dehejia and Wahba (1999), where the objective is to assess the impact of a job training program on post-intervention earnings.
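To make the stratified estimation step concrete, here is a minimal Python sketch (not the CIT algorithm itself) of a post-stratification ACE estimate computed within strata; in CIT the strata would be the leaves of the fitted tree, and here a single simulated binary covariate stands in for them. Function names, parameter values, and the toy data are illustrative.

```python
import numpy as np

def stratified_ace(y, t, strata):
    """Post-stratification estimate of the average causal effect (ACE):
    compare treated and control means within each stratum, then take a
    weighted average with weights proportional to stratum size."""
    y, t, strata = map(np.asarray, (y, t, strata))
    n, ace = len(y), 0.0
    for s in np.unique(strata):
        m = strata == s
        treated, control = y[m & (t == 1)], y[m & (t == 0)]
        if len(treated) and len(control):
            ace += (m.sum() / n) * (treated.mean() - control.mean())
    return ace

# Toy usage: a single binary covariate stands in for the strata; both the
# propensity and the treatment effect differ across the two strata.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, 500)
t = rng.binomial(1, 0.3 + 0.4 * x)
y = 2.0 * t * x + rng.normal(size=500)
print(stratified_ace(y, t, strata=x))
```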
Estimating the causal effects of an intervention in the presence of confounding is a frequently occurring problem in applications such as medicine. The task is challenging because there may be multiple confounding factors, some of which may be missing, and inferences must be made from high-dimensional, noisy measurements. In this paper, we propose a decision-theoretic approach to estimating the causal effects of interventions in which a subset of the covariates is unavailable for some patients at test time. Our approach uses the information bottleneck principle to perform a discrete, low-dimensional sufficient reduction of the covariate data on which the causal effect estimates are based. In doing so, we can estimate the causal effect of an intervention when only partial covariate information is available. Our results on a causal inference benchmark and a real-world application to treating sepsis show that our method achieves state-of-the-art performance without sacrificing interpretability.
Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers. The most important aspect of inferring causal effects from observational data is the handling of confounders, factors that affect both an intervention and its outcome. A carefully designed observational study attempts to measure all important confounders. However, even if one does not have direct access to all confounders, there may exist noisy and uncertain measurement of proxies for confounders. We build on recent advances in latent variable modeling to simultaneously estimate the unknown latent space summarizing the confounders and the causal effect. Our method is based on Variational Autoencoders (VAE) which follow the causal structure of inference with proxies. We show our method is significantly more robust than existing methods, and matches the state-of-the-art on previous benchmarks focused on individual treatment effects.
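As a hedged illustration of the proxy structure this abstract refers to (not the CEVAE model or its inference procedure), the following toy simulation generates data in which a latent confounder drives noisy proxies, the treatment, and the outcome; a VAE-based approach would then invert this structure with an encoder and decoders, as noted in the comments. All names and parameter values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_proxy_model(n, d_x=5):
    """Toy generative process with the proxy structure: a latent confounder z
    drives the observed proxies x, the treatment t, and the outcome y.  A
    CEVAE-style model would invert this structure with an encoder q(z | x, t, y)
    and decoders p(x | z), p(t | z), p(y | z, t)."""
    z = rng.normal(size=n)                              # hidden confounder
    x = z[:, None] + rng.normal(size=(n, d_x))          # noisy proxies of z
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-z)))       # treatment depends on z
    y = z + 2.0 * t + rng.normal(size=n)                # outcome depends on z and t
    return x, t, y

x, t, y = sample_proxy_model(1000)
print(x.shape, t.mean(), round(y.mean(), 2))
```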
Causal inference from observational data often assumes "strong ignorability", namely that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are of simultaneous interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for the unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder yields unbiased causal estimates and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data on smoking and lung cancer, semi-simulated data on genome-wide association studies, and a real dataset on actors and movie revenue. The deconfounder provides a checkable approach to estimating causal effects that are closer to the truth.
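A rough sketch of the deconfounder recipe under strong simplifying assumptions: PCA stands in for the probabilistic factor model, the predictive check is omitted, and the effect of a single cause of interest is adjusted for the substitute confounder. The simulated data and names are illustrative only, not the paper's workflow.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n, k = 2000, 5
u = rng.normal(size=n)                               # unobserved confounder
causes = u[:, None] + rng.normal(size=(n, k))        # multiple causes sharing u
y = causes[:, 0] + 3.0 * u + rng.normal(size=n)      # only cause 0 affects y

z_hat = PCA(n_components=1).fit_transform(causes)    # substitute confounder
a0 = causes[:, [0]]                                  # cause of interest
naive = LinearRegression().fit(a0, y).coef_[0]
adjusted = LinearRegression().fit(np.hstack([a0, z_hat]), y).coef_[0]
# The naive estimate is biased upward; the adjusted one should land near the
# true effect of 1 because the substitute absorbs most of the confounding.
print(round(naive, 2), round(adjusted, 2))
```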
We study the problem of learning personalized decision policies from observational data while accounting for possible unobserved confounding in the data-generating process. Previous approaches that assume unconfoundedness, i.e., that no unobserved confounders affect both the treatment assignment and the outcome, can lead to policies that actually introduce significant harm through over-intervention when some unobserved confounding is present, as is the case in most applications of observational data. Instead, we calibrate policy learning for realistic violations of this unverifiable assumption using uncertainty sets motivated by sensitivity analysis in causal inference. Our confounding-robust policy improvement framework optimizes the minimax regret of a candidate policy against a baseline standard-of-care policy over an uncertainty set for the propensity weights. We prove that if the uncertainty set is well specified, our robust policy, when applied in practice, will do no worse than the baseline and will improve upon it if improvement is possible. We characterize the adversarial optimization problem and give efficient algorithmic solutions for optimizing over parameterized spaces of decision policies such as logistic treatment assignment rules and decision trees. We evaluate our methods on synthetic data and on a large clinical trial of acute ischemic stroke treatment, demonstrating that hidden confounding can hinder existing policy learning approaches and lead to unwarranted harm, while our robust approach guarantees safety and focuses on well-evidenced improvements, a necessity for making personalized treatment policies learned from observational data reliable in practice.
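The following is a minimal, simplified sketch of the uncertainty-set idea (not the paper's minimax-regret estimator): every nominal inverse-propensity weight is allowed to be off by a factor of up to gamma, and we report the least favourable estimate of a candidate policy's advantage over the observed baseline outcomes. The box-shaped uncertainty set, function names, and toy data are assumptions for illustration.

```python
import numpy as np

def worst_case_advantage(y, t, pi_hat, policy, gamma=1.5):
    """Worst-case IPW estimate of a candidate policy's advantage over the
    observed baseline outcomes when every nominal inverse-propensity weight
    may be off by a factor of up to gamma (each weight ranges over a box).
    `pi_hat` is the estimated probability of the treatment actually received."""
    y, t, pi_hat, policy = map(np.asarray, (y, t, pi_hat, policy))
    w = 1.0 / pi_hat                          # nominal inverse-propensity weights
    lo, hi = w / gamma, w * gamma
    agree = policy == t                       # policy matches observed treatment
    # minimize each weighted term separately over its interval
    worst_terms = agree * np.where(y > 0, lo * y, hi * y)
    return worst_terms.mean() - y.mean()

# Toy usage with a hypothetical "treat everyone" policy and a known 50/50
# logging policy; gamma > 1 shrinks the estimated advantage toward caution.
rng = np.random.default_rng(0)
n = 1000
t = rng.binomial(1, 0.5, n)
y = rng.normal(1.0 * t, 1.0)
p_t = np.full(n, 0.5)
print(round(worst_case_advantage(y, t, p_t, policy=np.ones(n, dtype=int)), 3))
```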
The era of big data provides researchers with convenient access to copious data, yet people often know little about it. The increasing prevalence of big data is challenging traditional methods for learning causality, because they were developed for cases with a limited amount of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality through a comprehensive and structured review of both traditional and frontier methods, together with a discussion of some open problems in learning causality. We start with preliminaries of learning causality. We then categorize and revisit methods of learning causality for typical problems and data types. After that, we discuss the connections between learning causality and machine learning. Finally, some open problems are presented to show the great potential of learning causality from data.
Network models are widely used to represent relations between interacting units or actors. Network data often exhibit transitivity, meaning that two actors that have ties to a third actor are more likely to be tied than actors that do not, homophily by attributes of the actors or dyads, and clustering. Interest often focuses on finding clusters of actors or ties, and the number of groups in the data is typically unknown. We propose a new model, the latent position cluster model, under which the probability of a tie between two actors depends on the distance between them in an unobserved Euclidean 'social space', and the actors' locations in the latent social space arise from a mixture of distributions, each corresponding to a cluster. We propose two estimation methods: a two-stage maximum likelihood method and a fully Bayesian method that uses Markov chain Monte Carlo sampling. The former is quicker and simpler, but the latter performs better. We also propose a Bayesian way of determining the number of clusters that are present by using approximate conditional Bayes factors. Our model represents transitivity, homophily by attributes and clustering simultaneously and does not require the number of clusters to be known. The model makes it easy to simulate realistic networks with clustering, which are potentially useful as inputs to models of more complex systems of which the network is part, such as epidemic models of infectious disease. We apply the model to two networks of social relations. A free software package in the R statistical language, latentnet, is available to analyse data by using the model.
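A small simulation sketch of the generative side of such a model (covariate effects and estimation are omitted; fitting would be done with a package such as latentnet): latent positions are drawn from a mixture of Gaussian clusters, and the log-odds of a tie decreases with latent-space distance. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_latent_position_cluster_network(n=60, k=3, beta0=2.0):
    """Draw latent positions from a mixture of k Gaussian clusters and let the
    log-odds of a tie decrease with Euclidean distance in the latent space."""
    centers = rng.normal(scale=3.0, size=(k, 2))
    labels = rng.integers(0, k, size=n)
    z = centers[labels] + rng.normal(scale=0.5, size=(n, 2))
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    p = 1.0 / (1.0 + np.exp(-(beta0 - dist)))        # tie probabilities
    adj = (rng.random((n, n)) < p).astype(int)
    adj = np.triu(adj, 1)
    adj = adj + adj.T                                # undirected, no self-ties
    return adj, z, labels

adj, z, labels = simulate_latent_position_cluster_network()
print(adj.sum() // 2, "ties among", len(labels), "actors")
```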
Predicated on the increasing abundance of electronic health records, we investigate the problem of inferring individualized treatment effects using observational data. Stemming from the potential outcomes model, we propose a novel multi-task learning framework in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS. The Bayesian approach allows us to compute individualized measures of confidence in our estimates via pointwise credible intervals, which are crucial for realizing the full potential of precision medicine. The impact of selection bias is alleviated via a risk-based empirical Bayes method for adapting the multi-task GP prior, which jointly minimizes the empirical error in factual outcomes and the uncertainty in (unobserved) counterfactual outcomes. We conduct experiments on observational datasets for an interventional social program applied to premature infants, and a left ventricular assist device applied to cardiac patients wait-listed for a heart transplant. In both experiments, we show that our method significantly outperforms the state-of-the-art.
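To make the prior concrete, here is a sketch of a linear (intrinsic) coregionalization kernel over (covariate, task) pairs, with the two tasks playing the roles of the factual and counterfactual outcomes; the method's risk-based empirical Bayes adaptation and posterior computations are not shown, and all names and values are illustrative.

```python
import numpy as np

def lcm_kernel(X, tasks, B, lengthscale=1.0):
    """Intrinsic/linear coregionalization kernel over (covariate, task) pairs:
    an RBF kernel on the covariates multiplied elementwise by the entries of a
    task-covariance matrix B, indexed by each observation's task."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    k_x = np.exp(-0.5 * sq / lengthscale**2)
    return k_x * B[np.ix_(tasks, tasks)]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
tasks = np.array([0, 1, 0, 1, 1, 0])       # 0 = control outcome, 1 = treated outcome
B = np.array([[1.0, 0.7], [0.7, 1.0]])     # between-task covariance
K = lcm_kernel(X, tasks, B)
print(K.shape, bool(np.all(np.linalg.eigvalsh(K) > -1e-8)))   # PSD check
```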
Statisticians have made great strides in creating methods that reduce our reliance on parametric assumptions. However, this explosion of research has resulted in a breadth of inferential strategies that both create opportunities for more reliable inference and complicate the choices that an applied researcher has to make and defend. Relatedly, researchers advocating new methods typically compare their method to, at best, two or three other causal inference strategies, using simulations that may or may not be designed to tease out flaws in all the competing methods equally. The causal inference data analysis challenge "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress on both of these issues. The researchers who created the data testing grounds were distinct from the researchers who submitted the methods whose efficacy was evaluated. Results from 30 competitors across the two versions of the competition (black-box algorithms and do-it-yourself analyses) are presented, along with post-hoc analyses that reveal which characteristics of causal inference strategies and settings affect performance. The most consistent conclusion is that methods that flexibly model the response surface perform better overall than methods that do not. Finally, some new methods are proposed that combine features of several of the best-performing submitted methods.
Identifying changes in model parameters is fundamental in machine learning and statistics. However, standard changepoint models are limited in expressiveness, often addressing unidimensional problems and assuming instantaneous changes. We introduce change surfaces as a multidimensional and highly expressive generalization of changepoints. We provide a model-agnostic formalization of change surfaces, illustrating how they can provide variable, heterogeneous, and non-monotonic rates of change across multiple dimensions. In addition, we show how change surfaces can be used for counterfactual prediction. As a concrete instantiation of the change surface framework, we develop Gaussian Process Change Surfaces (GPCS). We demonstrate counterfactual prediction with Bayesian posterior means and credible sets, as well as large-scale scalability, by introducing novel methods for additive non-separable kernels. Using two large spatio-temporal datasets, we employ GPCS to discover and characterize complex changes that can offer scientifically and policy-relevant insights. Specifically, we analyze twentieth-century measles incidence in the United States and discover previously unknown heterogeneous changes after the introduction of the measles vaccine. We also apply the model to lead-testing-kit data in New York City and find distinct spatial and demographic patterns.
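A toy sketch of the change-surface idea (not GPCS itself): a smooth weighting function blends two latent functions over a multidimensional input, so the regime change can vary in location and rate. In the paper the component functions and the weighting carry Gaussian-process priors; here they are fixed closed-form functions chosen only for illustration.

```python
import numpy as np

def change_surface(x, f1, f2, w):
    """Blend two latent functions with a smooth weighting function w(x) in
    [0, 1], so the regime change can vary in location and rate over a
    multidimensional input."""
    s = w(x)
    return s * f1(x) + (1.0 - s) * f2(x)

# Example on a 2-D grid: the regime shift happens along a tilted boundary.
xx, yy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
X = np.stack([xx.ravel(), yy.ravel()], axis=1)
w = lambda x: 1.0 / (1.0 + np.exp(-10 * (x[:, 0] + 0.5 * x[:, 1] - 0.8)))
f1 = lambda x: np.sin(6 * x[:, 0])
f2 = lambda x: 0.2 * x[:, 1]
print(change_surface(X, f1, f2, w).shape)
```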
Complex chronic diseases (e.g., autism, lupus, and Parkinson's) are remarkably heterogeneous across individuals. This heterogeneity makes treatment difficult for caregivers because they cannot accurately predict the way in which the disease will progress in order to guide treatment decisions. Therefore, tools that help to predict the trajectory of these complex chronic diseases can help to improve the quality of health care. To build such tools, we can leverage clinical markers that are collected at baseline when a patient first presents and longitudinally over time during follow-up visits. Because complex chronic diseases are typically systemic, the longitudinal markers often track disease progression in multiple organ systems. In this paper, our goal is to predict a function of time that models the future trajectory of a single target clinical marker tracking a disease process of interest. We want to make these predictions using the histories of many related clinical markers as input. Our proposed solution tackles several key challenges. First, we can easily handle irregularly and sparsely sampled markers, which are standard in clinical data. Second, the number of parameters and the computational complexity of learning our model grows linearly in the number of marker types included in the model. This makes our approach applicable to diseases where many different markers are recorded over time. Finally, our model accounts for latent factors influencing disease expression, whereas standard regression models rely on observed features alone to explain variability. Moreover, our approach can be applied dynamically in continuous time and updates its predictions as soon as any new data is available. We apply our approach to the problem of predicting lung disease trajectories in scleroderma, a complex autoimmune disease. We show that our model improves over state-of-the-art baselines in predictive accuracy and we provide a qualitative analysis of our model's output. Finally, the variability of disease presentation in scleroderma makes clinical trial recruitment challenging. We show that a prognostic tool that integrates multiple types of routinely collected longitudinal data can be used to identify individuals at greatest risk of rapid progression and to target trial recruitment.
Learning causal effects from observational data greatly benefits a variety of domains such as health care, education, and sociology. For instance, one could estimate the impact of a policy on reducing the unemployment rate. The central problem in causal effect inference is dealing with unobserved counterfactuals and treatment selection bias. The state-of-the-art approaches focus on solving these problems by balancing the treatment and control groups. However, during the learning and balancing process, highly predictive information from the original covariate space may be lost. To build more robust estimators, we tackle this information-loss problem by presenting Adversarial Balancing-based representation learning for Causal Effect Inference (ABCEI), building on recent advances in deep learning. ABCEI uses adversarial learning to balance the distributions of the treatment and control groups in the latent representation space, without making any assumptions about the form of the treatment selection/assignment function. ABCEI preserves information useful for predicting causal effects under the regularization of a mutual information estimator. We conduct various experiments on several synthetic and real-world datasets. The experimental results show that ABCEI is robust against treatment selection bias and matches or outperforms the state-of-the-art approaches.
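A minimal PyTorch sketch of the adversarial balancing step described above, under strong simplifications: the outcome heads and the mutual-information regularizer are omitted, the networks are tiny, and only a single discriminator/encoder update is shown. The architecture sizes and the label-flipping encoder loss are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, h = 256, 10, 32
x = torch.randn(n, d)                         # covariates
t = torch.randint(0, 2, (n, 1)).float()       # treatment indicator

encoder = nn.Sequential(nn.Linear(d, h), nn.ReLU(), nn.Linear(h, h))
discriminator = nn.Sequential(nn.Linear(h, h), nn.ReLU(), nn.Linear(h, 1))
bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Discriminator step: try to tell treated from control representations.
z = encoder(x)
loss_d = bce(discriminator(z.detach()), t)
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Encoder step: fool the discriminator so both groups look alike in z-space.
loss_e = bce(discriminator(encoder(x)), 1.0 - t)
opt_e.zero_grad(); loss_e.backward(); opt_e.step()
print(float(loss_d), float(loss_e))
```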
With the growing need for personalized decision making, such as personalized medicine and online recommendation, there is increasing interest in discovering the context and heterogeneity of causal relationships. However, most existing methods assume a known cause (e.g., a new drug) and focus on identifying from data the contexts of its heterogeneous effects (e.g., patient groups that respond differently to the new drug). No method can directly and efficiently detect context-specific causal relationships from observational data, i.e., discover the causes and their contexts simultaneously. Taking advantage of efficient decision tree induction and the well-established causal inference framework, this paper proposes the Tree-based Context Causal rule discovery (TCC) method for efficient exploration of context-specific causal relationships from data. Experiments on both synthetic and real-world datasets show that TCC can effectively discover context-specific causal rules from data.
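As a crude two-stage stand-in for the cause-plus-context search (TCC itself interleaves causal tests with the tree induction), the sketch below lets a shallow regression tree over the remaining covariates define candidate contexts and then compares outcomes between the two values of a binary candidate cause within each leaf. All names and the simulated data are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def context_specific_effects(X, cause, y, max_depth=2):
    """Let a shallow regression tree over the context variables define candidate
    contexts (its leaves), then compare outcomes between the two values of a
    binary candidate cause within each context."""
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0).fit(X, y)
    leaves = tree.apply(X)
    effects = {}
    for leaf in np.unique(leaves):
        m = leaves == leaf
        y1, y0 = y[m & (cause == 1)], y[m & (cause == 0)]
        if len(y1) and len(y0):
            effects[int(leaf)] = round(float(y1.mean() - y0.mean()), 2)
    return effects

rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 3))                              # context variables
cause = rng.binomial(1, 0.5, n)                          # candidate cause
y = (X[:, 0] > 0) * cause * 2.0 + rng.normal(size=n)     # cause acts only when X0 > 0
print(context_specific_effects(X, cause, y))
```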
How do we learn from biased data? Historical datasets often reflect historical prejudices; sensitive or protected attributes may affect the observed treatments and outcomes. Classification algorithms tasked with predicting outcomes accurately from these datasets tend to replicate these biases. We advocate a causal modeling approach to learning from biased data and reframe fair classification as an intervention problem. We propose a causal model in which the sensitive attribute confounds both the treatment and the outcome. Building on prior work in deep learning and generative modeling, we describe how to learn the parameters of this causal model from observational data alone, even in the presence of unobserved confounders. We show experimentally that fairness-aware causal modeling provides better estimates of the causal effects between the sensitive attribute, the treatment, and the outcome. We further present evidence that estimating these causal effects can help us learn policies that are both more accurate and fairer when presented with a historically biased dataset.
Data with mixed-type (metric-ordinal-nominal) variables are typical for social stratification, i.e. partitioning a population into social classes. Approaches to cluster such data are compared, namely a latent class mixture model assuming local independence and dissimilarity-based methods such as k-medoids. The design of an appropriate dissimilarity measure and the estimation of the number of clusters are discussed as well, comparing the Bayesian information criterion with dissimilarity-based criteria. The comparison is based on a philosophy of cluster analysis that connects the problem of a choice of a suitable clustering method closely to the application by considering direct interpretations of the implications of the methodology. The application of this philosophy to economic data from the 2007 US Survey of Consumer Finances demonstrates techniques and decisions required to obtain an interpretable clustering. The clustering is shown to be significantly more structured than a suitable null model. One result is that the data-based strata are not as strongly connected to occupation categories as is often assumed in the literature.
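A generic sketch of a Gower-style dissimilarity for mixed-type data, in the spirit of the dissimilarity design discussed above (the paper's exact weighting and variable treatment may differ): range-normalized absolute differences for metric and rank-coded ordinal variables, simple mismatch for nominal ones, averaged over variables. The toy variables are illustrative.

```python
import numpy as np

def gower_dissimilarity(metric, ordinal, nominal):
    """Average of per-variable dissimilarities: range-normalized absolute
    differences for metric and rank-coded ordinal variables, simple mismatch
    for nominal ones."""
    parts = []
    for block in (metric, ordinal):
        for j in range(block.shape[1]):
            col = block[:, j].astype(float)
            spread = col.max() - col.min()
            parts.append(np.abs(col[:, None] - col[None, :]) / (spread if spread > 0 else 1.0))
    for j in range(nominal.shape[1]):
        col = nominal[:, j]
        parts.append((col[:, None] != col[None, :]).astype(float))
    return np.mean(parts, axis=0)

# Toy mixed-type data: income (metric), education level (ordinal), region (nominal).
rng = np.random.default_rng(0)
n = 8
income = rng.lognormal(10, 1, size=(n, 1))
education = rng.integers(1, 5, size=(n, 1))
region = rng.choice(["NE", "S", "W"], size=(n, 1))
D = gower_dissimilarity(income, education, region)
print(D.round(2))   # D could now be fed to a dissimilarity-based method such as k-medoids
```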
In the recent literature on estimating heterogeneous treatment effects, each proposed method makes its own set of restrictive assumptions about the intervention's effects and about which subpopulations to explicitly estimate. Moreover, most of the literature provides no mechanism for identifying which subpopulations are the most affected, beyond manual inspection. We therefore propose Treatment Effect Subset Scan (TESS), a new method for discovering which subpopulation in a randomized experiment is most significantly affected by a treatment. We frame this challenge as a pattern detection problem in which we efficiently maximize a nonparametric scan statistic over subpopulations. Furthermore, we identify the subpopulation that experiences the largest distributional change as a result of the intervention, while making minimal assumptions about the intervention's effects or the underlying data-generating process. In addition to the algorithm, we show that the asymptotic Type I and Type II errors can be controlled and provide sufficient conditions for detection consistency, i.e., exact identification of the affected subpopulation. Finally, we validate the efficacy of the method by discovering heterogeneous treatment effects in simulations and in real-world data from a well-known program evaluation study.
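A brute-force toy version of the subset-scan idea (the paper maximizes a nonparametric scan statistic efficiently over far richer subpopulations): among all subsets of the values of one discrete covariate, find the subpopulation with the largest standardized treated-versus-control mean difference. The z-score statistic and the simulated data are assumptions made only for illustration.

```python
import numpy as np
from itertools import combinations

def best_affected_subset(y, t, group):
    """Among all subsets of the values of one discrete covariate, return the
    subpopulation with the largest standardized treated-vs-control mean
    difference (a simple z-score in place of a proper scan statistic)."""
    values = [int(v) for v in np.unique(group)]
    best, best_score = None, -np.inf
    for k in range(1, len(values) + 1):
        for subset in combinations(values, k):
            m = np.isin(group, subset)
            y1, y0 = y[m & (t == 1)], y[m & (t == 0)]
            if len(y1) < 2 or len(y0) < 2:
                continue
            se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
            score = (y1.mean() - y0.mean()) / se
            if score > best_score:
                best, best_score = subset, score
    return best, round(float(best_score), 2)

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 4, n)                            # one discrete covariate
t = rng.binomial(1, 0.5, n)
y = rng.normal(1.0 * t * np.isin(group, [2, 3]), 1.0)    # effect only in groups 2 and 3
print(best_affected_subset(y, t, group))
```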
Uncovering the heterogeneity of causal effects of policies and business decisions at various levels of granularity provides substantial value to decision makers. This paper develops new estimation and inference procedures for multiple-treatment models in a selection-on-observables framework by modifying the causal forest approach of Wager and Athey (2018). The new estimators have desirable theoretical and computational properties and are suitable for various levels of aggregation of the causal effects. An empirical Monte Carlo study shows that they may outperform previously suggested estimators. Inference is accurate for effects relating to larger groups and conservative for effects at fine levels of granularity. An application to the evaluation of an active labour market programme demonstrates the value of the new methods for applied research.
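For orientation only, the sketch below computes simple multi-treatment (Hajek) IPW estimates of all pairwise average effects, which is the kind of aggregate the abstract refers to; it is not the modified causal forest estimator, and the simulated propensities and arm effects are illustrative.

```python
import numpy as np

def pairwise_ipw_effects(y, t, propensities):
    """Hajek-style IPW estimates of E[Y(a)] for every treatment arm a, and the
    implied pairwise average effects between arms."""
    arms = [int(a) for a in np.unique(t)]
    means = {a: np.sum((t == a) * y / propensities[:, a]) /
                np.sum((t == a) / propensities[:, a]) for a in arms}
    return {(a, b): round(float(means[b] - means[a]), 2)
            for a in arms for b in arms if a < b}

rng = np.random.default_rng(0)
n, k = 3000, 3
x = rng.normal(size=n)
logits = np.stack([np.zeros(n), 0.5 * x, -0.5 * x], axis=1)
p = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)    # true propensities
t = np.array([rng.choice(k, p=pi) for pi in p])
y = x + np.array([0.0, 1.0, 2.0])[t] + rng.normal(size=n)    # arm effects 0, 1, 2
print(pairwise_ipw_effects(y, t, p))
```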
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to over-confident inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA have recently emerged. We discuss these methods and present a number of examples. In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of currently available BMA software.
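A compact illustration of the BMA idea using the common BIC approximation to posterior model probabilities over all subsets of predictors in a linear regression; this is a generic sketch, not the specific software catalogued in the paper, and the function name and toy data are illustrative.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression

def bma_bic_weights(X, y):
    """Approximate posterior probabilities over all subsets of predictors in a
    linear regression via the BIC approximation, then normalize them into BMA
    weights."""
    n, p = X.shape
    models, bics = [], []
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            cols = list(subset)
            pred = (LinearRegression().fit(X[:, cols], y).predict(X[:, cols])
                    if cols else np.full(n, y.mean()))
            rss = np.sum((y - pred) ** 2)
            bics.append(n * np.log(rss / n) + (len(cols) + 1) * np.log(n))
            models.append(subset)
    bics = np.array(bics)
    w = np.exp(-0.5 * (bics - bics.min()))
    return models, w / w.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0] + rng.normal(size=200)        # only the first predictor matters
models, weights = bma_bic_weights(X, y)
print(models[int(np.argmax(weights))], round(float(weights.max()), 3))
```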
What is the difference between a prediction that is made with a causal model and that with a non-causal model? Suppose that we intervene on the predictor variables or change the whole environment. The predictions from a causal model will in general work as well under interventions as for observational data. In contrast, predictions from a non-causal model can potentially be very wrong if we actively intervene on variables. Here, we propose to exploit this invariance of a prediction under a causal model for causal inference: given different experimental settings (e.g. various interventions) we collect all models that do show invariance in their predictive accuracy across settings and interventions. The causal model will be a member of this set of models with high probability. This approach yields valid confidence intervals for the causal relationships in quite general scenarios. We examine the example of structural equation models in more detail and provide sufficient assumptions under which the set of causal predictors becomes identifiable. We further investigate robustness properties of our approach under model misspecification and discuss possible extensions. The empirical properties are studied for various data sets, including large-scale gene perturbation experiments.
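A toy version of the invariance principle (not the paper's exact tests): for each subset of predictors, fit one pooled linear regression and keep the subset if the residual means do not differ significantly across environments, here checked with a simple one-way ANOVA. The causal predictors should lie in every accepted subset, so their intersection estimates the causal parents; the simulated data and the ANOVA stand-in are assumptions for illustration.

```python
import numpy as np
from itertools import combinations
from scipy import stats
from sklearn.linear_model import LinearRegression

def invariant_subsets(X, y, env, alpha=0.05):
    """For each subset of predictors, fit one pooled linear regression and keep
    the subset if the residual means do not differ significantly across
    environments (one-way ANOVA)."""
    n, p = X.shape
    accepted = []
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            cols = list(subset)
            pred = (LinearRegression().fit(X[:, cols], y).predict(X[:, cols])
                    if cols else np.full(n, y.mean()))
            resid = y - pred
            groups = [resid[env == e] for e in np.unique(env)]
            if stats.f_oneway(*groups).pvalue > alpha:
                accepted.append(subset)
    return accepted

# Toy data: environment 1 intervenes on the first variable, the true cause of
# y; the second variable is a downstream effect of y.  Subsets containing the
# first variable (index 0) tend to be accepted, and their intersection points
# to it as the causal parent.
rng = np.random.default_rng(0)
env = np.repeat([0, 1], 300)
x1 = rng.normal(size=600) + 2.0 * env
y = 1.5 * x1 + rng.normal(size=600)
x2 = y + rng.normal(size=600)
print(invariant_subsets(np.column_stack([x1, x2]), y, env))
```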