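Machine learning models based on temporal point processes are the state of the art in a wide variety of applications involving discrete events in continuous time. However, these models lack the ability to answer counterfactual questions, which are increasingly relevant as the models are used to inform targeted interventions. In this work, our goal is to fill this gap. To this end, we first develop a causal model of thinning for temporal point processes that builds upon the Gumbel-Max structural causal model. This model satisfies a desirable counterfactual monotonicity condition, which is sufficient to identify counterfactual dynamics in the process of thinning. Then, given an observed realization of a temporal point process with a given intensity function, we develop a sampling algorithm that uses the above causal model of thinning and the superposition theorem to simulate counterfactual realizations of the temporal point process under a given alternative intensity function. Simulation experiments using synthetic and real epidemiological data show that the counterfactual realizations provided by our algorithm may give valuable insights to enhance targeted interventions.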
translated by Google Translate
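The thinning step mentioned in the abstract above can be illustrated with a minimal sketch. This is the standard thinning procedure for an inhomogeneous Poisson process, not the counterfactual sampling algorithm of the paper; the function names, the example intensity, and the bound `lambda_max` are assumptions made here for illustration.

```python
import math
import random


def sample_by_thinning(intensity, lambda_max, horizon, rng):
    """Sample event times on [0, horizon] for an inhomogeneous Poisson process.

    Assumes intensity(t) <= lambda_max for all t in [0, horizon].
    """
    events = []
    t = 0.0
    while True:
        # Propose the next candidate from a homogeneous process with rate lambda_max.
        t += rng.expovariate(lambda_max)
        if t > horizon:
            break
        # Keep the candidate with probability intensity(t) / lambda_max.
        if rng.random() * lambda_max <= intensity(t):
            events.append(t)
    return events


def example_intensity(t):
    # Hypothetical intensity, bounded above by 2.0.
    return 1.0 + math.sin(t) ** 2


rng = random.Random(0)
events = sample_by_thinning(example_intensity, 2.0, 10.0, rng)
print(len(events), "events accepted")
```

Each proposed point is accepted or rejected independently; a causal model of thinning, as described in the abstract, concerns how those accept/reject decisions would have changed under a different intensity.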
Multiple lines of evidence strongly suggest that infection hotspots, where a single individual infects many others, play a key role in the transmission dynamics of COVID-19. However, most of the existing epidemiological models fail to capture this aspect by neither representing the sites visited by individuals explicitly nor characterizing disease transmission as a function of individual mobility patterns. In this work, we introduce a temporal point process modeling framework that specifically represents visits to the sites where individuals get in contact and infect each other. Under our model, the number of infections caused by an infectious individual naturally emerges to be overdispersed. Using an efficient sampling algorithm, we demonstrate how to estimate the transmission rate of infectious individuals at the sites they visit and in their households using Bayesian optimization and longitudinal case data. Simulations using fine-grained and publicly available demographic data and site locations from Bern, Switzerland showcase the flexibility of our framework. To facilitate research and analyses of other cities and regions, we release an open-source implementation of our framework.
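Decision-makers need to predict how outcomes will develop before adopting a new treatment policy, which defines when and how a sequence of treatments affecting the outcome occurs in continuous time. Commonly, algorithms that predict interventional future outcome trajectories take a fixed sequence of future treatments as input. This either neglects the dependence of future treatments on the outcomes preceding them or implicitly assumes that the treatment policy is known, and hence excludes scenarios where the policy is unknown or a counterfactual analysis is needed. To address these limitations, we develop a joint model for treatments and outcomes, which allows the estimation of treatment policies and effects from sequential treatment-outcome data. It can answer interventional and counterfactual queries about interventions on treatment policies, as we show using real-world data on blood glucose progression and a simulation study built on top of it.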
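Automated decision support systems that are able to infer second opinions from experts can potentially facilitate a more efficient allocation of resources; they can help decide when and from whom to seek a second opinion. In this paper, we look at the design of this type of support system from the perspective of counterfactual inference. We focus on a multiclass classification setting and first show that, if experts make predictions on their own, the underlying causal mechanism generating their predictions needs to satisfy a desirable set invariance property. Further, we show that, for any causal mechanism satisfying this property, there exists an equivalent mechanism where the predictions of each expert are generated by independent sub-mechanisms governed by a common noise. This motivates the design of a set invariant Gumbel-Max structural causal model in which the structure of the noise governing the sub-mechanisms depends on an intuitive notion of similarity between experts that can be estimated from data. Experiments on both synthetic and real data show that our model can be used to infer second opinions more accurately than its non-causal counterpart.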
Early on during a pandemic, vaccine availability is limited, requiring prioritisation of different population groups. Evaluating vaccine allocation is therefore a crucial element of pandemics response. In the present work, we develop a model to retrospectively evaluate age-dependent counterfactual vaccine allocation strategies against the COVID-19 pandemic. To estimate the effect of allocation on the expected severe-case incidence, we employ a simulation-assisted causal modelling approach which combines a compartmental infection-dynamics simulation, a coarse-grained, data-driven causal model and literature estimates for immunity waning. We compare Israel's implemented vaccine allocation strategy in 2021 to counterfactual strategies such as no prioritisation, prioritisation of younger age groups or a strict risk-ranked approach; we find that Israel's implemented strategy was indeed highly effective. We also study the marginal impact of increasing vaccine uptake for a given age group and find that increasing vaccinations in the elderly is most effective at preventing severe cases, whereas additional vaccinations for middle-aged groups reduce infections most effectively. Due to its modular structure, our model can easily be adapted to study future pandemics. We demonstrate this flexibility by investigating vaccine allocation strategies for a pandemic with characteristics of the Spanish Flu. Our approach thus helps evaluate vaccination strategies under the complex interplay of core epidemic factors, including age-dependent risk profiles, immunity waning, vaccine availability and spreading rates.
Counterfactuals are often described as 'retrospective,' focusing on hypothetical alternatives to a realized past. This description relates to an often implicit assumption about the structure and stability of exogenous variables in the system being modeled -- an assumption that is reasonable in many settings where counterfactuals are used. In this work, we consider cases where we might reasonably make a different assumption about exogenous variables, namely, that the exogenous noise terms of each unit do exhibit some unit-specific structure and/or stability. This leads us to a different use of counterfactuals -- a 'forward-looking' rather than 'retrospective' counterfactual. We introduce "counterfactual treatment choice," a type of treatment choice problem that motivates using forward-looking counterfactuals. We then explore how mismatches between interventional versus forward-looking counterfactual approaches to treatment choice, consistent with different assumptions about exogenous noise, can lead to counterintuitive results.
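To perform counterfactual reasoning in structural causal models (SCMs), one needs to know the causal mechanisms, which factorize conditional distributions into noise sources and deterministic functions mapping realizations of noise to samples. Unfortunately, the causal mechanism is not uniquely identified by data that can be gathered by observing and interacting with the world, so the question of how to choose causal mechanisms remains. In recent work, Oberst & Sontag (2019) propose Gumbel-Max SCMs, which use Gumbel-Max reparameterizations as the causal mechanism because of an intuitively appealing counterfactual stability property. In this work, we instead argue for choosing the causal mechanism that is best under a quantitative criterion, such as minimizing variance when estimating counterfactual treatment effects. We propose a parameterized family of causal mechanisms that generalizes Gumbel-Max. We show that they can be trained to minimize counterfactual effect variance and other losses on a distribution of queries of interest, yielding lower-variance estimates of counterfactual treatment effects than fixed alternatives, and that they also generalize to queries not seen at training time.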
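For readers unfamiliar with the Gumbel-Max mechanism that the abstract above generalizes, the following is a minimal sketch of counterfactual sampling in a Gumbel-Max SCM for a single categorical variable. It is an illustration under stated assumptions, not the parameterized mechanism proposed in the paper: all probabilities are assumed strictly positive and normalized, and the function names are hypothetical.

```python
import math
import random


def gumbel(rng, loc=0.0):
    # If E ~ Exp(1), then loc - log(E) follows a Gumbel distribution with location loc.
    return loc - math.log(rng.expovariate(1.0))


def counterfactual_sample(p, q, observed, rng):
    """Given that category `observed` was drawn under probabilities p,
    draw the counterfactual category under probabilities q (shared Gumbel noise)."""
    # The maximum of log p_i + g_i is Gumbel(0) when p sums to one.
    top = gumbel(rng)
    noise = []
    for i, p_i in enumerate(p):
        if i == observed:
            value = top
        else:
            # Gumbel(log p_i) truncated to lie below the observed maximum.
            g = gumbel(rng, math.log(p_i))
            value = -math.log(math.exp(-top) + math.exp(-g))
        noise.append(value - math.log(p_i))  # posterior sample of the noise g_i
    scores = [math.log(q_i) + n for q_i, n in zip(q, noise)]
    return max(range(len(q)), key=scores.__getitem__)


rng = random.Random(0)
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
draws = [counterfactual_sample(p, q, 0, rng) for _ in range(1000)]
print([draws.count(k) / len(draws) for k in range(3)])
```

When `q` equals `p`, the counterfactual category coincides with the observed one, and the observed category is also retained whenever its probability ratio `q[k] / p[k]` is at least as large as that of every other category. This is the counterfactual stability property the abstract refers to.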
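This work introduces a novel multivariate temporal point process, the Partial Mean Behavior Poisson (PMBP) process, which can be leveraged to fit the multivariate Hawkes process to partially interval-censored data, that is, data consisting of a mix of event timestamps on a subset of dimensions and interval-censored event counts on the complementary dimensions. First, we define the PMBP process via its conditional intensity and derive the regularity conditions for subcriticality. We show that both the Hawkes process and the MBP process (Rizoiu et al.) are special cases of the PMBP process. Second, we provide numerical schemes that enable calculating the conditional intensity and sampling event histories of the PMBP process. Third, we demonstrate the applicability of the PMBP process through empirical testing on synthetic and real-world datasets: we test the capability of the PMBP process to recover multivariate Hawkes parameters given sample event histories of the Hawkes process. Next, we evaluate the PMBP process on the YouTube popularity prediction task and show that it outperforms the current state-of-the-art Hawkes Intensity Process (Rizoiu et al. (2017b)). Lastly, on a curated dataset of COVID19 daily case counts and COVID19-related news articles for a sample of countries, we show that clustering on the PMBP-fitted parameters enables a categorization of countries with respect to the country-level interaction of cases and news reporting.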
Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices. In this paper, we develop a framework for modeling fairness using tools from causal inference. Our definition of counterfactual fairness captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. We demonstrate our framework on a real-world problem of fair prediction of success in law school. (Equal contribution. This work was done while JL was a Research Fellow at the Alan Turing Institute. Footnote 2: https://obamawhitehouse.archives.gov/blog/2016/05/04/big-risks-big-opportunities-intersection-big-dataand-civil-rights. 31st Conference on Neural Information Processing Systems (NIPS 2017).)
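Counterfactual inference is a powerful tool, capable of solving challenging problems in high-profile sectors. To perform counterfactual inference, one requires knowledge of the underlying causal mechanisms. However, causal mechanisms cannot be uniquely determined from observations and interventions alone. This raises the question of how to choose the causal mechanisms so that the resulting counterfactual inference is trustworthy in a given domain. This question has been addressed in causal models with binary variables, but the case of categorical variables remains unanswered. We address this challenge by introducing the notion of counterfactual ordering for causal models with categorical variables. To learn causal mechanisms satisfying these constraints, and to perform counterfactual inference with them, we introduce deep twin networks. These are deep neural networks that, when trained, are capable of twin network counterfactual inference, an alternative to the abduction, action, and prediction method. We empirically test our approach on diverse real-world and semi-synthetic data from medicine, epidemiology, and finance, and report accurate estimation of counterfactual probabilities while demonstrating the issues that arise in counterfactual reasoning when counterfactual ordering is not enforced.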
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.
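Understanding the spread of COVID-19 has been the subject of numerous studies, highlighting the importance of reliable epidemic models. Here, we introduce a novel epidemic model that uses a latent Hawkes process with temporal covariates to model the infections. Unlike other models, we model the reported cases via a probability distribution driven by the underlying Hawkes process. Modelling the infections via a Hawkes process allows us to estimate by whom an infected individual was infected. We propose a Kernel Density Particle Filter (KDPF) for inferring both the latent cases and the reproduction number, and for predicting new cases in the near future. The computational effort is proportional to the number of infections, which makes it possible to use particle-filter-type algorithms such as the KDPF. We demonstrate the performance of the proposed algorithm on synthetic data sets and on COVID-19 reported cases in various local authorities in the UK, and benchmark our model against alternative approaches.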
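As background for the Hawkes process underlying the model above, the sketch below simulates a plain univariate Hawkes process with an exponential kernel using Ogata-style thinning. It does not implement the latent model with temporal covariates or the KDPF from the abstract; the parameter names `mu`, `alpha`, and `beta` and the chosen values are assumptions for illustration.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Simulate event times on [0, horizon] with intensity
    lambda(t) = mu + sum_i alpha * beta * exp(-beta * (t - t_i)).

    alpha is the expected number of direct offspring per event (alpha < 1
    keeps the process subcritical).
    """
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning proposal.
        bound = mu + excitation
        wait = rng.expovariate(bound)
        if t + wait > horizon:
            break
        t += wait
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability lambda(t) / bound.
        if rng.random() * bound <= mu + excitation:
            events.append(t)
            excitation += alpha * beta
    return events


rng = random.Random(0)
events = simulate_hawkes(mu=0.5, alpha=0.6, beta=1.5, horizon=50.0, rng=rng)
print(len(events), "events simulated")
```

In an epidemic reading, each event is an infection that temporarily raises the rate of further infections, and `alpha` plays the role of a reproduction number.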
Strategic test allocation plays a major role in the control of both emerging and existing pandemics (e.g., COVID-19, HIV). Widespread testing supports effective epidemic control by (1) reducing transmission via identifying cases, and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest, one's positive infectious status, is often a latent variable. In addition, the presence of both network and temporal dependence reduces the data to a single observation. As testing entire populations regularly is neither efficient nor feasible, standard approaches to testing recommend simple rule-based testing strategies (e.g., symptom based, contact tracing), without taking into account individual risk. In this work, we study an adaptive sequential design involving n individuals over a period of $\tau$ time-steps, which allows for unspecified dependence among individuals and across time. Our causal target parameter is the mean latent outcome we would have obtained after one time-step, if, starting at time t given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. We propose an Online Super Learner for adaptive sequential surveillance that learns the optimal choice of testing strategies over time while adapting to the current state of the outbreak. Relying on a series of working models, the proposed method learns across samples, through time, or both, based on the underlying (unknown) structure in the data. We present an identification result for the latent outcome in terms of the observed data, and demonstrate the superior performance of the proposed strategy in a simulation modeling a residential university environment during the COVID-19 pandemic.
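Decision-making systems based on AI and machine learning have been used throughout a wide range of real-world scenarios, including healthcare, law enforcement, education, and finance. It is no longer far-fetched to envision a future in which autonomous systems drive entire business decisions and, more broadly, support large-scale decision-making infrastructure to solve society's most challenging problems. Issues of unfairness and discrimination are pervasive when decisions are made by humans, and they remain (or are potentially amplified) when decisions are made by machines with little transparency, accountability, and fairness. In this paper, we introduce a framework for \textit{causal fairness analysis} with the intent of filling this gap, i.e., understanding, modeling, and possibly solving issues of fairness in decision-making settings. The main insight of our approach is to link the quantification of the disparities present in the observed data with the underlying, and often unobserved, collection of causal mechanisms that generate the disparity in the first place, a challenge we call the Fundamental Problem of Causal Fairness Analysis (FPCFA). To solve the FPCFA, we study the problem of decomposing variations and empirical measures of fairness, attributing such variations to structural mechanisms and to different units of the population. Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationships between the different criteria found in the literature. Finally, we study which causal assumptions are minimally needed to perform causal fairness analysis and propose a Fairness Cookbook, which allows data scientists to assess the existence of disparate impact and disparate treatment.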
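We introduce an approach to counterfactual inference based on merging information from multiple datasets. We consider a causal reformulation of the statistical marginal problem: given a collection of marginal structural causal models (SCMs) over distinct but overlapping sets of variables, determine the set of joint SCMs that are counterfactually consistent with the marginal ones. We formalise this approach for categorical SCMs using the response function formulation and show that it reduces the space of allowed marginal and joint SCMs. Our work thus highlights a new mode of falsifiability through additional variables, in contrast to the statistical one via additional data.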
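Mathematical models in epidemiology are an indispensable tool for determining the dynamics and important characteristics of infectious diseases. Apart from their scientific merit, these models are often used to inform political decisions and intervention measures during an ongoing outbreak. However, reliably inferring the dynamics of ongoing outbreaks by connecting complex models to real data is still hard and requires either laborious manual parameter fitting or expensive optimization methods that have to be repeated from scratch for every application of a given model. In this work, we address this problem with a novel combination of epidemiological modeling and specialized neural networks. Our approach entails two computational phases: in an initial training phase, a mathematical model describing the epidemic is used as a coach for a neural network, which acquires global knowledge about the full range of possible disease dynamics. In the subsequent inference phase, the trained neural network processes the observed data of an actual outbreak and infers the parameters of the model in order to realistically reproduce the observed dynamics and reliably predict future progression. With its flexible framework, our simulation-based approach is applicable to a variety of epidemiological models. Moreover, since our method is fully Bayesian, it is designed to incorporate all available prior knowledge about plausible parameter values and returns complete joint posterior distributions over these parameters. Application of our method to the early Covid-19 outbreak phase in Germany demonstrates that we are able to obtain reliable probabilistic estimates of important disease characteristics, such as generation time, fraction of undetected infections, likelihood of transmission before symptom onset, and reporting delays, using a very moderate amount of real-world observations.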
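There are a number of available methods for selecting whom to prioritize for treatment, including ones based on treatment effect estimation, risk scoring, and hand-crafted rules. We propose rank-weighted average treatment effect (RATE) metrics as a simple and general family of metrics for comparing treatment prioritization rules on a level playing field. RATEs are agnostic as to how the prioritization rules were derived, and only assess them based on how well they succeed in identifying units that benefit the most from treatment. We define a family of RATE estimators and prove a central limit theorem that enables asymptotically exact inference in a wide variety of randomized and observational study settings. We provide justification for the use of bootstrapped confidence intervals and a framework for testing hypotheses about heterogeneity in treatment effectiveness that is correlated with the prioritization rule. Our definition of the RATE nests a number of existing metrics, including the Qini coefficient, and our analysis directly yields inference methods for these metrics. We demonstrate our approach in examples drawn from both personalized medicine and marketing. In the personalized medicine application, using data from the SPRINT and ACCORD-BP randomized controlled trials, we find no significant evidence of heterogeneous treatment effects. On the other hand, in a large marketing trial, we find robust evidence of heterogeneity in the treatment effects of some digital advertising campaigns and demonstrate how RATEs can be used to compare targeting rules that prioritize estimated risk with those that prioritize estimated treatment benefit.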
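We present CRISP (COVID-19 Risk Score Prediction), a probabilistic graphical model for COVID-19 infection spread through a population, based on the SEIR model, where we assume access to (1) mutual contacts between pairs of individuals across time and across various channels (e.g., Bluetooth contact traces), as well as (2) test outcomes at given times for infection, exposure, and immunity tests. Our micro-level model keeps track of the infection state of each individual at every point in time, ranging from susceptible and exposed to infectious and recovered. We develop both a Monte Carlo EM algorithm and a message passing algorithm to infer contact-channel-specific infection transmission probabilities. Our Monte Carlo algorithm uses Gibbs sampling to draw samples of the latent infection status of each individual over the entire time period of analysis, given the latent infection status of all contacts and the test outcome data. Experimental results with simulated data demonstrate that our CRISP model can be parametrized by the reproduction factor $R_0$ and exhibits population-level infectiousness and recovery time series similar to those of the classical SEIR model. However, owing to the individual contact data, this model allows fine-grained control and inference for a wide range of COVID-19 mitigation and suppression policy measures. Moreover, the block-Gibbs sampling algorithm is able to support efficient testing in a test-trace-isolate approach to contain COVID-19 infection spread. To the best of our knowledge, this is the first model with efficient inference for COVID-19 infection spread based on individual-level contact data; most epidemic models are macro-level models that reason over entire populations. The implementation of CRISP is available in Python and C++ at https://github.com/zalandoresearch/crisp.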
We introduce a new probabilistic temporal logic for the verification of Markov Decision Processes (MDP). Our logic is the first to include operators for causal reasoning, allowing us to express interventional and counterfactual queries. Given a path formula $\phi$, an interventional property is concerned with the satisfaction probability of $\phi$ if we apply a particular change $I$ to the MDP (e.g., switching to a different policy); a counterfactual allows us to compute, given an observed MDP path $\tau$, what the outcome of $\phi$ would have been had we applied $I$ in the past. For its ability to reason about different configurations of the MDP, our approach represents a departure from existing probabilistic temporal logics that can only reason about a fixed system configuration. From a syntactic viewpoint, we introduce a generalized counterfactual operator that subsumes both interventional and counterfactual probabilities as well as the traditional probabilistic operator found in e.g., PCTL. From a semantics viewpoint, our logic is interpreted over a structural causal model (SCM) translation of the MDP, which gives us a representation amenable to counterfactual reasoning. We provide a proof-of-concept evaluation of our logic on a reach-avoid task in a grid-world model.
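A Neyman-Scott process is a special case of a Cox process in which the latent and observable stochastic processes are both Poisson processes. In this paper, we consider a deep Neyman-Scott process, for which the building components of the network are all Poisson processes. We develop an efficient posterior sampling scheme via Markov chain Monte Carlo and use it for likelihood-based inference. Our method opens up room for inference in sophisticated hierarchical point processes. We show in experiments that more hidden Poisson processes bring better performance for likelihood fitting and event type prediction. We also compare our method with state-of-the-art models on temporal real-world datasets and demonstrate competitive abilities for both data fitting and prediction while using fewer parameters.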