Smart Paper Notes

CODA: Calibrated Optimal Decision Making with Multiple Data Sources and Limited Outcome

Hengrui Cai , Wenbin Lu , Rui Song

Categories: Machine Learning (Statistics)

2021-04-21

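We consider the optimal decision-making problem in a primary sample of interest with multiple auxiliary sources available. The outcome of interest is limited in the sense that it is only observed in the primary sample. In reality, such multiple data sources may belong to heterogeneous studies and thus cannot be combined directly. This paper proposes a new framework to handle heterogeneous studies and address the limited outcome simultaneously through a novel calibrated optimal decision making (CODA) method, which leverages the common intermediate outcomes in multiple data sources. Specifically, CODA allows the baseline covariates across different samples to have either homogeneous or heterogeneous distributions. Under a mild and testable assumption that the conditional means of the intermediate outcomes in different samples are equal given the baseline covariates and the treatment information, we show that the proposed CODA estimator of the conditional mean outcome is asymptotically normal and more efficient than using the primary sample alone. In addition, owing to rate double robustness, the variance of the CODA estimator can be easily obtained using a simple plug-in method. Extensive experiments on simulated datasets demonstrate the empirical validity and improved efficiency of CODA, followed by a real application in which the MIMIC-III dataset is the primary sample and auxiliary data come from eICU.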

Jump Interval-Learning for Individualized Decision Making

Hengrui Cai , Chengchun Shi , Rui Song , Wenbin Lu

Categories: Machine Learning | Machine Learning (Statistics)

2021-11-17

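An individualized decision rule (IDR) is a decision function that assigns each individual a given treatment based on his/her observed characteristics. Most existing works in the literature consider settings with binary or finitely many treatment options. In this paper, we focus on the continuous treatment setting and propose jump interval-learning to develop an individualized interval-valued decision rule (I2DR) that maximizes the expected outcome. Unlike IDRs that recommend a single treatment, the proposed I2DR yields an interval of treatment options for each individual, making it more flexible to implement in practice. To derive an optimal I2DR, our jump interval-learning method estimates the conditional mean of the outcome given the treatment and the covariates via jump penalized regression, and derives the corresponding optimal I2DR based on the estimated outcome regression function. The regressor is allowed to be either linear, for clear interpretation, or a deep neural network, to model complex treatment-covariate interactions. To implement jump interval-learning, we develop a search algorithm based on dynamic programming that efficiently computes the outcome regression function. Statistical properties of the resulting I2DR are established when the outcome regression function is either a piecewise or a continuous function over the treatment space. We further develop a procedure to infer the mean outcome under the (estimated) optimal policy. Extensive simulations and a real data application to a warfarin study are conducted to demonstrate the empirical validity of the proposed I2DR.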
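
The dynamic-programming search mentioned in the abstract can be illustrated with a much simpler problem: fitting a piecewise-constant function to outcomes ordered by treatment level, with a fixed penalty per segment. This is only a sketch under stated assumptions, not the authors' algorithm: it ignores covariates, the function name `jump_penalized_fit` and the penalty `gamma` are hypothetical, and the paper fits a regression on covariates within each interval rather than a constant.

```python
def jump_penalized_fit(y, gamma):
    """Piecewise-constant fit to y (already ordered by treatment level).

    Minimizes (sum of squared errors) + gamma * (number of segments)
    by dynamic programming. Returns half-open index ranges (start, end).
    """
    n = len(y)
    # Prefix sums make the squared error of any segment an O(1) lookup.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for k, v in enumerate(y):
        s1[k + 1] = s1[k] + v
        s2[k + 1] = s2[k] + v * v

    def sse(i, j):
        # Squared error of y[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] = minimal penalized cost of fitting y[:j];
    # prev[j] = start index of the last segment in that optimal fit.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + sse(i, j) + gamma
            if cand < best[j]:
                best[j] = cand
                prev[j] = i

    # Walk back through prev to recover the segments.
    segments = []
    j = n
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    return segments[::-1]
```

With `y = [1.0] * 5 + [5.0] * 5` and `gamma = 1.0` the fit splits at the jump and returns `[(0, 5), (5, 10)]`; a very large `gamma` collapses everything into a single segment. The double loop costs O(n^2), which is acceptable for a sketch.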

CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search

Hengrui Cai , Wenbin Lu , Rachel Marceau West , Devan V. Mehrotra , Lingkang Huang

Categories: Machine Learning (Statistics) | Machine Learning

2021-10-11

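Personalized medicine, a paradigm of medicine tailored to a patient's characteristics, is an increasingly attractive field in health care. An important goal of personalized medicine is to identify, based on baseline covariates, a subgroup of patients who benefit more from the targeted treatment than from other comparative treatments. Most current subgroup identification methods focus only on obtaining a subgroup with an enhanced treatment effect, without paying attention to subgroup size. Yet a clinically meaningful subgroup learning approach should identify the maximum number of patients who can benefit from the better treatment. In this paper, we present an optimal subgroup selection rule (SSR) that maximizes the number of selected patients while achieving a pre-specified, clinically meaningful mean outcome, such as the average treatment effect. We derive two equivalent theoretical forms of the optimal SSR based on the contrast function that describes the treatment-covariate interaction in the outcome. We further propose a constrained policy tree search algorithm (CAPITAL) to find the optimal SSR within the interpretable decision tree class. The proposed method is flexible enough to handle multiple constraints that penalize the inclusion of patients with negative treatment effects, and to address time-to-event data using the restricted mean survival time as the clinically interesting mean outcome. Extensive simulations, comparison studies, and real data applications are conducted to demonstrate the validity and utility of our method.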
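
The objective described in the abstract (select as many patients as possible while keeping the subgroup's average effect above a pre-specified level) can be sketched without any tree structure. The code below is an illustrative assumption, not the CAPITAL algorithm: it presumes per-patient contrast values have already been estimated, and `select_largest_subgroup` and `delta` are hypothetical names.

```python
def select_largest_subgroup(contrast, delta):
    """Largest set of patients whose mean estimated contrast is >= delta.

    Patients are added in decreasing order of contrast. The running mean of
    a decreasing sequence never increases, so the first time it drops below
    delta no larger subgroup can satisfy the constraint.
    """
    order = sorted(range(len(contrast)), key=lambda i: contrast[i], reverse=True)
    selected, running_sum = [], 0.0
    for rank, i in enumerate(order, start=1):
        running_sum += contrast[i]
        if running_sum / rank < delta:
            break
        selected.append(i)
    return sorted(selected)
```

For `contrast = [2.0, -1.0, 0.5, 3.0, -4.0]` and `delta = 1.0` this returns `[0, 1, 2, 3]`. Patient 1 has a negative estimated effect but is still selected because the subgroup mean stays above the threshold, which is the situation the additional penalty constraints in the abstract are meant to address.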

Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning

Ye Shen , Hengrui Cai , Rui Song

Categories: Machine Learning (Statistics) | Machine Learning

2021-10-29

Evaluating the performance of an ongoing policy plays a vital role in many areas such as medicine and economics, to provide crucial instruction on the early-stop of the online experiment and timely feedback from the environment. Policy evaluation in online learning thus attracts increasing attention by inferring the mean outcome of the optimal policy (i.e., the value) in real-time. Yet, such a problem is particularly challenging due to the dependent data generated in the online environment, the unknown optimal policy, and the complex exploration and exploitation trade-off in the adaptive experiment. In this paper, we aim to overcome these difficulties in policy evaluation for online learning. We explicitly derive the probability of exploration that quantifies the probability of exploring the non-optimal actions under commonly used bandit algorithms. We use this probability to conduct valid inference on the online conditional mean estimator under each action and develop the doubly robust interval estimation (DREAM) method to infer the value under the estimated optimal policy in online learning. The proposed value estimator provides double protection on the consistency and is asymptotically normal with a Wald-type confidence interval provided. Extensive simulations and real data applications are conducted to demonstrate the empirical validity of the proposed DREAM method.

Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

Yang Xu , Chengchun Shi , Shikai Luo , Lan Wang , Rui Song

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-29

Off-Policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy. It is critical in a number of sequential decision making problems ranging from healthcare to technology industries. Most of the work in existing literature is focused on evaluating the mean outcome of a given policy, and ignores the variability of the outcome. However, in a variety of applications, criteria other than the mean may be more sensible. For example, when the reward distribution is skewed and asymmetric, quantile-based metrics are often preferred for their robustness. In this paper, we propose a doubly-robust inference procedure for quantile OPE in sequential decision making and study its asymptotic properties. In particular, we propose utilizing state-of-the-art deep conditional generative learning methods to handle parameter-dependent nuisance function estimation. We demonstrate the advantages of this proposed estimator through both simulations and a real-world dataset from a short-video platform. In particular, we find that our proposed estimator outperforms classical OPE estimators for the mean in settings with heavy-tailed reward distributions.

Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

Chengchun Shi , Xiaoyu Wang , Shikai Luo , Hongtu Zhu , Jieping Ye , Rui Song

Categories: Machine Learning | Machine Learning (Statistics)

2020-02-05

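A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where there is only one unit that receives a sequence of treatments over time. In these experiments, the treatment at a given time affects the current outcome as well as future outcomes. The aim of this paper is to introduce a reinforcement learning framework for carrying out A/B testing in these experiments while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technology company to illustrate its advantage over current practice. A Python implementation of our test is available at https://github.com/callmespring/causalrl.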

Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons

Chengchun Shi , Shikai Luo , Yuan Le , Hongtu Zhu , Rui Song

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-26

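We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications. Most existing policy optimization algorithms in the computer science literature are developed in online settings where data are easy to collect or simulate. Their generalization to mobile health applications with a pre-collected offline dataset remains unknown. The aim of this paper is to develop a novel advantage learning framework in order to use pre-collected data efficiently for policy optimization. The proposed method takes an optimal Q-estimator computed by any existing state-of-the-art RL algorithm as input, and outputs a new policy whose value converges at a faster rate than that of the policy derived from the initial Q-estimator. Extensive numerical experiments are conducted to support our theoretical findings. A Python implementation of our proposed method is available at https://github.com/leyuanheart/seal.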

Distribution-free Prediction Sets Adaptive to Unknown Covariate Shift

Hongxiang Qiu , Edgar Dobriban , Eric Tchetgen Tchetgen

Categories: Machine Learning (Statistics)

2022-03-11

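Predicting a set of outcomes, rather than a single outcome, is a promising solution to uncertainty quantification in statistical learning. Despite a rich literature on constructing prediction sets with statistical guarantees, adapting to unknown covariate shift (a prevalent issue in practice) remains a serious unsolved challenge. In this paper, we show that prediction sets with a finite-sample coverage guarantee are uninformative, and we propose a novel flexible distribution-free method, PredSet-1Step, to efficiently construct prediction sets with an asymptotic coverage guarantee under unknown covariate shift. We formally show that our method is asymptotically probably approximately correct, having well-calibrated coverage error with high confidence for large samples. We illustrate that it achieves nominal coverage in a number of experiments and in a data set concerning HIV risk prediction in a South African cohort study. Our theory hinges on a new bound for the convergence rate of the coverage of Wald confidence intervals based on general asymptotically linear estimators.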

Algorithm is Experiment: Machine Learning, Market Design, and Policy Eligibility Rules

Yusuke Narita , Kohei Yata

Categories: Machine Learning | Machine Learning (Statistics)

2021-04-26

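Algorithms produce a growing share of decisions and recommendations in both policy and business. Such algorithmic decisions are natural experiments (conditionally quasi-randomly assigned instruments), since the algorithms make decisions based only on observable input variables. We use this observation to develop treatment-effect estimators for a class of stochastic and deterministic decision-making algorithms. Our estimators are shown to be consistent and asymptotically normal for well-defined causal effects. A key special case of our estimators is a multidimensional regression discontinuity design. We apply our estimators to evaluate the effect of the Coronavirus Aid, Relief, and Economic Security (CARES) Act, under which billions of dollars of relief funding are allocated to hospitals via an algorithmic rule. Our estimates suggest that the relief funding has little effect on COVID-19-related hospital activity levels. Naive OLS and IV estimates exhibit substantial selection bias.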

On the role of surrogates in the efficient estimation of treatment effects with limited outcome data

Nathan Kallus , Xiaojie Mao

Categories: Machine Learning (Statistics) | Machine Learning

2020-03-27

In many investigations, the primary outcome of interest is difficult or expensive to collect. Examples include long-term health effects of medical interventions, measurements requiring expensive testing or follow-up, and outcomes only measurable on small panels as in marketing. This reduces effective sample sizes for estimating the average treatment effect (ATE). However, there is often an abundance of observations on surrogate outcomes not of primary interest, such as short-term health effects or online-ad click-through. We study the role of such surrogate observations in the efficient estimation of treatment effects. To quantify their value, we derive the semiparametric efficiency bounds on ATE estimation with and without the presence of surrogates and several intermediary settings. The difference between these characterizes the efficiency gains from optimally leveraging surrogates. We study two regimes: when the number of surrogate observations is comparable to primary-outcome observations and when the former dominates the latter. We take an agnostic missing-data approach circumventing strong surrogate conditions previously assumed. To leverage surrogates' efficiency gains, we develop efficient ATE estimation and inference based on flexible machine-learning estimates of nuisance functions appearing in the influence functions we derive. We empirically demonstrate the gains by studying the long-term earnings effect of job training.

Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process

Chengchun Shi , Jin Zhu , Ye Shen , Shikai Luo , Hongtu Zhu , Rui Song

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-22

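This paper is concerned with constructing a confidence interval for a target policy's value offline, based on pre-collected observational data in infinite-horizon settings. Most existing works assume that no unmeasured variables exist that confound the observed actions. This assumption, however, is likely to be violated in real applications such as healthcare and the technology industry. In this paper, we show that, with some auxiliary variables that mediate the effect of actions on the system dynamics, the target policy's value is identifiable in a confounded Markov decision process. Based on this result, we develop an efficient off-policy value estimator that is robust to potential model misspecification and provides rigorous uncertainty quantification. Our method is justified by theoretical results and by simulated and real datasets obtained from ridesharing companies. A Python implementation of the proposed procedure is available at https://github.com/mamba413/cope.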

Evaluating Treatment Prioritization Rules via Rank-Weighted Average Treatment Effects

Steve Yadlowsky , Scott Fleming , Nigam Shah , Emma Brunskill , Stefan Wager

Categories: Machine Learning (Statistics)

2021-11-15

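There are a number of available methods for selecting whom to prioritize for treatment, including ones based on treatment effect estimation, risk scoring, and hand-crafted rules. We propose rank-weighted average treatment effect (RATE) metrics as a simple and general family of metrics for comparing treatment prioritization rules on a level playing field. RATEs are agnostic as to how the prioritization rules were derived, and assess them only on how well they identify the units that benefit most from treatment. We define a family of RATE estimators and prove a central limit theorem that enables asymptotically exact inference in a wide variety of randomized and observational study settings. We provide justification for the use of bootstrapped confidence intervals, as well as a framework for testing hypotheses about heterogeneity in treatment effectiveness that is correlated with the prioritization rule. Our definition of the RATE nests a number of existing metrics, including the Qini coefficient, and our analysis directly yields inference methods for these metrics. We demonstrate our approach in examples drawn from both personalized medicine and marketing. In the medical setting, using data from the SPRINT and ACCORD-BP randomized controlled trials, we find no significant evidence of heterogeneous treatment effects. On the other hand, in a large marketing trial, we find robust evidence of heterogeneity in the treatment effects of some digital advertising campaigns, and we demonstrate how RATEs can be used to compare targeting rules that prioritize by estimated risk with those that prioritize by estimated treatment benefit.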

Orthogonal Series Estimation for the Ratio of Conditional Expectation Functions

Kazuhiko Shinoda , Takahiro Hoshino

Categories: Machine Learning (Statistics)

2022-12-26

In various fields of data science, researchers are often interested in estimating the ratio of conditional expectation functions (CEFR). Specifically in causal inference problems, it is sometimes natural to consider ratio-based treatment effects, such as odds ratios and hazard ratios, and even difference-based treatment effects are identified as CEFR in some empirically relevant settings. This chapter develops the general framework for estimation and inference on CEFR, which allows the use of flexible machine learning for infinite-dimensional nuisance parameters. In the first stage of the framework, the orthogonal signals are constructed using debiased machine learning techniques to mitigate the negative impacts of the regularization bias in the nuisance estimates on the target estimates. The signals are then combined with a novel series estimator tailored for CEFR. We derive the pointwise and uniform asymptotic results for estimation and inference on CEFR, including the validity of the Gaussian bootstrap, and provide low-level sufficient conditions to apply the proposed framework to some specific examples. We demonstrate the finite-sample performance of the series estimator constructed under the proposed framework by numerical simulations. Finally, we apply the proposed method to estimate the causal effect of the 401(k) program on household assets.

Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings

Hengrui Cai , Chengchun Shi , Rui Song , Wenbin Lu

Categories: Machine Learning (Statistics) | Machine Learning

2020-10-29

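We consider off-policy evaluation (OPE) in continuous treatment settings, such as personalized dose-finding. In OPE, one aims to estimate the mean outcome under a new treatment decision rule using historical data generated by a different decision rule. Most existing works on OPE focus on discrete treatment settings. To handle continuous treatments, we develop a novel estimation method for OPE using deep jump learning. The key ingredient of our method is deep discretization, which adaptively discretizes the treatment space by leveraging deep learning and multi-scale change point detection. This allows us to apply existing OPE methods for discrete treatments to continuous treatments. Our method is further justified by theoretical results, simulations, and a real application to warfarin dosing.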

Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding

Zhengling Qi , Rui Miao , Xiaoke Zhang

Categories: Machine Learning | Machine Learning (Statistics)

2021-05-03

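Data-driven individualized decision making has recently received increasing research interest. Most existing methods rely on the assumption of no unmeasured confounding, which unfortunately cannot be ensured in practice, especially in observational studies. Motivated by the recently proposed proximal causal inference framework, we develop several proximal learning approaches for estimating optimal individualized treatment regimes (ITRs) in the presence of unmeasured confounding. In particular, we establish several identification results for different classes of ITRs, exhibiting the trade-off between the risk of making untestable assumptions and the improvement in the value function of decision making. Based on these results, we propose several classification-based approaches for finding a variety of restricted in-class optimal ITRs and develop their theoretical properties. The appealing numerical performance of our proposed methods is demonstrated via an extensive simulation study and one real data application.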

Federated Causal Inference in Heterogeneous Observational Data

Ruoxuan Xiong , Allison Koenecke , Michael Powell , Zhu Shen , Joshua T. Vogelstein , Susan Athey

Categories: Machine Learning

2021-07-25

We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site. Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms. Motivated by these considerations, we develop federated methods to draw inference on the average treatment effects of combined data across sites. Our methods first compute summary statistics locally using propensity scores and then aggregate these statistics across sites to obtain point and variance estimators of average treatment effects. We show that these estimators are consistent and asymptotically normal. To achieve these asymptotic properties, we find that the aggregation schemes need to account for the heterogeneity in treatment assignments and in outcomes across sites. We demonstrate the validity of our federated methods through a comparative study of two large medical claims databases.
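
As a rough illustration of the "compute locally, then aggregate" pattern in the abstract, the sketch below pools site-level treatment effect estimates by inverse-variance weighting. This is an assumption made for illustration, not necessarily the paper's aggregation scheme: it presumes every site estimates the same underlying effect, which is exactly the kind of homogeneity the abstract warns may fail. The function name `inverse_variance_pool` is hypothetical.

```python
def inverse_variance_pool(estimates, variances):
    """Pool per-site point estimates using weights proportional to 1/variance.

    Only the two summary statistics per site are needed, so no
    individual-level data leaves a site. Returns (pooled estimate,
    variance of the pooled estimate) under the common-effect assumption.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total
```

Two sites reporting estimates 1.0 and 3.0 with equal variance 1.0 pool to an estimate of 2.0 with variance 0.5. When sites differ in treatment assignment or outcome models, a scheme that accounts for that heterogeneity is needed instead.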

Statistical Inference for Maximin Effects: Identifying Stable Associations across Multiple Studies

Zijian Guo

Categories: Machine Learning (Statistics)

2020-11-15

Integrative analysis of data from multiple sources is critical to making generalizable discoveries. Associations that are consistently observed across multiple source populations are more likely to be generalized to target populations with possible distributional shifts. In this paper, we model the heterogeneous multi-source data with multiple high-dimensional regressions and make inferences for the maximin effect (Meinshausen, Bühlmann, AoS, 43(4), 1801-1830). The maximin effect provides a measure of stable associations across multi-source data. A significant maximin effect indicates that a variable has commonly shared effects across multiple source populations, and these shared effects may be generalized to a broader set of target populations. There are challenges associated with inferring maximin effects because its point estimator can have a non-standard limiting distribution. We devise a novel sampling method to construct valid confidence intervals for maximin effects. The proposed confidence interval attains a parametric length. This sampling procedure and the related theoretical analysis are of independent interest for solving other non-standard inference problems. Using genetic data on yeast growth in multiple environments, we demonstrate that the genetic variants with significant maximin effects have generalizable effects under new environments.

Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score

Alejandro Schuler , David Walsh , Diana Hall , Jon Walsh , Charles Fisher

Categories: Machine Learning (Statistics) | Machine Learning

2020-12-17

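Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a compendium of historical data on patients that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects' predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment, and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer's disease clinical trial, and observe meaningful reductions in mean-squared error and in the estimated variance. Lastly, we provide a simplified formula for the asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models for prognostic covariate adjustment.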
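
The two-step procedure in the abstract (train a prognostic model on historical data, then adjust linearly for the predicted outcome in the trial) can be sketched in plain Python. Everything here is a simplified assumption: the prognostic model is a one-covariate least-squares line standing in for any predictive model, and the names `fit_prognostic_model` and `adjusted_effect` are hypothetical.

```python
def centered_cross(u, v):
    """Sum of products of deviations from the mean (n times the covariance)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_prognostic_model(x_hist, y_hist):
    """Step 1: fit a predictor of the outcome on historical data only."""
    slope = centered_cross(x_hist, y_hist) / centered_cross(x_hist, x_hist)
    intercept = sum(y_hist) / len(y_hist) - slope * sum(x_hist) / len(x_hist)
    return lambda x: intercept + slope * x


def adjusted_effect(y, treat, score):
    """Step 2: OLS coefficient on `treat` when regressing y on (1, treat, score).

    Closed-form solution of the 2x2 normal equations after centering.
    """
    s_tt = centered_cross(treat, treat)
    s_ss = centered_cross(score, score)
    s_ts = centered_cross(treat, score)
    s_ty = centered_cross(treat, y)
    s_sy = centered_cross(score, y)
    return (s_ss * s_ty - s_ts * s_sy) / (s_tt * s_ss - s_ts ** 2)


# Usage with made-up numbers: the model is trained on historical data,
# then only its predictions (prognostic scores) enter the trial analysis.
model = fit_prognostic_model([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
x_trial = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
treat = [0, 1, 0, 1, 0, 1]
y_trial = [1.0 + 2.0 * x + 1.5 * t for x, t in zip(x_trial, treat)]
scores = [model(x) for x in x_trial]
effect = adjusted_effect(y_trial, treat, scores)
```

Because the trial outcomes in this toy example are noiseless, `effect` recovers the simulated treatment effect of 1.5 up to floating-point error. With real data the benefit appears as reduced variance rather than an exact fit.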

A General Framework for Treatment Effect Estimation in Semi-Supervised and High Dimensional Settings

Abhishek Chakrabortty , Guorong Dai , Eric Tchetgen Tchetgen

Categories: Machine Learning (Statistics)

2022-01-03

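In this article, we aim to provide a general and complete understanding of semi-supervised (SS) causal inference for treatment effects. Specifically, we consider two such estimands, (a) the average treatment effect and (b) the quantile treatment effect, as prototype cases in an SS setting characterized by two available data sets: (i) a labeled data set of size $n$, providing observations for a response and a set of high-dimensional covariates, as well as a binary treatment indicator; and (ii) an unlabeled data set of size $N$, much larger than $n$, in which the response is not observed. Using these two data sets, we develop a family of SS estimators that are (1) more robust and (2) more efficient than their supervised counterparts based on the labeled data set only. Beyond the "standard" double robustness results (in terms of consistency) that can also be achieved by supervised methods, we further establish root-n consistency and asymptotic normality of our SS estimators whenever the propensity score model is correctly specified, without requiring specific forms of the nuisance functions involved. This improved robustness arises from the use of massive unlabeled data, so it is generally not attainable in a purely supervised setting. In addition, our estimators are shown to be semiparametrically efficient as long as all the nuisance functions are correctly specified. Moreover, as an illustration of the nuisance estimators, we consider inverse-probability-weighting type kernel smoothing estimators involving unknown covariate transformation mechanisms, and establish novel results on their uniform convergence rates in high-dimensional scenarios, which should be of independent interest. Numerical results on both simulated and real data validate the advantage of our methods over their supervised counterparts with respect to both robustness and efficiency.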

Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo , Dominic Coey , Mikael Konutgan , Wenting Li , Chris Schoener , Matt Goldman

Categories: Machine Learning (Statistics) | Machine Learning

2021-06-14

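We consider the problem of variance reduction in randomized controlled trials through the use of covariates that are correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if the predictions are highly correlated with the outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure that adjusts only for pre-experiment values of the outcome.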
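
The cross-fitting step mentioned in the abstract can be sketched as follows: each unit's prediction comes from a model that was fitted without that unit's fold, so the predictions used for adjustment are never overfitted to the unit's own outcome. This is a minimal illustration, not the MLRATE implementation. The names `cross_fit_predictions` and `fit_mean` are hypothetical, the interleaved fold assignment assumes the rows are already in random order, and the learner is a toy stand-in for a machine learning model.

```python
def cross_fit_predictions(x, y, fit, n_folds=2):
    """Out-of-fold predictions.

    Unit i belongs to fold i % n_folds and is predicted by a model that
    `fit` trained on all the other folds. `fit(x_train, y_train)` must
    return a callable mapping a covariate value to a prediction.
    """
    n = len(y)
    preds = [None] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in range(k, n, n_folds):
            preds[i] = model(x[i])
    return preds


def fit_mean(x_train, y_train):
    """Toy learner: ignores covariates and predicts the training mean."""
    mean_y = sum(y_train) / len(y_train)
    return lambda x: mean_y
```

With outcomes `[1.0, 2.0, 3.0, 4.0]` and two folds, `cross_fit_predictions` with `fit_mean` returns `[3.0, 2.0, 3.0, 2.0]`: units 0 and 2 are predicted from units 1 and 3, and vice versa. The resulting predictions would then be used as a covariate in a regression adjustment of the treatment effect.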
