在这项工作中,我们对基本思想和新颖的发展进行了综述的综述,这是基于最小的假设的一种无创新的,无分配的,非参数预测的方法 - 能够以非常简单的方式预测集屈服在有限样本案例中,在统计意义上也有效。论文中提供的深入讨论涵盖了共形预测的理论基础,然后继续列出原始想法的更高级的发展和改编。
translated by 谷歌翻译
Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability ǫ, together with a method that makes a prediction ŷ of a label y, it produces a set of labels, typically containing ŷ, that also contains y with probability 1 − ǫ. Conformal prediction can be applied to any method for producing ŷ: a nearest-neighbor method, a support-vector machine, ridge regression, etc.Conformal prediction is designed for an on-line setting in which labels are predicted successively, each one being revealed before the next is predicted. The most novel and valuable feature of conformal prediction is that if the successive examples are sampled independently from the same distribution, then the successive predictions will be right 1 − ǫ of the time, even though they are based on an accumulating dataset rather than on independent datasets.In addition to the model under which successive examples are sampled independently, other on-line compression models can also use conformal prediction. The widely used Gaussian linear model is one of these.This tutorial presents a self-contained account of the theory of conformal prediction and works through several numerical examples. A more comprehensive treatment of the topic is provided in Algorithmic Learning in a Random World, by Vladimir Vovk, Alex Gammerman, and Glenn Shafer (Springer, 2005).
translated by 谷歌翻译
We develop a general framework for distribution-free predictive inference in regression, using conformal inference. The proposed methodology allows for the construction of a prediction band for the response variable using any estimator of the regression function. The resulting prediction band preserves the consistency properties of the original estimator under standard assumptions, while guaranteeing finite-sample marginal coverage even when these assumptions do not hold. We analyze and compare, both empirically and theoretically, the two major variants of our conformal framework: full conformal inference and split conformal inference, along with a related jackknife method. These methods offer different tradeoffs between statistical accuracy (length of resulting prediction intervals) and computational efficiency. As extensions, we develop a method for constructing valid in-sample prediction intervals called rank-one-out conformal inference, which has essentially the same computational efficiency as split conformal inference. We also describe an extension of our procedures for producing prediction bands with locally varying length, in order to adapt to heteroskedascity in the data. Finally, we propose a model-free notion of variable importance, called leave-one-covariate-out or LOCO inference. Accompanying this paper is an R package conformalInference that implements all of the proposals we have introduced. In the spirit of reproducibility, all of our empirical results can also be easily (re)generated using this package.
translated by 谷歌翻译
现在通常用于高风险设置,如医疗诊断,如医疗诊断,那么需要不确定量化,以避免后续模型失败。无分发的不确定性量化(无分布UQ)是用户友好的范式,用于为这种预测创建统计上严格的置信区间/集合。批判性地,间隔/集合有效而不进行分布假设或模型假设,即使具有最多许多DataPoints也具有显式保证。此外,它们适应输入的难度;当输入示例很困难时,不确定性间隔/集很大,信号传达模型可能是错误的。在没有多大的工作和没有再培训的情况下,可以在任何潜在的算法(例如神经网络)上使用无分​​发方法,以产生置信度集,以便包含用户指定概率,例如90%。实际上,这些方法易于理解和一般,应用于计算机视觉,自然语言处理,深度加强学习等领域出现的许多现代预测问题。这种实践介绍是针对对无需统计学家的免费UQ的实际实施感兴趣的读者。我们通过实际的理论和无分发UQ的应用领导读者,从保形预测开始,并使无关的任何风险的分布控制,如虚假发现率,假阳性分布检测,等等。我们将包括Python中的许多解释性插图,示例和代码样本,具有Pytorch语法。目标是提供读者对无分配UQ的工作理解,使它们能够将置信间隔放在算法上,其中包含一个自包含的文档。
translated by 谷歌翻译
在过去几十年中,已经提出了各种方法,用于估计回归设置中的预测间隔,包括贝叶斯方法,集合方法,直接间隔估计方法和保形预测方法。重要问题是这些方法的校准:生成的预测间隔应该具有预定义的覆盖水平,而不会过于保守。在这项工作中,我们从概念和实验的角度审查上述四类方法。结果来自各个域的基准数据集突出显示从一个数据集中的性能的大波动。这些观察可能归因于违反某些类别的某些方法所固有的某些假设。我们说明了如何将共形预测用作提供不具有校准步骤的方法的方法的一般校准程序。
translated by 谷歌翻译
对未来观察的预测是一个重要且具有挑战性的问题。分别量化预测不确定性使用预测区域和预测分布的两种主流方法,后者认为更具信息性,因为它可以执行其他与预测相关的任务。有效性的标准概念(我们在这里称为1型有效性)着重于预测区域的覆盖范围,而与预测分布执行的其他与预测相关的任务相关的有效性概念则缺乏。在这里,我们提出了一个新概念,称为2型有效性,与这些其他预测任务有关。我们建立了2型有效性和相干性能之间的联系,并表明为实现它而需要不精确的概率考虑因素。我们继续表明,可以通过将共形预测输出作为辅音合理性度量的轮廓函数来实现两种类型的预测有效性。我们还基于新的非参数推论模型构建提供了保​​形预测的替代表征,其中辅音的出现是自然的,并证明了其有效性。
translated by 谷歌翻译
共形预测(CP)是一种多功能的非参数框架,用于量化预测问题中的不确定性。在这项工作中,我们通过首次提出可以应用于时间不断发展的表面,将这种方法扩展到在双变量域上定义的时间序列函数的情况。为了获得有意义有效的预测区域,CP必须与准确的预测算法结合使用,因此,我们扩展了希尔伯特空间中自回旋过程的理论理论,以允许具有双变量域的功能。考虑到该主题的新颖性,我们提出了功能自回旋模型(FAR)的估计技术。实施了仿真研究,以研究不同的点预测因子如何影响所得的预测频段。最后,我们探索了真正数据集中拟议方法的利益和限制,在过去的二十年中,每天都会观察到黑海的海平面异常。
translated by 谷歌翻译
A flexible method is developed to construct a confidence interval for the frequency of a queried object in a very large data set, based on a much smaller sketch of the data. The approach requires no knowledge of the data distribution or of the details of the sketching algorithm; instead, it constructs provably valid frequentist confidence intervals for random queries using a conformal inference approach. After achieving marginal coverage for random queries under the assumption of data exchangeability, the proposed method is extended to provide stronger inferences accounting for possibly heterogeneous frequencies of different random queries, redundant queries, and distribution shifts. While the presented methods are broadly applicable, this paper focuses on use cases involving the count-min sketch algorithm and a non-linear variation thereof, to facilitate comparison to prior work. In particular, the developed methods are compared empirically to frequentist and Bayesian alternatives, through simulations and experiments with data sets of SARS-CoV-2 DNA sequences and classic English literature.
translated by 谷歌翻译
The notion of uncertainty is of major importance in machine learning and constitutes a key element of machine learning methodology. In line with the statistical tradition, uncertainty has long been perceived as almost synonymous with standard probability and probabilistic predictions. Yet, due to the steadily increasing relevance of machine learning for practical applications and related issues such as safety requirements, new problems and challenges have recently been identified by machine learning scholars, and these problems may call for new methodological developments. In particular, this includes the importance of distinguishing between (at least) two different types of uncertainty, often referred to as aleatoric and epistemic. In this paper, we provide an introduction to the topic of uncertainty in machine learning as well as an overview of attempts so far at handling uncertainty in general and formalizing this distinction in particular.
translated by 谷歌翻译
预测一组结果 - 而不是独特的结果 - 是统计学习中不确定性定量的有前途的解决方案。尽管有关于构建具有统计保证的预测集的丰富文献,但适应未知的协变量转变(实践中普遍存在的问题)还是一个严重的未解决的挑战。在本文中,我们表明具有有限样本覆盖范围保证的预测集是非信息性的,并提出了一种新型的无灵活分配方法PredSet-1Step,以有效地构建了在未知协方差转移下具有渐近覆盖范围保证的预测集。我们正式表明我们的方法是\ textIt {渐近上可能是近似正确},对大型样本的置信度有很好的覆盖误差。我们说明,在南非队列研究中,它在许多实验和有关HIV风险预测的数据集中实现了名义覆盖范围。我们的理论取决于基于一般渐近线性估计器的WALD置信区间覆盖范围的融合率的新结合。
translated by 谷歌翻译
Conformal prediction constructs a confidence set for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations of responses and features. It has a coverage guarantee at any nominal level without additional assumptions on their distribution. Its computation deplorably requires a refitting procedure for all replacement candidates of the target response. In regression settings, this corresponds to an infinite number of model fits. Apart from relatively simple estimators that can be written as pieces of linear function of the response, efficiently computing such sets is difficult, and is still considered as an open problem. We exploit the fact that, \emph{often}, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding algorithms. We investigate how this approach can overcome many limitations of formerly used strategies; we discuss its complexity and drawbacks.
translated by 谷歌翻译
We present a new distribution-free conformal prediction algorithm for sequential data (e.g., time series), called the \textit{sequential predictive conformal inference} (\texttt{SPCI}). We specifically account for the nature that the time series data are non-exchangeable, and thus many existing conformal prediction algorithms based on temporal residuals are not applicable. The main idea is to exploit the temporal dependence of conformity scores; thus, the past conformity scores contain information about future ones. Then we cast the problem of conformal prediction interval as predicting the quantile of a future residual, given a prediction algorithm. Theoretically, we establish asymptotic valid conditional coverage upon extending consistency analyses in quantile regression. Using simulation and real-data experiments, we demonstrate a significant reduction in interval width of \texttt{SPCI} compared to other existing methods under the desired empirical coverage.
translated by 谷歌翻译
这篇综述的目的是将读者介绍到图表内,以将其应用于化学信息学中的分类问题。图内核是使我们能够推断分子的化学特性的功能,可以帮助您完成诸如寻找适合药物设计的化合物等任务。内核方法的使用只是一种特殊的两种方式量化了图之间的相似性。我们将讨论限制在这种方法上,尽管近年来已经出现了流行的替代方法,但最著名的是图形神经网络。
translated by 谷歌翻译
本文开发了新型的保形方法,以测试是否从与参考集相同的分布中采样了新的观察结果。以创新的方式将感应性和偏置的共形推断融合,所描述的方法可以以原则性的方式基于已知的分布式数据的依赖侧信息重新权重标准p值,并且可以自动利用最强大的优势来自任何一级和二进制分类器的模型。该解决方案可以通过样品分裂或通过新颖的转置交叉验证+方案来实现,该方案与现有的交叉验证方法相比,由于更严格的保证,这也可能在共形推理的其他应用中有用。在研究错误的发现率控制和在具有几个可能的离群值的多个测试框架内的虚假发现率控制和功率之后,提出的解决方案被证明通过模拟以及用于图像识别和表格数据的应用超过了标准的共形P值。
translated by 谷歌翻译
Many scientific and engineering challenges-ranging from personalized medicine to customized marketing recommendations-require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
translated by 谷歌翻译
算法稳定性是一种学习理论的概念,其表示对输入数据的改变的程度(例如,删除单个数据点)可能会影响回归算法的输出。了解算法的稳定性属性通常对许多下游应用程序有用 - 例如,已知稳定性导致所需的概括性属性和预测推理保证。然而,目前在实践中使用的许多现代算法太复杂,无法对其稳定性的理论分析,因此我们只能通过算法在各种数据集上的行为的实证探索来尝试建立这些属性。在这项工作中,我们为这种“黑匣子测试”奠定了一个正式的统计框架,而没有任何关于算法或数据分布的假设,并在任何黑匣子测试识别算法稳定性的能力方面建立基本界限。
translated by 谷歌翻译
交叉验证是一种广泛使用的技术来估计预测误差,但其行为很复杂且不完全理解。理想情况下,人们想认为,交叉验证估计手头模型的预测错误,适合训练数据。我们证明,普通最小二乘拟合的线性模型并非如此。相反,它估计模型的平均预测误差适合于同一人群提取的其他看不见的训练集。我们进一步表明,这种现象发生在大多数流行的预测误差估计中,包括数据拆分,自举和锦葵的CP。接下来,从交叉验证得出的预测误差的标准置信区间可能的覆盖范围远低于所需水平。由于每个数据点都用于训练和测试,因此每个折叠的测量精度之间存在相关性,因此方差的通常估计值太小。我们引入了嵌套的交叉验证方案,以更准确地估计该方差,并从经验上表明,在传统的交叉验证间隔失败的许多示例中,这种修改导致间隔大致正确覆盖。
translated by 谷歌翻译
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero. However, when the model is misspecified, the test may have poor power, for example when $X$ is involved in complex interactions, or lead to many false rejections. In this work we study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$. We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests, to yield both robust error control and high power. The procedure involves using these methods to perform regressions, first to estimate a form of projection of $Y$ on $X$ and $Z$ using one half of the data, and then to estimate the expected conditional covariance between this projection and $Y$ on the remaining half of the data. While the approach is general, we show that a version of our procedure using spline regression achieves what we show is the minimax optimal rate in this nonparametric testing problem. Numerical experiments demonstrate the effectiveness of our approach both in terms of maintaining Type I error control, and power, compared to several existing approaches.
translated by 谷歌翻译
机器学习方法越来越广泛地用于医疗保健,运输和金融等高危环境中。在这些环境中,重要的是,模型要产生校准的不确定性以反映其自信并避免失败。在本文中,我们调查了有关深度学习的不确定性定量(UQ)的最新著作,特别是针对其数学属性和广泛适用性的无分配保形方法。我们将涵盖共形方法的理论保证,引入在时空数据的背景下提高UQ的校准和效率的技术,并讨论UQ在安全决策中的作用。
translated by 谷歌翻译
预测组合在预测社区中蓬勃发展,近年来,已经成为预测研究和活动主流的一部分。现在,由单个(目标)系列产生的多个预测组合通过整合来自不同来源收集的信息,从而提高准确性,从而减轻了识别单个“最佳”预测的风险。组合方案已从没有估计的简单组合方法演变为涉及时间变化的权重,非线性组合,组件之间的相关性和交叉学习的复杂方法。它们包括结合点预测和结合概率预测。本文提供了有关预测组合的广泛文献的最新评论,并参考可用的开源软件实施。我们讨论了各种方法的潜在和局限性,并突出了这些思想如何随着时间的推移而发展。还调查了有关预测组合实用性的一些重要问题。最后,我们以当前的研究差距和未来研究的潜在见解得出结论。
translated by 谷歌翻译