Smart Paper Notes

Inference for BART with Multinomial Outcomes

Yizhen Xu , Joseph W. Hogan , Michael J. Daniels , Rami Kantor , Ann Mwangi

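Categories: Machine Learning | (Statistics) Machine Learning

2021-01-18

The multinomial probit Bayesian additive regression trees (MPBART) framework was proposed by Kindo et al. (KD); it approximates the latent utilities in the multinomial probit (MNP) model with BART (Chipman et al., 2010). Compared with multinomial logistic models, MNP does not assume independent alternatives, and the correlation structure among alternatives can be specified through multivariate Gaussian distributed latent utilities. We introduce two new algorithms for fitting MPBART and show that the theoretical mixing rates of our proposals are equal or superior to those of the existing algorithm in KD. Through simulations, we explore the robustness of the methods to the choice of reference level, imbalance in outcome frequencies, and the specification of prior hyperparameters for the utility error term. The work is motivated by the application of generating posterior predictive distributions for engagement in care among HIV-positive patients, based on electronic health records (EHRs) from the Academic Model Providing Access to Healthcare (AMPATH) in Kenya. In both the application and the simulations, we observe better performance with our proposals than with KD in terms of MCMC convergence rate and posterior predictive accuracy.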

translated by Google Translate

Conjugate priors for count and rounded data regression

Daniel R. Kowal

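Categories: (Statistics) Machine Learning

2021-10-23

Discrete data are abundant and often arise as counts or rounded data. Yet even for linear regression models, conjugate priors and closed-form posteriors are typically unavailable, which necessitates approximations such as MCMC for posterior inference. For a broad class of count and rounded data regression models, we introduce conjugate priors that enable closed-form posterior inference. Key posterior and predictive functionals are computable via direct Monte Carlo simulation. Crucially, the predictive distributions are discrete, matching the support of the data, and can be evaluated or simulated jointly across multiple covariate values. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and model and variable selection. Multiple simulation studies demonstrate significant advantages in computing, predictive modeling, and selection relative to existing alternatives.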

Flexible Bayesian Nonlinear Model Configuration

Aliaksandr Hubin , Geir Storvik , Florian Frommlet

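Categories: (Statistics) Machine Learning | Machine Learning

2020-03-05

Regression models are used in a wide range of applications, providing a powerful scientific tool for researchers from different fields. Linear, or simple parametric, models are often not sufficient to describe complex relationships between input variables and a response. Such relationships can be better described by flexible approaches such as neural networks, but this results in less interpretable models and potential overfitting. Alternatively, specific parametric nonlinear functions can be used, but specifying such functions is generally complicated. In this paper, we introduce a flexible approach for constructing highly flexible nonlinear parametric regression models. Nonlinear features are generated hierarchically, similarly to deep learning, but with additional flexibility in the types of features that can be considered. This flexibility, combined with variable selection, allows us to find a small set of important features and thereby more interpretable models. Within the space of possible functions, a Bayesian approach is considered, introducing priors for functions based on their complexity. A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference and to estimate posterior probabilities for model averaging. In various applications, we illustrate how our approach can be used to obtain meaningful nonlinear models. Additionally, we compare its predictive performance with that of several machine learning algorithms.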

Faster MCMC for Gaussian Latent Position Network Models

Neil A. Spencer , Brian Junker , Tracy M. Sweet

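Categories: (Statistics) Machine Learning

2020-06-13

Latent position network models are a versatile tool in network science; applications include clustering entities, controlling for causal confounders, and defining priors over unobserved graphs. Estimating each node's latent position is typically framed as a Bayesian inference problem, with Metropolis-within-Gibbs being the most popular tool for approximating the posterior distribution. However, Metropolis-within-Gibbs is well known to be inefficient for large networks: the acceptance ratios are expensive to compute, and the resulting posterior draws are highly correlated. In this article, we propose an alternative Markov chain Monte Carlo strategy, defined using a combination of split Hamiltonian Monte Carlo and Firefly Monte Carlo, that leverages the functional form of the posterior distribution for more efficient posterior computation. We demonstrate that these strategies outperform Metropolis-within-Gibbs and other algorithms on synthetic networks, as well as on real information-sharing networks of teachers and staff in a school district.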

Shrinkage Bayesian Causal Forests for Heterogeneous Treatment Effects Estimation

Alberto Caron , Gianluca Baio , Ioanna Manolopoulou

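Categories: Machine Learning | (Statistics) Machine Learning

2021-02-12

This paper develops a sparsity-inducing version of Bayesian Causal Forests, a recently proposed nonparametric causal regression model that employs Bayesian additive regression trees and is specifically designed to estimate heterogeneous treatment effects from observational data. The sparsity-inducing component we introduce is motivated by empirical studies in which not all the available covariates are relevant, leading to different degrees of sparsity underlying the surfaces of interest in the estimation of individual treatment effects. The extended version presented in this work, which we name Shrinkage Bayesian Causal Forest, is equipped with a pair of priors that allow the model to adjust the weight of each covariate through the corresponding number of splits in the tree ensemble. These priors improve the model's adaptability to sparse data-generating processes and allow fully Bayesian feature shrinkage within a framework for treatment effect estimation, thereby uncovering the moderating factors that drive heterogeneity. In addition, the method allows prior knowledge about the relevant confounding covariates, and about the relative magnitude of their impact on the outcome, to be incorporated in the model. We illustrate the performance of our method in simulated studies, in comparison with Bayesian Causal Forest and other state-of-the-art models, to demonstrate how it scales with an increasing number of covariates and how it handles strongly confounded scenarios. Finally, we also provide an example application using real-world data.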

An Introduction to Modern Statistical Learning

Joseph G. Makin

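Categories: Machine Learning

2022-07-20

This work in progress aims to provide a unified introduction to statistical learning, building up slowly from classical models such as the GMM and HMM to modern neural networks such as the VAE and diffusion models. There are today many internet resources that explain this or that new machine-learning algorithm in isolation, but they do not (and cannot, in so brief a space) connect these algorithms with each other, or with the classical literature on statistical models out of which the modern algorithms emerged. Also conspicuously lacking is a single notational system; its absence does not trouble those already familiar with the material (such as the authors of these posts), but it raises a significant barrier to entry for the novice. Likewise, I have aimed to assimilate the various models, wherever possible, to a single framework for inference and learning, showing how (and why) to change one model into another with minimal alteration (some of these changes are novel, others come from the literature). Some background is of course necessary. I have assumed the reader is familiar with basic multivariable calculus, probability and statistics, and linear algebra. The goal of this book is certainly not completeness, but rather a more or less straight-line path from the basics to the extremely powerful new models of the last decade. The aim, then, is to complement, not replace, comprehensive texts such as Bishop's Pattern Recognition and Machine Learning, which is now 15 years old.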

GP-BART: a novel Bayesian additive regression trees approach using Gaussian processes

Mateus Maia , Keefe Murphy , Andrew C. Parnell

Categories: Machine Learning | (Statistics) Machine Learning

2022-04-05

The Bayesian additive regression trees (BART) model is an ensemble method extensively and successfully used in regression tasks due to its consistently strong predictive performance and its ability to quantify uncertainty. BART combines "weak" tree models through a set of shrinkage priors, whereby each tree explains a small portion of the variability in the data. However, the lack of smoothness and the absence of a covariance structure over the observations in standard BART can yield poor performance in cases where such assumptions would be necessary. We propose Gaussian processes Bayesian additive regression trees (GP-BART) as an extension of BART which assumes Gaussian process (GP) priors for the predictions of each terminal node among all trees. We illustrate our model on simulated and real data and compare its performance to traditional modelling approaches, outperforming them in many scenarios. An implementation of our method is available in the R package rGPBART at: https://github.com/MateusMaiaDS/gpbart

Bayesian Probabilistic Numerical Integration with Tree-Based Models

Harrison Zhu , Xing Liu , Ruya Kang , Zhichao Shen , Seth Flaxman , François-Xavier Briol

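Categories: Machine Learning | (Statistics) Machine Learning

2020-06-09

Bayesian quadrature (BQ) is a method for solving numerical integration problems in a Bayesian manner, which allows users to quantify their uncertainty about the solution. The standard approach to BQ is based on a Gaussian process (GP) approximation of the integrand. As a result, BQ is inherently limited to cases where GP approximations can be carried out efficiently, which often rules out very high-dimensional or non-smooth target functions. This paper proposes to tackle this issue with a new Bayesian numerical integration algorithm based on Bayesian additive regression trees (BART) priors, which we call BART-Int. BART priors are easy to tune and well suited to discontinuous functions. We demonstrate that they also lend themselves naturally to a sequential design setting, and that explicit convergence rates can be obtained in a variety of settings. The advantages and disadvantages of this new methodology are highlighted on a set of benchmark tests including the Genz functions, and on a Bayesian survey design problem.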

Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors

Yuling Yao , Aki Vehtari , Andrew Gelman

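Categories: (Statistics) Machine Learning

2020-06-22

When working with multimodal Bayesian posterior distributions, Markov chain Monte Carlo (MCMC) algorithms have difficulty moving between modes, and default variational or mode-based approximate inferences will understate posterior uncertainty. Moreover, even if the most important modes can be found, it is difficult to evaluate their relative weights in the posterior. Here we propose an approach that uses parallel runs of MCMC, variational, or mode-based inference to hit as many modes or separated regions as possible, and then combines these using Bayesian stacking, a scalable method for constructing a weighted average of distributions. The result from stacking samples efficiently from the multimodal posterior distribution, minimizes cross-validation prediction error, and represents the posterior uncertainty better than variational inference, but it is not necessarily equivalent, even asymptotically, to fully Bayesian inference. We present theoretical consistency with an example in which the stacked inference approximates the true data-generating process from a misspecified model and a non-mixing sampler, with predictive performance better than that of full Bayesian inference; multimodality can therefore be considered a blessing rather than a curse under model misspecification. We demonstrate practical implementation in several model families: latent Dirichlet allocation, Gaussian process regression, hierarchical regression, horseshoe variable selection, and neural networks.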

Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data

Jin-Zhu Yu , Hiba Baroud

Categories: Machine Learning

2022-11-28

Hierarchical Bayesian Poisson regression models (HBPRMs) provide a flexible approach to modeling the relationship between predictors and count response variables. Applying HBPRMs to large-scale datasets requires efficient inference algorithms because of the high computational cost of inferring many model parameters by random sampling. Although Markov chain Monte Carlo (MCMC) algorithms have been widely used for Bayesian inference, sampling with this class of algorithms is time-consuming for applications with large-scale data and time-sensitive decision-making, partly because many models are non-conjugate. To overcome this limitation, this research develops an approximate Gibbs sampler (AGS) to learn HBPRMs efficiently while maintaining inference accuracy. In the proposed sampler, the data likelihood is approximated with a Gaussian distribution so that the conditional posterior of the coefficients has a closed-form solution. Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS in comparison with the state-of-the-art sampling algorithm, especially for large datasets.

A Two-step Metropolis Hastings Method for Bayesian Empirical Likelihood Computation with Application to Bayesian Model Selection

Sanjay Chaudhuri , Teng Yin

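Categories: (Statistics) Machine Learning

2022-09-02

In recent times, empirical likelihood has been widely applied within the Bayesian framework. Markov chain Monte Carlo (MCMC) methods are frequently employed to sample from the posterior distribution of the parameters of interest. However, the complex, and in particular non-convex, nature of the likelihood support creates enormous obstacles to choosing an appropriate MCMC algorithm. Such difficulties have restricted the use of Bayesian empirical likelihood (BayesEL) based methods in many applications. In this article, we propose a two-step Metropolis-Hastings algorithm to sample from BayesEL posteriors. Our proposal is specified hierarchically: the estimating equations that determine the empirical likelihood are used to propose values for one set of parameters, depending on the proposed values of the remaining parameters. Furthermore, we discuss Bayesian model selection using empirical likelihood, and extend our two-step Metropolis-Hastings algorithm to a reversible jump Markov chain Monte Carlo procedure to sample from the resulting posterior. Finally, several applications of our proposed methods are presented.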

Marginal likelihood computation for model selection and hypothesis testing: an extensive review

Fernando Llorente , Luca Martino , David Delgado , Javier Lopez-Santiago

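Categories: Machine Learning

2020-05-17

This is an up-to-date introduction to, and overview of, marginal likelihood computation for model selection and hypothesis testing. Computing normalizing constants of probability models (or ratios of constants) is a fundamental problem in many applications in statistics, applied mathematics, signal processing, and machine learning. This article provides a comprehensive study of the topic. We highlight the limitations, benefits, connections, and differences among the different techniques. Problems with the use of improper priors, and possible solutions, are also described. Some of the most relevant methodologies are compared through theoretical comparisons and numerical experiments.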
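
As a small illustration of the quantity this review is concerned with, the sketch below evaluates a marginal likelihood in one of the few cases where it has a closed form: a binomial likelihood with a Beta(a, b) prior on the success probability. The model, the function names, and the default hyperparameters are assumptions chosen for illustration; none of them come from the paper, which covers numerical techniques for the much harder non-conjugate case.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_marginal_likelihood(k, n, a=1.0, b=1.0):
    """Log p(k | n) for k successes in n trials, integrating the success
    probability out against a Beta(a, b) prior (Beta-Binomial model)."""
    return log_choose(n, k) + log_beta(a + k, b + n - k) - log_beta(a, b)


def log_bayes_factor(k, n, a=1.0, b=1.0, p0=0.5):
    """Log Bayes factor of the Beta-Binomial model against a point null
    that fixes the success probability at p0 (hypothetical comparison)."""
    log_m0 = log_choose(n, k) + k * math.log(p0) + (n - k) * math.log(1.0 - p0)
    return log_marginal_likelihood(k, n, a, b) - log_m0
```

With a uniform Beta(1, 1) prior, every count k from 0 to n has marginal probability 1 / (n + 1), which gives a quick sanity check on the implementation. A positive log Bayes factor favours the Beta-Binomial model over the fixed-probability null.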

Variational Inference: A Review for Statisticians

David M. Blei , Alp Kucukelbir , Jon D. McAuliffe

Categories:

2016-01-04

One of the core problems of modern statistics is to approximate difficult-to-compute probability densities. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation involving the posterior density. In this paper, we review variational inference (VI), a method from machine learning that approximates probability densities through optimization. VI has been used in many applications and tends to be faster than classical methods, such as Markov chain Monte Carlo sampling. The idea behind VI is to first posit a family of densities and then to find the member of that family which is close to the target. Closeness is measured by Kullback-Leibler divergence. We review the ideas behind mean-field variational inference, discuss the special case of VI applied to exponential family models, present a full example with a Bayesian mixture of Gaussians, and derive a variant that uses stochastic optimization to scale up to massive data. We discuss modern research in VI and highlight important open problems. VI is powerful, but it is not yet well understood. Our hope in writing this paper is to catalyze statistical research on this class of algorithms.
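
The idea of positing a family and picking the member closest to the target in Kullback-Leibler divergence can be seen in a toy case. The sketch below assumes a univariate Gaussian target with known mean and standard deviation and a variational family of Gaussians with a fixed standard deviation, then minimises KL(q || p) over the mean by gradient descent. The function names, learning rate, and step count are illustrative assumptions. In realistic VI the target is known only up to a normalising constant and one optimises the ELBO instead, so this is a simplification rather than the method reviewed in the paper.

```python
import math


def kl_gauss(m_q, s_q, m_p, s_p):
    """KL(q || p) for q = N(m_q, s_q^2) and p = N(m_p, s_p^2)."""
    return (math.log(s_p / s_q)
            + (s_q ** 2 + (m_q - m_p) ** 2) / (2.0 * s_p ** 2)
            - 0.5)


def fit_mean(m_p, s_p, m_init=0.0, lr=0.5, steps=200):
    """Find the mean of q = N(m, s_q^2) that minimises KL(q || p).

    The gradient of KL(q || p) with respect to m is (m - m_p) / s_p^2
    and does not depend on the fixed s_q, so plain gradient descent
    on m is enough for this toy family.
    """
    m = m_init
    for _ in range(steps):
        grad = (m - m_p) / s_p ** 2
        m -= lr * grad
    return m
```

For a target N(1, 2^2), `fit_mean(1.0, 2.0)` moves the variational mean from 0 towards 1, and the divergence of the fitted member is smaller than that of the starting point.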

Distributed Computation for Marginal Likelihood based Model Choice

Alexander Buchholz , Daniel Ahfock , Sylvia Richardson

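Categories: (Statistics) Machine Learning

2019-10-10

We propose a general method for distributed Bayesian model choice using the marginal likelihood, where a data set is split into non-overlapping subsets. These subsets are accessed only locally by individual workers, and no data is shared between workers. We approximate the model evidence for the full data set through Monte Carlo sampling from the posterior on every subset, generating a model evidence per subset. The results are combined using a novel approach that corrects for the splitting using summary statistics of the generated samples. Our divide-and-conquer approach enables Bayesian model choice in the large-data setting, exploiting all available information while limiting communication between workers. We derive theoretical error bounds that quantify the resulting trade-off between computational gain and loss of precision. As our real-world experiments show, the embarrassingly parallel nature of the method yields important speed-ups on massive data sets. In addition, we show how the suggested approach can be extended to model choice within a reversible jump setting that explores multiple feature combinations in a single run.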

Bayesian Variable Selection in a Million Dimensions

Martin Jankowiak

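Categories: Machine Learning | (Statistics) Machine Learning

2022-08-02

Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been hampered by computational challenges, especially in difficult regimes with a large number of covariates P or non-conjugate likelihoods. To scale to the large-P regime, we introduce an efficient MCMC scheme whose cost per iteration is sublinear in P. In addition, we show how this scheme can be extended to generalized linear models for count data, which are prevalent in biology, ecology, economics, and beyond. In particular, we design efficient algorithms for variable selection in binomial and negative binomial regression, which includes logistic regression as a special case. In experiments, we demonstrate the effectiveness of our methods, including on cancer and maize genomic data.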

Bayesian Simultaneous Factorization and Prediction Using Multi-Omic Data

Sarah Samorodnitsky , Chris H. Wendt , Eric F. Lock

Categories: (Statistics) Machine Learning

2022-11-29

Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by available methods to examine the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, simultaneously predict important disease phenotypes or clinical outcomes, nor accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, as well as the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP .

Sparse Horseshoe Estimation via Expectation-Maximisation

Shu Yu Tew , Daniel F. Schmidt , Enes Makalic

Categories: (Statistics) Machine Learning | Machine Learning

2022-11-07

The horseshoe prior is known to possess many desirable properties for Bayesian estimation of sparse parameter vectors, yet its density function lacks an analytic form. As such, it is challenging to find a closed-form solution for the posterior mode. Conventional horseshoe estimators use the posterior mean to estimate the parameters, but these estimates are not sparse. We propose a novel expectation-maximisation (EM) procedure for computing the MAP estimates of the parameters in the case of the standard linear model. A particular strength of our approach is that the M-step depends only on the form of the prior and is independent of the form of the likelihood. We introduce several simple modifications of this EM procedure that allow for straightforward extension to generalised linear models. In experiments performed on simulated and real data, our approach performs comparably to, or better than, state-of-the-art sparse estimation methods in terms of statistical performance and computational cost.

Estimating Individual Treatment Effects using Non-Parametric Regression Models: a Review

Alberto Caron , Gianluca Baio , Ioanna Manolopoulou

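Categories: Machine Learning | (Statistics) Machine Learning

2020-09-14

Large observational data sets are increasingly available in disciplines such as health, economics, and the social sciences, where researchers are interested in causal questions rather than prediction. In this paper, starting from an empirical study that investigates the effect of participation in school meal programs on health indicators, we examine the problem of estimating heterogeneous treatment effects using non-parametric regression-based methods. First, we introduce the setup and the issues involved in conducting causal inference with observational or not fully randomized data, and how these issues can be tackled with the help of statistical learning tools. Then, we review existing state-of-the-art frameworks that allow individual treatment effects to be estimated via non-parametric regression models, and develop a unifying taxonomy of them. After a brief overview of the problem of model selection, we illustrate the performance of some of the methods on three different simulated studies. We conclude by demonstrating the use of some of the methods in an empirical analysis of the school meal program data.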

The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo

Matthew D. Hoffman , Andrew Gelman

Categories:

2011-11-18

Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metropolis or Gibbs sampling. However, HMC's performance is highly sensitive to two user-specified parameters: a step size ε and a desired number of steps L. In particular, if L is too small then the algorithm exhibits undesirable random walk behavior, while if L is too large the algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an extension to HMC that eliminates the need to set a number of steps L. NUTS uses a recursive algorithm to build a set of likely candidate points that spans a wide swath of the target distribution, stopping automatically when it starts to double back and retrace its steps. Empirically, NUTS performs at least as efficiently as, and sometimes more efficiently than, a well-tuned standard HMC method, without requiring user intervention or costly tuning runs. We also derive a method for adapting the step size parameter on the fly based on primal-dual averaging. NUTS can thus be used with no hand-tuning at all. NUTS is also suitable for applications such as BUGS-style automatic inference engines that require efficient "turnkey" sampling algorithms.
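
The roles of the step size and the number of steps L can be made concrete with a leapfrog integrator, the standard way of simulating Hamiltonian dynamics in HMC. The sketch below assumes a one-dimensional standard normal target and a unit mass. The helper `steps_until_u_turn` is a hypothetical, heavily simplified stand-in for the stopping idea: it walks forward one leapfrog step at a time and stops once the trajectory starts heading back towards its start. It is not the recursive tree-doubling procedure of NUTS and does not on its own yield a valid sampler.

```python
def log_p(theta):
    """Log density of the assumed target, N(0, 1), up to a constant."""
    return -0.5 * theta * theta


def grad_log_p(theta):
    """Gradient of log_p with respect to theta."""
    return -theta


def leapfrog(theta, r, grad, eps, L):
    """Run L leapfrog steps of size eps from position theta, momentum r."""
    for _ in range(L):
        r = r + 0.5 * eps * grad(theta)   # half step for momentum
        theta = theta + eps * r           # full step for position
        r = r + 0.5 * eps * grad(theta)   # second half step for momentum
    return theta, r


def hamiltonian(theta, r):
    """Total energy: potential (-log_p) plus kinetic (unit mass)."""
    return -log_p(theta) + 0.5 * r * r


def steps_until_u_turn(theta0, r0, grad, eps, max_steps=1000):
    """Count leapfrog steps until (theta - theta0) * r turns negative,
    i.e. until further simulation would reduce the distance from the
    starting point. Simplified one-directional illustration only."""
    theta, r = theta0, r0
    for n in range(1, max_steps + 1):
        theta, r = leapfrog(theta, r, grad, eps, 1)
        if (theta - theta0) * r < 0:
            return n
    return max_steps
```

Two properties are worth checking by hand: the energy returned by `hamiltonian` changes only slightly along a trajectory when the step size is small, and running the integrator again with the momentum negated returns to the starting point, which is the reversibility that HMC relies on.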

A flexible empirical Bayes approach to multiple linear regression and connections with penalized regression

Youngseok Kim , Wei Wang , Peter Carbonetto , Matthew Stephens

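Categories: (Statistics) Machine Learning

2022-08-23

We introduce a new empirical Bayes approach for large-scale multiple linear regression. Our approach combines two key ideas: (i) the use of flexible "adaptive shrinkage" priors, which approximate the nonparametric family of scale mixtures of normal distributions by a finite mixture of normal distributions; and (ii) the use of variational approximations to efficiently estimate prior hyperparameters and compute approximate posteriors. Combining these two ideas results in fast and flexible methods, with computational speed comparable to fast penalized regression methods such as the Lasso, and with superior prediction accuracy across a wide range of scenarios. Furthermore, we show that the posterior mean from our method can be interpreted as the solution to a penalized regression problem, with the precise form of the penalty function learned from the data by directly solving an optimization problem (rather than tuned by cross-validation). Our methods are implemented in an R package available at https://github.com/stephenslab/mr.ash.ash.alpha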
