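We propose an adaptive method for constructing Gaussian processes for Bayesian inference with forward models that are expensive to evaluate. Our method relies on a fully Bayesian approach to training the Gaussian process model and exploits the expected improvement idea from Bayesian global optimization. We adaptively construct the training design by maximizing the expected improvement in the fit of the Gaussian process model to the noisy observational data. Numerical experiments on model problems with synthetic data demonstrate the effectiveness of the resulting adaptive designs, compared with fixed non-adaptive designs, in terms of accurate posterior estimation relative to the cost of forward-model inference.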
translated by Google Translate
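The expected improvement idea mentioned in the abstract above can be illustrated with its standard closed form for a Gaussian predictive distribution. This is a minimal sketch of the generic criterion from Bayesian global optimization, not the paper's specific design criterion for the model fit; the candidate names, predictive means and standard deviations, and `best_so_far` are hypothetical values chosen for illustration.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """E[max(f - best, 0)] for f ~ N(mu, sigma^2), maximization convention."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical GP predictive (mean, std) at three candidate design points.
candidates = {"x1": (0.2, 0.05), "x2": (0.0, 0.6), "x3": (0.35, 0.01)}
best_so_far = 0.3  # assumed best value observed so far

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
# The adaptive design adds the candidate with the largest expected improvement.
next_point = max(scores, key=scores.get)
print(next_point, scores)
```

In this toy setting the criterion prefers the uncertain candidate over the one whose mean is only slightly above the incumbent, which is the exploration behaviour an adaptive design relies on.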
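We develop an automated variational method for inference in models with Gaussian process (GP) priors and general likelihoods. The method supports multiple outputs and multiple latent functions and does not require detailed knowledge of the conditional likelihood, only needing its evaluation as a black-box function. Using a mixture of Gaussians as the variational distribution, we show that the evidence lower bound and its gradients can be estimated efficiently using samples from univariate Gaussian distributions. Furthermore, the method is scalable to large datasets, which is achieved by using an augmented prior via the inducing-variable approach that underpins most sparse GP approximations, along with parallel computation and stochastic optimization. We evaluate our method quantitatively and qualitatively on small, medium-scale, and large datasets, showing its competitiveness under different likelihood models and sparsity levels. In large-scale experiments involving prediction of airline delays and classification of handwritten digits, we show that our method is on par with state-of-the-art hard-coded approaches for scalable GP regression and classification.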
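We consider the problem of learning the level set on which a noisy black-box function exceeds a given threshold. To reconstruct the level set efficiently, we investigate Gaussian process (GP) metamodels. Our focus is on strongly stochastic samplers, in particular those with heavy-tailed simulation noise and low signal-to-noise ratio. To guard against noise misspecification, we assess the performance of three variants: (i) GPs with Student-$t$ observations; (ii) Student-$t$ processes (TPs); and (iii) classification GPs that model the sign of the response. As a fourth extension, we study GP surrogates with monotonicity constraints, which are relevant when the level set is known to be connected. In conjunction with these metamodels, we analyze several acquisition functions for guiding sequential experimental design, extending existing stepwise uncertainty reduction criteria to the stochastic contour-finding context. This also motivates our development of (approximate) updating formulas to compute such acquisition functions efficiently. Our schemes are benchmarked using a variety of synthetic experiments in 1 to 6 dimensions. We also consider an application of level set estimation to determining the optimal exercise policy and valuation of Bermudan options in finance.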
We propose a novel sampling framework for inference in probabilistic models: an active learning approach that converges more quickly (in wall-clock time) than Markov chain Monte Carlo (MCMC) benchmarks. The central challenge in probabilistic inference is numerical integration, to average over ensembles of models or unknown (hyper-)parameters (for example to compute the marginal likelihood or a partition function). MCMC has provided approaches to numerical integration that deliver state-of-the-art inference, but can suffer from sample inefficiency and poor convergence diagnostics. Bayesian quadrature techniques offer a model-based solution to such problems, but their uptake has been hindered by prohibitive computation costs. We introduce a warped model for probabilistic integrands (likelihoods) that are known to be non-negative, permitting a cheap active learning scheme to optimally select sample locations. Our algorithm is demonstrated to offer faster convergence (in seconds) relative to simple Monte Carlo and annealed importance sampling on both synthetic and real-world examples.
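As a point of reference for the simple Monte Carlo baseline named in the abstract above, the marginal likelihood can be estimated by averaging the likelihood over draws from the prior. The sketch below is not the paper's warped Bayesian quadrature method; it assumes a toy conjugate model (prior theta ~ N(0, 1), likelihood y | theta ~ N(theta, 1)) chosen only because the exact evidence, N(y; 0, 2), is available for comparison. The function names and the observed value are illustrative.

```python
import math
import random


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def simple_mc_evidence(y: float, n_samples: int, rng: random.Random) -> float:
    """Estimate Z = integral of p(y | theta) p(theta) dtheta by simple Monte Carlo."""
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(0.0, 1.0)            # draw from the assumed prior N(0, 1)
        total += gaussian_pdf(y, theta, 1.0)   # assumed likelihood N(theta, 1)
    return total / n_samples


rng = random.Random(0)   # fixed seed so the run is reproducible
y_obs = 1.0              # hypothetical single observation
z_mc = simple_mc_evidence(y_obs, 20000, rng)
z_exact = gaussian_pdf(y_obs, 0.0, 2.0)  # exact evidence for this conjugate toy model
print(f"simple MC: {z_mc:.4f}, exact: {z_exact:.4f}")
```

The estimator's error shrinks only as the inverse square root of the number of likelihood evaluations, which is the inefficiency that model-based quadrature schemes aim to reduce.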
This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model, which has a non-log-concave likelihood. The challenge with the Student-t model is the analytically intractable inference which is why several approximative methods have been proposed. Expectation propagation (EP) has been found to be a very accurate method in many empirical studies but the convergence of EP is known to be problematic with models containing non-log-concave site functions. In this paper we illustrate the situations where standard EP fails to converge and review different modifications and alternative algorithms for improving the convergence. We demonstrate that convergence problems may occur during the type-II maximum a posteriori (MAP) estimation of the hyperparameters and show that standard EP may not converge in the MAP values with some difficult data sets. We present a robust implementation which relies primarily on parallel EP updates and uses a moment-matching-based double-loop algorithm with adaptively selected step size in difficult cases. The predictive performance of EP is compared with Laplace, variational Bayes, and Markov chain Monte Carlo approximations.
We develop an automated variational method for approximate inference in Gaussian process (GP) models whose posteriors are often intractable. Using a mixture of Gaussians as the variational distribution, we show that (i) the variational objective and its gradients can be approximated efficiently via sampling from univariate Gaussian distributions and (ii) the gradients with respect to the GP hyperparameters can be obtained analytically regardless of the model likelihood. We further propose two instances of the variational distribution whose covariance matrices can be parametrized linearly in the number of observations. These results allow gradient-based optimization to be done efficiently in a black-box manner. Our approach is thoroughly verified on five models using six benchmark datasets, performing as well as the exact or hard-coded implementations while running orders of magnitude faster than the alternative MCMC sampling approaches. Our method can be a valuable tool for practitioners and researchers to investigate new models with minimal effort in deriving model-specific inference algorithms.
Standard Gaussian processes (GPs) model observations' noise as constant throughout input space. This is often too restrictive an assumption, but one that is needed for GP inference to be tractable. In this work we present a non-standard variational approximation that allows accurate inference in heteroscedastic GPs (i.e., under input-dependent noise conditions). Computational cost is roughly twice that of the standard GP, and also scales as O(n^3). Accuracy is verified by comparing with the gold standard MCMC and its effectiveness is illustrated on several synthetic and real datasets of diverse characteristics. An application to volatility forecasting is also considered.
We propose a simple and effective variational inference algorithm based on stochastic optimisation that can be widely applied for Bayesian non-conjugate inference in continuous parameter spaces. This algorithm is based on stochastic approximation and allows for efficient use of gradient information from the model joint density. We demonstrate these properties using illustrative examples as well as in challenging and diverse Bayesian inference problems such as variable selection in logistic regression and fully Bayesian inference over kernel hyperparameters in Gaussian process regression.
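Approximate Bayesian computation (ABC) is a method for Bayesian inference when the likelihood is unavailable but simulating from the model is possible. However, many ABC algorithms require a large number of simulations, which can be costly. To reduce the computational cost, Bayesian optimisation (BO) and surrogate models such as Gaussian processes have been proposed. Bayesian optimisation enables one to decide intelligently where to evaluate the model next, but common BO strategies are not designed for the goal of estimating the posterior distribution. Our paper addresses this gap in the literature. We propose to compute the uncertainty in the ABC posterior density, which is due to a lack of simulations to estimate this quantity accurately, and define a loss function that measures this uncertainty. We then propose to select the next evaluation location to minimise the expected loss. Experiments show that the proposed method often produces the most accurate approximations compared with common BO strategies.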
Numerical integration is a key component of many problems in scientific computing, statistical modelling, and machine learning. Bayesian Quadrature is a model-based method for numerical integration which, relative to standard Monte Carlo methods, offers increased sample efficiency and a more robust estimate of the uncertainty in the estimated integral. We propose a novel Bayesian Quadrature approach for numerical integration when the integrand is non-negative, such as the case of computing the marginal likelihood, predictive distribution, or normalising constant of a probabilistic model. Our approach approximately marginalises the quadrature model's hyperparameters in closed form, and introduces an active learning scheme to optimally select function evaluations, as opposed to using Monte Carlo samples. We demonstrate our method on both a number of synthetic benchmarks and a real scientific problem from astronomy.
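Bayesian optimization using Gaussian processes is a popular approach for optimizing expensive black-box functions. However, because of the a priori stationarity of the covariance of classic Gaussian processes, this method may not be suited to the non-stationary functions involved in the optimization problem. To overcome this issue, a new Bayesian optimization approach is proposed. It is based on deep Gaussian processes as surrogate models instead of classic Gaussian processes. This modeling technique increases the power of representation to capture non-stationarity by simply considering a functional composition of stationary Gaussian processes, providing a multi-layer structure. This paper proposes a new global optimization algorithm that couples deep Gaussian processes with Bayesian optimization. The specificities of this optimization method are discussed and highlighted with academic test cases. The performance of the proposed algorithm is assessed on analytical test cases and an aerospace design optimization problem, and compared with state-of-the-art stationary and non-stationary Bayesian optimization approaches.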
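We consider Bayesian inference when only a limited number of noisy log-likelihood evaluations can be obtained. This occurs, for example, when complex simulator-based statistical models are fitted to data and synthetic likelihood (SL) is used to form noisy log-likelihood estimates from computationally costly forward simulations. We frame the inference task as a Bayesian sequential design problem, where the log-likelihood function is modelled with a hierarchical Gaussian process (GP) surrogate model, which is used to select additional log-likelihood evaluation locations efficiently. Motivated by recent progress in batch Bayesian optimisation, we develop various sequential strategies in which multiple simulations are adaptively selected to minimise either the expected or the median loss function measuring the uncertainty in the resulting posterior. We analyse the properties of the resulting method theoretically and empirically. Experiments with toy problems and three simulation models suggest that our method is robust, highly parallelisable, and sample-efficient.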
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage with our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
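Acquiring information about noisy, expensive black-box functions (computer simulations or physical experiments) is a tremendously challenging problem. Finite computational and financial resources restrict the application of traditional methods for design of experiments. The problem is compounded by hurdles such as numerical errors and stochastic approximation errors when the quantity of interest (QoI) in a problem depends on an expensive black-box function. Bayesian optimal design of experiments has been reasonably successful in guiding the designer towards the QoI for problems of this kind. This is usually achieved by sequentially querying the function at designs selected by an infill-sampling criterion compatible with utility theory. However, most current methods are semantically designed to work only on optimizing or inferring the black-box function itself. We aim to construct a heuristic that can deal unequivocally with the above problems irrespective of the QoI. This paper applies the heuristic to infer a specific QoI, namely the expectation (expected value) of the function. The Kullback-Leibler (KL) divergence is fairly prominent among techniques used to quantify information gain. In this paper, we derive an expression for the expected KL divergence to sequentially infer our QoI. The analytical tractability provided by the Karhunen-Loève expansion around the Gaussian process (GP) representation of the black-box function allows circumvention of numerical issues associated with sample averaging. The proposed methodology can be extended to any QoI under reasonable assumptions. The proposed method is verified and validated on three synthetic functions with varying levels of complexity and dimensionality. We demonstrate our methodology on a steel wire manufacturing problem.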
We propose a general algorithm for approximating nonstandard Bayesian posterior distributions. The algorithm minimizes the Kullback-Leibler divergence of an approximating distribution to the intractable posterior distribution. Our method can be used to approximate any posterior distribution, provided that it is given in closed form up to the proportionality constant. The approximation can be any distribution in the exponential family or any mixture of such distributions, which means that it can be made arbitrarily precise. Several examples illustrate the speed and accuracy of our approximation method in practice.
Gaussian process (GP) models are widely used in disease mapping as they provide a natural framework for modeling spatial correlations. Their challenges, however, lie in computational burden and memory requirements. In disease mapping models, the other difficulty is inference, which is analytically intractable due to the non-Gaussian observation model. In this paper, we address both these challenges. We show how to efficiently build fully and partially independent conditional (FIC/PIC) sparse approximations for the GP in two-dimensional surface, and how to conduct approximate inference using expectation propagation (EP) algorithm and Laplace approximation (LA). We also propose to combine FIC with a compactly supported covariance function to construct a computationally efficient additive model that can model long and short length-scale spatial correlations simultaneously. The benefit of these approximations is computational. The sparse GPs speed up the computations and reduce the memory requirements. The posterior inference via EP and Laplace approximation is much faster and is practically as accurate as via Markov chain Monte Carlo.
We propose a novel approach for nonlinear regression using a two-layer neural network (NN) model structure with sparsity-favoring hierarchical priors on the network weights. We present an expectation propagation (EP) approach for approximate integration over the posterior distribution of the weights, the hierarchical scale parameters of the priors, and the residual scale. Using a factorized posterior approximation we derive a computationally efficient algorithm, whose complexity scales similarly to an ensemble of independent sparse linear models. The approach enables flexible definition of weight priors with different sparseness properties such as independent Laplace priors with a common scale parameter or Gaussian automatic relevance determination (ARD) priors with different relevance parameters for all inputs. The approach can be extended beyond standard activation functions and NN model structures to form flexible nonlinear predictors from multiple sparse linear models. The effects of the hierarchical priors and the predictive performance of the algorithm are assessed using both simulated and real-world data. Comparisons are made to two alternative models with ARD priors: a Gaussian process with a NN covariance function and marginal maximum a posteriori estimates of the relevance parameters, and a NN with Markov chain Monte Carlo integration over all the unknown model parameters.
We provide a comprehensive overview of many recent algorithms for approximate inference in Gaussian process models for probabilistic binary classification. The relationships between several approaches are elucidated theoretically, and the properties of the different algorithms are corroborated by experimental results. We examine both 1) the quality of the predictive distributions and 2) the suitability of the different marginal likelihood approximations for model selection (selecting hyperparameters) and compare to a gold standard based on MCMC. Interestingly, some methods produce good predictive distributions although their marginal likelihood approximations are poor. Strong conclusions are drawn about the methods: The Expectation Propagation algorithm is almost always the method of choice unless the computational budget is very tight. We also extend existing methods in various ways, and provide unifying code implementing all approaches.
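We present an approximate Bayesian inference approach for estimating the intensity of an inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model with a latent marked Poisson process and Pólya-Gamma random variables, we obtain a representation of the likelihood that is conjugate to the GP prior. We approximate the posterior using a free-form mean field approximation together with the framework of sparse GPs. Furthermore, as an alternative approximation, we suggest a sparse Laplace approximation of the posterior, for which an efficient expectation-maximisation algorithm is derived to find the posterior's mode. Results of both algorithms compare well with exact inference obtained by a Markov chain Monte Carlo sampler and with a standard variational Gauss approach, while being an order of magnitude faster.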
Our paper deals with inferring simulator-based statistical models given some observed data. A simulator-based model is a parametrized mechanism which specifies how data are generated. It is thus also referred to as generative model. We assume that only a finite number of parameters are of interest and allow the generative process to be very general; it may be a noisy nonlinear dynamical system with an unrestricted number of hidden variables. This weak assumption is useful for devising realistic models but it renders statistical inference very difficult. The main challenge is the intractability of the likelihood function. Several likelihood-free inference methods have been proposed which share the basic idea of identifying the parameters by finding values for which the discrepancy between simulated and observed data is small. A major obstacle to using these methods is their computational cost. The cost is largely due to the need to repeatedly simulate data sets and the lack of knowledge about how the parameters affect the discrepancy. We propose a strategy which combines probabilistic modeling of the discrepancy with optimization to facilitate likelihood-free inference. The strategy is implemented using Bayesian optimization and is shown to accelerate the inference through a reduction in the number of required simulations by several orders of magnitude.
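The basic idea shared by likelihood-free methods, as described in the abstract above, is to keep parameter values whose simulated data lie close to the observed data. The sketch below shows plain rejection sampling under that idea, not the paper's Bayesian optimization strategy. The simulator (Gaussian draws with unknown mean), the uniform prior on [-5, 5], the sample-mean discrepancy, and the threshold `eps` are all assumptions made for illustration.

```python
import random
import statistics


def simulate(theta: float, n: int, rng: random.Random) -> list:
    """Toy simulator (assumed): n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def discrepancy(simulated: list, observed_mean: float) -> float:
    """Distance between the simulated and observed summary statistic (the mean)."""
    return abs(statistics.fmean(simulated) - observed_mean)


def rejection_abc(observed_mean, n_obs, n_trials, eps, rng):
    """Keep prior draws whose simulated data fall within eps of the observation."""
    accepted = []
    for _ in range(n_trials):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior
        if discrepancy(simulate(theta, n_obs, rng), observed_mean) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is reproducible
accepted = rejection_abc(observed_mean=2.0, n_obs=20, n_trials=5000, eps=0.2, rng=rng)
posterior_mean = statistics.fmean(accepted)
print(len(accepted), round(posterior_mean, 3))
```

Only a small share of the simulations is accepted in this setup, which illustrates the computational cost that motivates modelling the discrepancy and choosing simulation locations more carefully.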