Decision trees are highly popular in machine learning and often achieve state-of-the-art performance. Despite this, well-known variants such as CART, ID3, random forests and boosted trees lack a probabilistic version that encodes prior assumptions about the tree structure and shares statistical strength between node parameters. Existing work on Bayesian decision trees relies on Markov chain Monte Carlo (MCMC), which can be computationally slow, especially with high-dimensional data and expensive proposals. In this study, we propose a method to parallelise a single MCMC decision tree chain on an ordinary laptop or personal computer, allowing us to reduce its run time through multi-core processing while keeping the results statistically identical to the conventional sequential implementation. We also compute the theoretical and practical reductions in run time that can be obtained with our method on multi-processor architectures. Experiments show that we can achieve an 18-fold speed-up in running time, provided the serial and parallel implementations remain statistically identical.
Bayesian quadrature (BQ) is a method for solving numerical integration problems in a Bayesian manner, allowing users to quantify their uncertainty about the solution. The standard approach to BQ is based on a Gaussian process (GP) approximation of the integrand. As a result, BQ is inherently limited to cases where the GP approximation can be carried out efficiently, thus often ruling out very high-dimensional or non-smooth target functions. This paper proposes to tackle this issue with a new Bayesian numerical integration algorithm based on Bayesian Additive Regression Trees (BART) priors, which we call BART-Int. BART priors are easy to tune and well-suited to discontinuous functions. We demonstrate that they also lend themselves naturally to a sequential design setting and that explicit convergence rates can be obtained in a variety of settings. The advantages and disadvantages of this new methodology are highlighted on a set of benchmark tests including the Genz functions, and on a Bayesian survey design problem.
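As a point of reference, the following is a minimal sketch of the standard GP-based Bayesian quadrature approach the abstract describes (not BART-Int itself): fit a GP to a few evaluations of the integrand and summarise the induced posterior over the integral by averaging posterior function draws over samples from the integration measure. Here `f` and `sample_p` are placeholders, and the kernel-mean integrals are replaced by plain Monte Carlo averaging for simplicity.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def gp_bq(f, sample_p, n_design=20, n_draws=200, n_mc=5000, seed=0):
    """Sketch of GP-based Bayesian quadrature for the integral of f under measure p."""
    rng = np.random.default_rng(seed)
    X = sample_p(n_design, rng)                    # design points drawn from the measure p
    y = np.array([f(x) for x in X])
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    X_mc = sample_p(n_mc, rng)                     # Monte Carlo nodes for the integral
    draws = gp.sample_y(X_mc, n_samples=n_draws, random_state=seed)  # (n_mc, n_draws)
    integrals = draws.mean(axis=0)                 # one integral estimate per posterior draw
    return integrals.mean(), integrals.std()       # posterior mean and spread of the integral
```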
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metropolis or Gibbs sampling. However, HMC's performance is highly sensitive to two user-specified parameters: a step size and a desired number of steps L. In particular, if L is too small then the algorithm exhibits undesirable random walk behavior, while if L is too large the algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an extension to HMC that eliminates the need to set a number of steps L. NUTS uses a recursive algorithm to build a set of likely candidate points that spans a wide swath of the target distribution, stopping automatically when it starts to double back and retrace its steps. Empirically, NUTS performs at least as efficiently as (and sometimes more efficiently than) a well tuned standard HMC method, without requiring user intervention or costly tuning runs. We also derive a method for adapting the step size parameter on the fly based on primal-dual averaging. NUTS can thus be used with no hand-tuning at all. NUTS is also suitable for applications such as BUGS-style automatic inference engines that require efficient "turnkey" sampling algorithms.
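As a concrete illustration (a minimal sketch, not the reference implementation found in Stan or PyMC), the two ingredients mentioned above can be written as a leapfrog step driven by the gradient of the log target and the "U-turn" test that stops trajectory doubling automatically; `grad_log_p` is a placeholder for the gradient of the log target density.

```python
import numpy as np

def leapfrog(theta, r, grad_log_p, eps):
    """One leapfrog step of Hamiltonian dynamics with step size eps."""
    r = r + 0.5 * eps * grad_log_p(theta)    # half-step momentum update
    theta = theta + eps * r                  # full-step position update
    r = r + 0.5 * eps * grad_log_p(theta)    # half-step momentum update
    return theta, r

def no_u_turn(theta_minus, theta_plus, r_minus, r_plus):
    """True while the trajectory keeps moving away from itself;
    NUTS stops doubling the trajectory once this becomes False."""
    dtheta = theta_plus - theta_minus
    return (np.dot(dtheta, r_minus) >= 0) and (np.dot(dtheta, r_plus) >= 0)
```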
We present the GPry algorithm for fast Bayesian inference of general (non-Gaussian) posteriors with a moderate number of parameters. GPry does not need any pre-training or special hardware such as GPUs, and is intended as a drop-in replacement for traditional Monte Carlo methods for Bayesian inference. Our algorithm is based on generating a Gaussian Process surrogate model of the log-posterior, aided by a Support Vector Machine classifier that excludes extreme or non-finite values. An active learning scheme allows us to reduce the number of required posterior evaluations by two orders of magnitude compared to traditional Monte Carlo inference. Our algorithm allows for parallel evaluations of the posterior at optimal locations, further reducing wall-clock times. We significantly improve performance using properties of the posterior in our active learning scheme and for the definition of the GP prior. In particular we account for the expected dynamical range of the posterior in different dimensionalities. We test our model against a number of synthetic and cosmological examples. GPry outperforms traditional Monte Carlo methods when the evaluation time of the likelihood (or the calculation of theoretical observables) is of the order of seconds; for evaluation times of over a minute it can perform inference in days that would take months using traditional methods. GPry is distributed as an open source Python package (pip install gpry) and can also be found at https://github.com/jonaselgammal/GPry.
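The surrogate-plus-active-learning idea can be sketched with generic tools as follows. This is a conceptual illustration using scikit-learn and an upper-confidence-bound acquisition rule, not GPry's actual interface (which is documented at the repository above); `log_post` and `bounds` are placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def active_gp_loop(log_post, bounds, n_init=8, n_iter=40, seed=0):
    """Iteratively fit a GP surrogate to log-posterior values and add points actively."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)          # shape (dim, 2): [low, high] per dimension
    dim = bounds.shape[0]
    X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_init, dim))
    y = np.array([log_post(x) for x in X])
    gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
    for _ in range(n_iter):
        gp.fit(X, y)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2048, dim))
        mu, sd = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(mu + 2.0 * sd)]       # upper-confidence-bound acquisition
        X = np.vstack([X, x_next])
        y = np.append(y, log_post(x_next))
    return gp, X, y
```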
Default implementations of Bayesian Additive Regression Trees (BART) represent categorical predictors using several binary indicators, one for each level of each categorical predictor. Regression trees built with these indicators partition the levels using a "remove one at a time" strategy. Unfortunately, the vast majority of partitions of the levels cannot be built with this strategy, severely limiting BART's ability to "borrow strength" across groups of levels. We overcome this limitation with a new class of regression tree and a new decision rule prior that can assign multiple levels to both the left and right child of a decision node. Motivated by spatial applications with areal data, we introduce a further decision rule prior that partitions the areas into spatially contiguous regions by deleting edges from random spanning trees of a suitably defined network. We implemented our new regression tree priors in the flexBART package, which, compared to existing implementations, often yields improved out-of-sample predictive performance without much additional computational burden. We demonstrate the efficacy of flexBART using examples from baseball and the spatiotemporal modeling of crime.
The Bayesian additive regression trees (BART) model is an ensemble method extensively and successfully used in regression tasks due to its consistently strong predictive performance and its ability to quantify uncertainty. BART combines "weak" tree models through a set of shrinkage priors, whereby each tree explains a small portion of the variability in the data. However, the lack of smoothness and the absence of a covariance structure over the observations in standard BART can yield poor performance in cases where such assumptions would be necessary. We propose Gaussian processes Bayesian additive regression trees (GP-BART) as an extension of BART which assumes Gaussian process (GP) priors for the predictions of each terminal node among all trees. We illustrate our model on simulated and real data and compare its performance to traditional modelling approaches, outperforming them in many scenarios. An implementation of our method is available in the R package rGPBART available at: https://github.com/MateusMaiaDS/gpbart
Monte Carlo Tree Search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarise the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
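Most of the variants surveyed build on the UCT selection rule, which balances exploitation of promising moves with exploration of rarely visited ones when descending the tree. A minimal sketch follows, with illustrative node fields not tied to any particular library.

```python
import math

def uct_select(children, c=1.4):
    """children: list of nodes with .visits (int) and .total_reward (float)."""
    parent_visits = sum(ch.visits for ch in children)

    def uct(ch):
        if ch.visits == 0:
            return float("inf")            # always try unvisited children first
        exploit = ch.total_reward / ch.visits
        explore = c * math.sqrt(math.log(parent_visits) / ch.visits)
        return exploit + explore

    return max(children, key=uct)
```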
Identifying subgroups of a study population that respond particularly well (or badly) to a specific intervention (medical or policy) requires new supervised learning methods tailored specifically to causal inference. Bayesian causal forests (BCF) is a recent method that has been documented to perform well in data-generating processes with strong confounding of a kind that is plausible in many applications. This paper develops a novel algorithm for fitting the BCF model, which is more efficient than the previously available Gibbs sampler. The new algorithm can be used to initialize independent chains of the existing Gibbs sampler, leading to better posterior exploration and improved coverage of the associated interval estimates in simulation studies. The new algorithm is compared to related approaches via simulation studies as well as an empirical analysis.
Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A common criticism, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years this has motivated the development of optimal classification tree algorithms, which optimise the decision tree globally, in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and the number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally suffered from high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state of the art and can handle datasets with tens of thousands of instances, providing improvements of several orders of magnitude and notably contributing towards the practical use of optimal decision trees.
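To make the underlying optimisation problem concrete, here is a naive exact search over depth-bounded trees on binary features. It lacks the specialised caching, bounding and frequency-counting techniques referred to above, so it only scales to toy inputs.

```python
def best_misclassification(X, y, depth):
    """X: sequence of binary feature vectors, y: sequence of 0/1 labels.
    Returns the smallest number of misclassifications achievable by a
    classification tree of at most the given depth."""
    n_features = len(X[0]) if len(X) else 0

    def solve(idx, d):
        labels = [y[i] for i in idx]
        leaf = min(labels.count(0), labels.count(1))   # misclassifications of the best single-label leaf
        if d == 0 or leaf == 0:
            return leaf
        best = leaf
        for f in range(n_features):                    # try every possible split feature
            left = tuple(i for i in idx if X[i][f] == 0)
            right = tuple(i for i in idx if X[i][f] == 1)
            if not left or not right:
                continue
            best = min(best, solve(left, d - 1) + solve(right, d - 1))
        return best

    return solve(tuple(range(len(X))), depth)
```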
Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high-dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, encoded as a directed acyclic graph (DAG), is highly challenging, mainly because of the enormous number of possible networks in combination with the acyclicity constraint. Efforts have focused on two fronts: constraint-based methods that perform conditional independence tests to exclude edges, and score-and-search approaches that explore the DAG space with greedy or MCMC schemes. Here we synthesise these two fields in a novel hybrid method that reduces the complexity of MCMC approaches to that of a constraint-based method. The individual steps in the MCMC scheme only require simple table lookups, so that very long chains can be obtained efficiently. Furthermore, the scheme includes an iterative procedure to correct for errors from the conditional independence tests. The algorithm offers markedly superior performance to alternatives, in particular because DAGs can also be sampled from the posterior distribution, enabling full Bayesian model averaging for much larger Bayesian networks.
SHAP (SHapley Additive exPlanations) values are one of the leading tools for interpreting machine learning models, with strong theoretical guarantees (consistency, local accuracy) and a wide availability of implementations and use cases. Even though computing SHAP values generally takes exponential time, TreeSHAP takes polynomial time on tree-based models. While the speedup is significant, TreeSHAP can still dominate the computation time of industry-level machine learning solutions on datasets with millions or more entries, causing delays in post-hoc model diagnosis and interpretation services. In this paper we present two new algorithms, Fast TreeSHAP v1 and v2, designed to improve the computational efficiency of TreeSHAP for large datasets. We empirically find that Fast TreeSHAP v1 is 1.5x faster than TreeSHAP while keeping the memory cost unchanged. Similarly, Fast TreeSHAP v2 is 2.5x faster than TreeSHAP, at the cost of slightly higher memory usage, thanks to the pre-computation of expensive TreeSHAP steps. We also show that Fast TreeSHAP v2 is well-suited to multi-time model interpretation, resulting in even faster explanations for newly incoming samples.
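For context, this is the kind of workload being accelerated, shown with the standard `shap` package's tree explainer on a fitted forest; the Fast TreeSHAP variants are described by the authors as faster alternatives for the same computation, and their exact package interface is not reproduced here to avoid guessing.

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=5000, n_features=20, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)      # polynomial-time SHAP for tree ensembles
shap_values = explainer.shap_values(X)     # (n_samples, n_features) per-feature attributions
```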
We introduce version 3 of NetKet, the machine learning toolbox for many-body quantum physics. NetKet is built around neural-network quantum states and provides efficient algorithms for their evaluation and optimisation. This new version is built on top of JAX, a differentiable programming and accelerated linear algebra framework for the Python programming language. The most significant new feature is the possibility to define arbitrary neural network ansätze in pure Python code using the concise notation of machine-learning frameworks, which allows for just-in-time compilation as well as the implicit generation of gradients through automatic differentiation. NetKet 3 also brings support for GPU and TPU accelerators, advanced support for discrete symmetry groups, chunking to scale to many degrees of freedom, drivers for quantum dynamics applications, and improved modularity, allowing users to use only parts of the toolbox as a foundation for their own code.
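The "pure Python ansatz" feature can be illustrated with a small flax.linen module of the kind NetKet 3 accepts (a mean-field log-amplitude over spin configurations); wiring it into NetKet's samplers and variational drivers follows the NetKet documentation and is omitted here.

```python
import jax.numpy as jnp
import flax.linen as nn

class MeanFieldAnsatz(nn.Module):
    """Log-amplitude of a product (mean-field) state over spins x in {-1, +1}."""

    @nn.compact
    def __call__(self, x):
        lam = self.param("lambda", nn.initializers.normal(), (1,), float)
        # per-site log-probability, summed over the last axis (the sites)
        return 0.5 * jnp.sum(nn.log_sigmoid(lam * x), axis=-1)
```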
A determinantal point process (DPP) is an elegant model that assigns a probability to every subset of a collection of $n$ items. While a DPP is conventionally parameterised by a symmetric kernel matrix, removing this symmetry constraint, resulting in nonsymmetric DPPs (NDPPs), leads to significant improvements in modeling power and predictive performance. Recent work has studied Markov chain Monte Carlo (MCMC) sampling algorithms for NDPPs restricted to size-$k$ subsets (called $k$-NDPPs). However, the runtime of this approach is quadratic in $n$, making it infeasible for large-scale settings. In this work, we develop a scalable MCMC sampling algorithm for $k$-NDPPs with low-rank kernels, yielding a runtime that is sublinear in $n$. Our method is based on a state-of-the-art NDPP rejection sampling algorithm, which we enhance with a novel approach for efficiently constructing the proposal distribution. Furthermore, we extend our scalable $k$-NDPP sampling algorithm to the case without size constraints. Our resulting sampling method has polynomial time complexity in the rank of the kernel, while the runtime of existing approaches is exponential in the rank. With both a theoretical analysis and experiments on real-world datasets, we verify that our scalable approximate sampling algorithms are orders of magnitude faster than existing sampling approaches for $k$-NDPPs and NDPPs.
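For reference, the quantity such samplers target can be written directly from the kernel: the probability that a DPP with kernel $L$ assigns to a subset $S$ is $\det(L_S)/\det(L+I)$. A small brute-force sketch, only workable for tiny $n$, is shown below.

```python
import numpy as np

def dpp_subset_probability(L, S):
    """L: (n, n) kernel matrix (possibly nonsymmetric); S: list of item indices.
    Returns the DPP probability det(L_S) / det(L + I) of drawing exactly S."""
    L = np.asarray(L, dtype=float)
    n = L.shape[0]
    L_S = L[np.ix_(S, S)]                    # principal submatrix indexed by S
    return np.linalg.det(L_S) / np.linalg.det(L + np.eye(n))
```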
Empirical likelihood has recently been widely applied under the Bayesian framework. Markov chain Monte Carlo (MCMC) methods are frequently used to sample from the posterior distribution of the parameters of interest. However, the complex nature of the likelihood support, in particular its non-convexity, creates substantial obstacles to choosing an appropriate MCMC algorithm. These difficulties have restricted the use of Bayesian empirical likelihood (BayesEL) based methods in many applications. In this paper, we propose a two-step Metropolis-Hastings algorithm to sample from the BayesEL posterior. Our proposal is specified hierarchically, where the estimating equations determining the empirical likelihood are used to propose values for a set of parameters conditional on the proposed values of the remaining parameters. Furthermore, we discuss Bayesian model selection using empirical likelihood and extend our two-step Metropolis-Hastings algorithm into a reversible jump Markov chain Monte Carlo procedure to sample from the resulting posterior. Finally, several applications of our proposed methods are presented.
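For orientation, a generic random-walk Metropolis-Hastings update is sketched below; the two-step sampler described above replaces the simple Gaussian proposal with a hierarchical one built from the estimating equations, and `log_post` is a placeholder for the BayesEL log-posterior.

```python
import numpy as np

def mh_step(theta, log_post, rng, scale=0.1):
    """One random-walk Metropolis-Hastings update of the parameter vector theta."""
    proposal = theta + scale * rng.standard_normal(theta.shape)
    log_alpha = log_post(proposal) - log_post(theta)   # log acceptance ratio (symmetric proposal)
    if np.log(rng.uniform()) < log_alpha:
        return proposal, True      # accept
    return theta, False            # reject
```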
Analyzing the behavior of complex interdependent networks requires complete information about the network topology and the interdependent links across networks. For many applications such as critical infrastructure systems, understanding network interdependencies is crucial to anticipate cascading failures and plan for disruptions. However, data on the topology of individual networks are often publicly unavailable due to privacy and security concerns. Additionally, interdependent links are often only revealed in the aftermath of a disruption as a result of cascading failures. We propose a scalable nonparametric Bayesian approach to reconstruct the topology of interdependent infrastructure networks from observations of cascading failures. A Metropolis-Hastings algorithm coupled with an infrastructure-dependent proposal is employed to increase the efficiency of sampling possible graphs. Results of reconstructing a synthetic system of interdependent infrastructure networks demonstrate that the proposed approach outperforms existing methods in both accuracy and computational time. We further apply this approach to reconstruct the topology of one synthetic and two real-world systems of interdependent infrastructure networks, including gas-power-water networks in Shelby County, TN, USA, and an interdependent system of power-water networks in Italy, to demonstrate the general applicability of the approach.
Markov chains with variable length are useful parsimonious stochastic models able to generate most stationary sequences of discrete symbols. The idea is to identify the portions of the past, called contexts, that are relevant for predicting the next symbol. Sometimes a single state is a context, and looking back into the past and finding this specific state makes the more distant past irrelevant. States with this property are called renewal states, and they can be used to split the chain into independent and identically distributed blocks. In order to identify renewal states for chains with variable length, we propose the use of intrinsic Bayes factors to evaluate the hypothesis that a specific state is a renewal state. In this case, the difficulty lies in integrating the marginal posterior distribution of the random context trees over general prior distributions on the context trees and the transition probabilities, for which Monte Carlo methods are applied. To show the strength of our method, we analyse artificial datasets generated from different binary models and an example from the field of linguistics.
We look at a specific aspect of model interpretability: models often need to be constrained in size for them to be considered interpretable; for example, a decision tree of depth 5 is much easier to interpret than one of depth 50. But smaller models also tend to have high bias. This suggests a trade-off between interpretability and accuracy. We propose a model-agnostic technique to minimize this trade-off. Our strategy is to first learn an oracle, a highly accurate probabilistic model, on the training data. The uncertainty in the oracle's predictions is used to learn a sampling distribution over the training data. The interpretable model is then trained on a data sample obtained using this distribution, which often leads to significantly higher accuracy. We formulate the sampling strategy as an optimization problem. Our solution possesses the following key favorable properties: (1) it uses a fixed number of seven optimization variables, irrespective of the dimensionality of the data; (2) it is model-agnostic, since both the interpretable model and the oracle may belong to arbitrary model families; (3) it has a flexible notion of model size and can accommodate vector-valued sizes; (4) it is a framework, allowing it to benefit from progress in the field of optimization. We also present the following interesting observations: (a) in general, the optimal training distribution at small model sizes differs from the test distribution; (b) this effect exists even when the interpretable model and the oracle come from highly different model families: we show this on a text classification task by improving the sequence classification accuracy of a decision tree using a Gated Recurrent Unit network as the oracle, with character n-grams as features; (c) for a given model, our technique can be used to determine the best training sample of a given size.
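A conceptual sketch of the strategy (not the authors' optimisation formulation): train an accurate oracle, convert its predictive uncertainty into sampling weights over the training data, and fit a small interpretable model on data resampled under those weights. The oracle choice and the entropy-based weighting below are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

def fit_tree_with_oracle(X, y, max_depth=5, seed=0):
    """X: (n, d) feature array, y: (n,) labels. Returns a small tree trained on resampled data."""
    rng = np.random.default_rng(seed)
    oracle = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
    proba = oracle.predict_proba(X)
    # predictive entropy as an illustrative per-example uncertainty score
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    weights = (entropy + 1e-12) / (entropy + 1e-12).sum()
    idx = rng.choice(len(X), size=len(X), replace=True, p=weights)
    return DecisionTreeClassifier(max_depth=max_depth, random_state=0).fit(X[idx], y[idx])
```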
We consider the problem of estimating the interacting neighborhood of a Markov Random Field model with finite support and homogeneous pairwise interactions based on relative positions of a two-dimensional lattice. Using a Bayesian framework, we propose a Reversible Jump Monte Carlo Markov Chain algorithm that jumps across subsets of a maximal range neighborhood, allowing us to perform model selection based on a marginal pseudoposterior distribution of models. To show the strength of our proposed methodology we perform a simulation study and apply it to a real dataset from a discrete texture image analysis.
Monte Carlo Tree Search (MCTS) is a powerful approach for designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become the state of the art for combinatorial games; however, in more complex games (for example, those with a high branching factor or real-time ones), as well as in various practical domains (for example, transportation, scheduling or security), an efficient MCTS application often requires problem-specific modifications or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012; contributions that have appeared since its release are of particular interest.
Reservoir simulation for oil fields and seismic imaging are known as the most demanding workloads for high-performance computing (HPC) in the oil and gas (O&G) industry. The optimization of a simulator's numerical parameters plays a vital role, as it could save a significant amount of computational effort. State-of-the-art optimization techniques are based on running numerous simulations, specifically for this purpose, to find good parameter candidates. However, using such an approach is highly costly in terms of time and computing resources. This work presents TunaOil, a new methodology that enhances the search for optimal numerical parameters of reservoir flow simulations by using a performance model. In the O&G industry, it is common to use ensembles of reservoir models in different workflows to reduce the uncertainty associated with forecasting O&G production. We leverage the runs of those ensembles in such workflows to extract information from each simulation and optimize the numerical parameters in their subsequent runs. To validate the methodology, we implemented it in a history matching (HM) process that uses a Kalman filter algorithm to adjust an ensemble of reservoir models to match the data observed in the real field. We mine past execution logs from many simulations with different numerical configurations and build a machine learning model based on features extracted from the data. These features include properties of the reservoir models themselves, such as the number of active cells, and statistics of the simulation's behavior, such as the number of iterations of the linear solver. A sampling technique is used to query an oracle to find numerical parameters that can reduce the elapsed time without significantly degrading the quality of the results. Our experiments show that the predictions can improve the HM workflow runtime by 31% on average.
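A condensed sketch of the performance-model idea described above (the actual TunaOil feature set, model choice and sampling scheme come from the paper; every name below is a placeholder): learn elapsed time from features mined from past logs, then score candidate numerical-parameter settings and keep the fastest predicted one.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pick_numerical_params(log_features, elapsed_times, candidates):
    """log_features: (n_runs, n_feat) mined from past simulations;
    elapsed_times: (n_runs,) observed runtimes; candidates: (m, n_feat) settings to score."""
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(log_features, elapsed_times)          # performance model of elapsed time
    predicted = model.predict(candidates)           # predicted runtime for each candidate
    return candidates[np.argmin(predicted)], predicted.min()
```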