贝叶斯优化(BO)是指用于对昂贵的黑盒函数进行全局优化的一套技术,它使用函数的内省贝叶斯模型来有效地找到最优值。虽然BO已经在许多应用中成功应用,但现代优化任务迎来了传统方法失败的新挑战。在这项工作中,我们展示了Dragonfly,这是一个开源Python库,用于可扩展和强大的BO.Dragonfly包含多个最近开发的方法,允许BO应用于具有挑战性的现实世界环境;这些包括更好的处理更高维域的方法,当昂贵函数的廉价近似可用时处理多保真评估的方法,优化结构化组合空间的方法,例如神经网络架构的空间,以及处理并行评估的方法。此外,我们在BO中开发了新的方法改进,用于选择贝叶斯模型,选择采集函数,以及优化具有不同变量类型和附加约束的过复杂域。我们将Dragonfly与一套用于全局优化的其他软件包和算法进行比较,并证明当上述方法集成时,它们可以显着改善BO的性能。 Dragonfly图书馆可在dragonfly.github.io上找到。
translated by 谷歌翻译
Bayesian optimization is a sample-efficient method for black-box global optimization. However , the performance of a Bayesian optimization method very much depends on its exploration strategy, i.e. the choice of acquisition function , and it is not clear a priori which choice will result in superior performance. While portfolio methods provide an effective, principled way of combining a collection of acquisition functions, they are often based on measures of past performance which can be misleading. To address this issue, we introduce the Entropy Search Portfolio (ESP): a novel approach to portfolio construction which is motivated by information theoretic considerations. We show that ESP outperforms existing portfolio methods on several real and synthetic problems, including geostatistical datasets and simulated control tasks. We not only show that ESP is able to offer performance as good as the best, but unknown, acquisition function, but surprisingly it often gives better performance. Finally , over a wide range of conditions we find that ESP is robust to the inclusion of poor acquisition functions.
translated by 谷歌翻译
贝叶斯优化和Lipschitz优化已经开发出用于优化黑盒功能的替代技术。它们各自利用关于函数的不同形式的先验。在这项工作中,我们探索了这些技术的策略,以便更好地进行全局优化。特别是,我们提出了在传统BO算法中使用Lipschitz连续性假设的方法,我们称之为Lipschitz贝叶斯优化(LBO)。这种方法不会增加渐近运行时间,并且在某些情况下会大大提高性能(而在最坏的情况下,性能类似)。实际上,在一个特定的环境中,我们证明使用Lipschitz信息产生与后悔相同或更好的界限,而不是单独使用贝叶斯优化。此外,我们提出了一个简单的启发式方法来估计Lipschitz常数,并证明Lipschitz常数的增长估计在某种意义上是“无害的”。我们对具有4个采集函数的15个数据集进行的实验表明,在最坏的情况下,LBO的表现类似于底层BO方法,而在某些情况下,它的表现要好得多。特别是汤普森采样通常看到了极大的改进(因为Lipschitz信息已经得到了很好的修正) - 探索“现象”及其LBO变体通常优于其他采集功能。
translated by 谷歌翻译
Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult black-box objective functions. Many real-world optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained problems in the general case that noise may be present in the constraint functions, and the objective and constraints may be evaluated independently. We provide motivating practical examples, and present a general framework to solve such problems. We demonstrate the effectiveness of our approach on optimizing the performance of online latent Dirichlet allocation subject to topic sparsity constraints, tuning a neural network given test-time memory constraints, and optimizing Hamiltonian Monte Carlo to achieve maximal effectiveness in a fixed time, subject to passing standard convergence diagnostics.
translated by 谷歌翻译
Entropy Search (ES) and Predictive Entropy Search (PES) are popular and empirically successful Bayesian Optimization techniques. Both rely on a compelling information-theoretic motivation , and maximize the information gained about the arg max of the unknown function; yet, both are plagued by the expensive computation for estimating entropies. We propose a new criterion , Max-value Entropy Search (MES), that instead uses the information about the maximum function value. We show relations of MES to other Bayesian optimization methods, and establish a regret bound. We observe that MES maintains or improves the good empirical performance of ES/PES, while tremendously lightening the computational burden. In particular, MES is much more robust to the number of samples used for computing the entropy, and hence more efficient for higher dimensional problems.
translated by 谷歌翻译
We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution. This reformulation allows PES to obtain approximations that are both more accurate and efficient than other alternatives such as Entropy Search (ES). Furthermore , PES can easily perform a fully Bayesian treatment of the model hy-perparameters while ES cannot. We evaluate PES in both synthetic and real-world applications, including optimization problems in machine learning, finance, biotechnology, and robotics. We show that the increased accuracy of PES leads to significant gains in optimization performance.
translated by 谷歌翻译
Bayesian optimization has recently been proposed as a framework for automatically tuning the hyperparameters of machine learning models and has been shown to yield state-of-the-art performance with impressive ease and efficiency. In this paper, we explore whether it is possible to transfer the knowledge gained from previous optimizations to new tasks in order to find optimal hyperparameter settings more efficiently. Our approach is based on extending multi-task Gaussian processes to the framework of Bayesian optimization. We show that this method significantly speeds up the optimization process when compared to the standard single-task approach. We further propose a straightforward extension of our algorithm in order to jointly minimize the average error across multiple tasks and demonstrate how this can be used to greatly speed up k-fold cross-validation. Lastly, we propose an adaptation of a recently developed acquisition function, en-tropy search, to the cost-sensitive, multi-task setting. We demonstrate the utility of this new acquisition function by leveraging a small dataset to explore hyper-parameter settings for a large dataset. Our algorithm dynamically chooses which dataset to query in order to yield the most information per unit cost.
translated by 谷歌翻译
近似贝叶斯计算(ABC)是贝叶斯推理的一种方法,当可能性不可用时,但是可以从模型中进行模拟。然而,许多ABC算法需要大量的模拟,这可能是昂贵的。为了降低计算成本,已经提出了贝叶斯优化(BO)和诸如高斯过程的模拟模型。贝叶斯优化使人们可以智能地决定在哪里评估模型下一个,但是常见的BO策略不是为了估计后验分布而设计的。我们的论文解决了文献中的这一差距。我们建议计算ABC后验密度的不确定性,这是因为缺乏模拟来准确估计这个数量,并且定义了测量这种不确定性的aloss函数。然后,我们建议选择下一个评估位置,以尽量减少预期的损失。实验表明,与普通BO策略相比,所提出的方法通常产生最准确的近似。
translated by 谷歌翻译
Chemical space is so large that brute force searches for new interesting molecules are in-feasible. High-throughput virtual screening via computer cluster simulations can speed up the discovery process by collecting very large amounts of data in parallel, e.g., up to hundreds or thousands of parallel measurements. Bayesian optimization (BO) can produce additional acceleration by sequentially identifying the most useful simulations or experiments to be performed next. However, current BO methods cannot scale to the large numbers of parallel measurements and the massive libraries of molecules currently used in high-throughput screening. Here, we propose a scalable solution based on a parallel and distributed implementation of Thompson sampling (PDTS). We show that, in small scale problems, PDTS performs similarly as parallel expected improvement (EI), a batch version of the most widely used BO heuristic. Additionally , in settings where parallel EI does not scale, PDTS outperforms other scalable baselines such as a greedy search,-greedy approaches and a random search method. These results show that PDTS is a successful solution for large-scale parallel BO.
translated by 谷歌翻译
我们提出了一种自适应方法来构建贝叶斯推理的高斯过程,并使用昂贵的评估正演模型。我们的方法依赖于完全贝叶斯方法来训练高斯过程模型,并利用贝叶斯全局优化的预期改进思想。我们通过最大化高斯过程模型与噪声观测数据拟合的预期改进来自适应地构建训练设计。对合成数据模型问题的数值实验证明了所获得的自适应设计与固定非自适应设计相比,在前向模型推断成本的精确后验估计方面的有效性。
translated by 谷歌翻译
我们开发了一种自动变分方法,用于推导具有高斯过程(GP)先验和一般可能性的模型。该方法支持多个输出和多个潜在函数,不需要条件似然的详细知识,只需将其评估为ablack-box函数。使用高斯混合作为变分分布,我们表明使用来自单变量高斯分布的样本可以有效地估计证据下界及其梯度。此外,该方法可扩展到大数据集,这是通过使用诱导变量使用增广先验来实现的。支持最稀疏GP近似的方法,以及并行计算和随机优化。我们在小数据集,中等规模数据集和大型数据集上定量和定性地评估我们的方法,显示其在不同似然模型和稀疏性水平下的竞争力。在涉及航空延误预测和手写数字分类的大规模实验中,我们表明我们的方法与可扩展的GP回归和分类的最先进的硬编码方法相同。
translated by 谷歌翻译
We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments-active user modelling with preferences, and hierarchical reinforcement learning-and a discussion of the pros and cons of Bayesian optimization based on our experiences.
translated by 谷歌翻译
贝叶斯优化是一种优化目标函数的方法,需要花费很长时间(几分钟或几小时)来评估。它最适合于在小于20维的连续域上进行优化,并且在功能评估中容忍随机噪声。它构建了目标的替代品,并使用贝叶斯机器学习技术,高斯过程回归量化该替代品中的不确定性,然后使用从该代理定义的获取函数来决定在何处进行抽样。在本教程中,我们描述了贝叶斯优化的工作原理,包括高斯过程回归和三种常见的采集功能:预期改进,熵搜索和知识梯度。然后,我们讨论了更先进的技术,包括在并行,多保真和多信息源优化,昂贵的评估约束,随机环境条件,多任务贝叶斯优化以及包含衍生信息的情况下运行多功能评估。最后,我们讨论了贝叶斯优化软件和该领域未来的研究方向。在我们的教程材料中,我们提供了对噪声评估的预期改进的时间化,超出了无噪声设置,在更常用的情况下。这种概括通过正式的决策理论论证来证明,与先前的临时修改形成鲜明对比。
translated by 谷歌翻译
Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes' decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose properties not only facilitate but justify use of greedy approaches for their maximization.
translated by 谷歌翻译
贝叶斯优化(BO)是黑盒优化的有效工具,其中目标函数评估通常非常昂贵。在实践中,目标函数的低保真度近似值通常是可用的。最近,多保真贝叶斯优化(MFBO)引起了人们的关注,因为它可以通过使用那些更便宜的观测来显着加速优化过程。我们提出了一种新的MFBO信息理论方法。基于信息的方法在BO中很受欢迎,但是基于信息的MFBO的现有研究受到难以准确估计信息增益的困扰。 Ourapproach基于一种基于信息的BO变体,称为最大值熵搜索(MES),它极大地便于评估MFBO中的信息增益。实际上,我们的采集函数的计算是在分析上编写的,除了一维积分和采样之外,可以有效和准确地计算。我们通过使用合成和基准数据集证明了我们方法的有效性,并进一步展示了材料科学数据的实际应用。
translated by 谷歌翻译
高效全局优化(EGO)广泛用于优化计算上昂贵的黑盒功能。它使用基于高斯过程(克里金法)的替代建模技术。然而,由于使用了静态协方差,克里金不太适合近似非平稳函数。本文探讨了深度高斯过程(DGP)在EGO框架中的整合,以处理非平稳问题,并研究诱发的挑战和机遇。对分析问题进行数值实验以突出DGP和EGO的不同方面。
translated by 谷歌翻译
贝叶斯优化通常假设给出贝叶斯先验。然而,贝叶斯优化中强有力的理论保证在实践中经常因为先验中的未知参数而受到损害。在本文中,我们采用经验贝叶斯的变量并表明,通过估计从同一个先前采样的离线数据之前的高斯过程和构建后验的无偏估计,GP-UCB的变体和改进概率实现近乎零的后悔界限,其随着离线数据和离线数据的数量减少到与观测噪声成比例的常数。在线评估的数量增加。根据经验,我们已经验证了我们的方法,以挑战模拟机器人问题为特色的任务和运动规划。
translated by 谷歌翻译
变分推理已成为一种广泛使用的方法来逼近复杂潜变量模型中的内部。然而,推导变分推理算法通常需要进行大量的模型特定分析,并且这些努力可能阻碍并阻止我们快速开发和探索手头问题的各种模型。在本文中,我们提出了一个“黑盒”变分推理算法,该算法可以快速应用于多个模型而几乎没有其他推导。我们的方法基于变分目标的随机优化,其中噪声梯度是从变分布的Monte Carlo样本计算的。我们开发了许多方法来减少梯度的方差,始终保持我们想要避免基于模型的难以推导的标准。我们根据相应的黑箱采样方法评估我们的方法。我们发现我们的方法比采样方法更快地达到更好的预测可能性。最后,我们证明Black Box Variational Inference可以通过快速构建和评估几种纵向医疗保健数据模型,轻松探索广泛的模型空间。
translated by 谷歌翻译
许多对科学计算和机器学习感兴趣的概率模型具有昂贵的黑盒可能性,这些可能性阻止了贝叶斯推理的标准技术的应用,例如MCMC,其需要接近梯度或大量可能性评估。我们在这里介绍一种新颖的样本有效推理框架,VariationalBayesian Monte Carlo(VBMC)。 VBMC将变分推理与基于高斯过程的有源采样贝叶斯积分结合起来,使用latterto有效逼近变分目标中的难以求的积分。我们的方法产生了后验分布的非参数近似和模型证据的近似下界,对模型选择很有用。我们在几种合成可能性和神经元模型上展示VBMC,其中包含来自真实神经元的数据。在所有测试的问题和维度(高达$ D = 10 $)中,VBMC始终如一地通过有限的可能性评估预算重建后验证和模型证据,而不像其他仅在非常低维度下工作的方法。我们的框架作为一种新颖的工具,具有昂贵的黑盒可能性,可用于后期模型推理。
translated by 谷歌翻译
本文提出了采集汤普森采样(ATS),这是一种基于随机过程采样多采集函数的思想的批量贝叶斯优化算法(BO)。我们通过采集函数对一组模型参数的依赖来定义该过程。 ATS在概念上简单,直接实现,与其他批处理BO方法不同,它可以用于并行化任何顺序采集功能。为了提高多模态任务的性能,我们表明ATS可以与现有技术结合以实现不同探索 - 利用交易,并考虑未决的功能评估。我们在各种基准函数和流行的梯度增强树算法的超参数优化上进行了实验。这些证明了我们的算法与两个最先进的批量BO方法的竞争力,以及它对经典并行Thompson采样BO的优势。
translated by 谷歌翻译