Entropy Search (ES) and Predictive Entropy Search (PES) are popular and empirically successful Bayesian Optimization techniques. Both rely on a compelling information-theoretic motivation, and maximize the information gained about the arg max of the unknown function; yet, both are plagued by the expensive computation for estimating entropies. We propose a new criterion, Max-value Entropy Search (MES), that instead uses the information about the maximum function value. We show relations of MES to other Bayesian optimization methods, and establish a regret bound. We observe that MES maintains or improves the good empirical performance of ES/PES, while tremendously lightening the computational burden. In particular, MES is much more robust to the number of samples used for computing the entropy, and hence more efficient for higher dimensional problems.
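As a rough illustration of a max-value criterion, the sketch below scores one candidate point from its Gaussian posterior mean and standard deviation and a set of sampled maximum values. The abstract does not state a formula; the closed form used here is the one commonly associated with MES under a Gaussian posterior, and the names `mes_acquisition`, `mu`, `sigma`, and `max_samples` are hypothetical.

```python
import math


def _normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mes_acquisition(mu, sigma, max_samples):
    """Average, over sampled maxima y*, of
    gamma * pdf(gamma) / (2 * cdf(gamma)) - log(cdf(gamma)),
    where gamma = (y* - mu) / sigma.

    Assumption: mu and sigma come from a Gaussian posterior at one point,
    and max_samples are draws of the function's maximum value.
    """
    if sigma <= 0.0:
        # No posterior uncertainty at this point, so nothing to learn here.
        return 0.0
    total = 0.0
    for y_star in max_samples:
        gamma = (y_star - mu) / sigma
        cdf = max(_normal_cdf(gamma), 1e-12)  # guard against log(0)
        total += gamma * _normal_pdf(gamma) / (2.0 * cdf) - math.log(cdf)
    return total / len(max_samples)
```

Only a handful of scalar samples of the maximum value enter the score, which is consistent with the abstract's remark that the criterion is robust to the number of samples used.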
translated by 谷歌翻译
We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution. This reformulation allows PES to obtain approximations that are both more accurate and efficient than other alternatives such as Entropy Search (ES). Furthermore, PES can easily perform a fully Bayesian treatment of the model hyperparameters while ES cannot. We evaluate PES in both synthetic and real-world applications, including optimization problems in machine learning, finance, biotechnology, and robotics. We show that the increased accuracy of PES leads to significant gains in optimization performance.
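How can we efficiently gather information to optimize an unknown function when presented with multiple, mutually dependent information sources with different costs? For example, when optimizing a robotic system, intelligently trading off computer simulations against real robot tests can lead to significant savings. Existing methods, such as multi-fidelity GP-UCB or Entropy Search-based approaches, either make simplistic assumptions about the interaction among different fidelities or use simple heuristics that lack theoretical guarantees. In this paper, we study multi-fidelity Bayesian optimization with complex structural dependencies among multiple outputs, and propose MF-MI-Greedy, a principled algorithmic framework for addressing this problem. In particular, we model the different fidelities using additive Gaussian processes based on latent structures shared with the target function. We then use cost-sensitive mutual information gain for efficient Bayesian global optimization. We propose a simple notion of regret that incorporates the cost of the different fidelities, and prove that MF-MI-Greedy achieves low regret. We demonstrate the strong empirical performance of our algorithm on both synthetic and real-world datasets.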
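Bayesian optimization is an approach to optimizing objective functions that take a long time (minutes or hours) to evaluate. It is best suited to optimization over continuous domains of fewer than 20 dimensions, and it tolerates stochastic noise in function evaluations. It builds a surrogate for the objective, quantifies the uncertainty in that surrogate using a Bayesian machine learning technique, Gaussian process regression, and then uses an acquisition function defined from this surrogate to decide where to sample. In this tutorial, we describe how Bayesian optimization works, including Gaussian process regression and three common acquisition functions: expected improvement, entropy search, and knowledge gradient. We then discuss more advanced techniques, including running multiple function evaluations in parallel, multi-fidelity and multi-information source optimization, expensive-to-evaluate constraints, random environmental conditions, multi-task Bayesian optimization, and the inclusion of derivative information. We conclude with a discussion of Bayesian optimization software and future research directions in the field. Within our tutorial material we provide a generalization of expected improvement to noisy evaluations, beyond the noise-free setting in which it is more commonly applied. This generalization is justified by a formal decision-theoretic argument, in contrast to previous ad hoc modifications.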
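To make the idea of an acquisition function concrete, here is a minimal sketch of the standard closed-form expected improvement for maximization in the noise-free setting. It assumes the surrogate already supplies a Gaussian posterior mean and standard deviation at each candidate; the function name, the candidate labels, and the numbers are hypothetical illustrations, not taken from the tutorial.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` at a point with posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        # Deterministic prediction: improvement is known exactly.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical posterior summaries (mean, std) at three candidate points.
candidates = {"a": (0.9, 0.05), "b": (0.7, 0.4), "c": (1.0, 0.0)}
best_observed = 1.0
scores = {
    name: expected_improvement(mu, sigma, best_observed)
    for name, (mu, sigma) in candidates.items()
}
next_point = max(scores, key=scores.get)  # candidate to evaluate next
```

The score rewards both a high predicted mean and high uncertainty, so a point with a lower mean but a wide posterior can still be selected over a point that merely matches the current best.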
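Approximate Bayesian computation (ABC) is a method for Bayesian inference when the likelihood is unavailable but simulating from the model is possible. However, many ABC algorithms require a large number of simulations, which can be costly. To reduce the computational cost, Bayesian optimization (BO) and surrogate models such as Gaussian processes have been proposed. Bayesian optimization lets one decide intelligently where to evaluate the model next, but common BO strategies are not designed for the goal of estimating the posterior distribution. Our paper addresses this gap in the literature. We propose to compute the uncertainty in the ABC posterior density that is due to a lack of simulations for estimating this quantity accurately, and we define a loss function that measures this uncertainty. We then propose to select the next evaluation location so as to minimize the expected loss. Experiments show that the proposed method often produces the most accurate approximations compared to common BO strategies.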
We develop parallel predictive entropy search (PPES), a novel algorithm for Bayesian optimization of expensive black-box objective functions. At each iteration, PPES aims to select a batch of points which will maximize the information gain about the global maximizer of the objective. Well known strategies exist for suggesting a single evaluation point based on previous observations, while far fewer are known for selecting batches of points to evaluate in parallel. The few batch selection schemes that have been studied all resort to greedy methods to compute an optimal batch. To the best of our knowledge, PPES is the first non-greedy batch Bayesian optimization strategy. We demonstrate the benefit of this approach in optimization performance on both synthetic and real world applications, including problems in machine learning, rocket science and robotics.
We propose minimum regret search (MRS), a novel acquisition function for Bayesian optimization. MRS bears similarities with information-theoretic approaches such as entropy search (ES). However, while ES aims in each query at maximizing the information gain with respect to the global maximum, MRS aims at minimizing the expected simple regret of its ultimate recommendation for the optimum. While empirically ES and MRS perform similarly in most cases, MRS produces fewer outliers with high simple regret than ES. We provide empirical results both for a synthetic single-task optimization problem and for a simulated multi-task robotic control problem.
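Bayesian optimization and Lipschitz optimization have developed alternative techniques for optimizing black-box functions. Each exploits a different form of prior about the function. In this work, we explore strategies to combine these techniques for better global optimization. In particular, we propose ways to use the Lipschitz continuity assumption within traditional BO algorithms, which we call Lipschitz Bayesian optimization (LBO). This approach does not increase the asymptotic runtime and in some cases drastically improves performance (while in the worst case the performance is similar). Indeed, in a particular setting, we prove that using the Lipschitz information yields the same or a better bound on the regret compared to using Bayesian optimization on its own. Moreover, we propose a simple heuristic to estimate the Lipschitz constant, and prove that a growing estimate of the Lipschitz constant is in some sense "harmless". Our experiments on 15 datasets with 4 acquisition functions show that in the worst case LBO performs similarly to the underlying BO method, while in some cases it performs substantially better. Thompson sampling in particular typically saw drastic improvements (as the Lipschitz information corrected for its well-known "over-exploration" phenomenon), and its LBO variant often outperformed the other acquisition functions.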
Bandit methods for black-box optimisation, such as Bayesian optimisation, are used in a variety of applications including hyper-parameter tuning and experiment design. Recently, multi-fidelity methods have garnered considerable attention since function evaluations have become increasingly expensive in such applications. Multi-fidelity methods use cheap approximations to the function of interest to speed up the overall optimisation process. However, most multi-fidelity methods assume only a finite number of approximations. In many practical applications, however, a continuous spectrum of approximations might be available. For instance, when tuning an expensive neural network, one might choose to approximate the cross validation performance using less data N and/or fewer training iterations T. Here, the approximations are best viewed as arising out of a continuous two dimensional space (N, T). In this work, we develop a Bayesian optimisation method, BOCA, for this setting. We characterise its theoretical properties and show that it achieves better regret than strategies which ignore the approximations. BOCA outperforms several other baselines in synthetic and real experiments.
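Bayesian optimisation (BO) refers to a suite of techniques for the global optimisation of expensive black-box functions, which use introspective Bayesian models of the function to find the optimum efficiently. While BO has been applied successfully in many applications, modern optimisation tasks usher in new challenges where conventional methods fail. In this work, we present Dragonfly, an open source Python library for scalable and robust BO. Dragonfly incorporates multiple recently developed methods that allow BO to be applied in challenging real-world settings; these include better methods for handling higher dimensional domains, methods for handling multi-fidelity evaluations when cheap approximations of an expensive function are available, methods for optimising over structured combinatorial spaces, such as the space of neural network architectures, and methods for handling parallel evaluations. Additionally, we develop new methodological improvements in BO for selecting the Bayesian model, selecting the acquisition function, and optimising over complex domains with different variable types and additional constraints. We compare Dragonfly to a suite of other packages and algorithms for global optimisation and demonstrate that, when the above methods are integrated, they enable significant improvements in the performance of BO. The Dragonfly library is available at dragonfly.github.io.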
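Bayesian optimization (BO) based on Gaussian process models is a powerful paradigm for optimizing black-box functions that are expensive to evaluate. While several BO algorithms provably converge to the global optimum of the unknown function, they assume that the hyperparameters of the kernel are known. This is not the case in practice, and misspecification often causes these algorithms to converge to poor local optima. In this paper, we present the first BO algorithm that is provably no-regret and converges to the optimum without knowledge of the hyperparameters. We slowly adapt the hyperparameters of stationary kernels and thereby expand the associated function class over time, so that the BO algorithm considers more complex function candidates. Based on these theoretical insights, we propose several practical algorithms that achieve the empirical sample efficiency of BO with online hyperparameter estimation, but retain theoretical convergence guarantees. We evaluate our method on several benchmark problems.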
Bayesian optimization with Gaussian processes has become an increasingly popular tool in the machine learning community. It is efficient and can be used when very little is known about the objective function, making it popular in expensive black-box optimization scenarios. It uses Bayesian methods to sample the objective efficiently using an acquisition function which incorporates the posterior estimate of the objective. However, there are several different parameterized acquisition functions in the literature, and it is often unclear which one to use. Instead of using a single acquisition function, we adopt a portfolio of acquisition functions governed by an online multi-armed bandit strategy. We propose several portfolio strategies, the best of which we call GP-Hedge, and show that this method outperforms the best individual acquisition function. We also provide a theoretical bound on the algorithm's performance.
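A portfolio governed by a bandit strategy can be sketched with the generic Hedge rule: each acquisition function keeps a cumulative gain, and one of them is selected with probability proportional to the exponential of that gain. This is only an illustration of the general idea, not the paper's exact procedure; the function names, the learning rate `eta`, and the choice of reward passed to `update_gains` are assumptions.

```python
import math
import random


def hedge_probabilities(gains, eta=1.0):
    """Softmax over cumulative gains (shifted by the max for numerical stability)."""
    top = max(gains)
    weights = [math.exp(eta * (g - top)) for g in gains]
    total = sum(weights)
    return [w / total for w in weights]


def choose_index(probs, rng):
    """Sample which acquisition function's proposal to evaluate."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def update_gains(gains, rewards):
    """Add this round's reward for each portfolio member.

    Assumption: the caller supplies one reward per acquisition function,
    for example the surrogate's predicted value at the point it proposed.
    """
    return [g + r for g, r in zip(gains, rewards)]


# Hypothetical round with a portfolio of three acquisition functions.
rng = random.Random(0)
gains = [0.0, 0.0, 0.0]
probs = hedge_probabilities(gains)   # uniform before any feedback
picked = choose_index(probs, rng)    # index of the function to follow
gains = update_gains(gains, [0.2, 0.5, 0.1])
```

After a few rounds, members whose proposals keep earning higher rewards are followed more often, while the others retain a nonzero chance of being selected.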
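Recently, there has been rising interest in Bayesian optimization, the optimization of an unknown function with assumptions usually expressed by a Gaussian process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy across a range of performance criteria.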
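For reference, the two strategies named here are usually written as below, each with a hand-set trade-off parameter of the kind the abstract says its approach avoids. This is a minimal sketch of the textbook forms, assuming a Gaussian posterior with mean `mu` and standard deviation `sigma` at the candidate point; `beta` and `target` are the hypothetical trade-off parameters.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gp_ucb(mu, sigma, beta):
    """Upper confidence bound: posterior mean plus a scaled standard deviation."""
    return mu + math.sqrt(beta) * sigma


def gp_pi(mu, sigma, target):
    """Probability that the function value at this point exceeds `target`."""
    if sigma <= 0.0:
        return 1.0 if mu > target else 0.0
    return normal_cdf((mu - target) / sigma)
```

A larger `beta` or a more ambitious `target` pushes the search toward uncertain regions, and smaller values favor points that already look good.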
We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments (active user modelling with preferences, and hierarchical reinforcement learning), and a discussion of the pros and cons of Bayesian optimization based on our experiences.
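Randomized experiments are the gold standard for evaluating the effects of changes to real-world systems. Data in these tests may be difficult to collect, and outcomes may have high variance, resulting in potentially large measurement error. Bayesian optimization is a promising technique for efficiently optimizing multiple continuous parameters, but existing approaches degrade in performance when the noise level is high, limiting their applicability to many randomized experiments. We derive an expression for expected improvement under batch optimization with noisy observations and noisy constraints, and develop a quasi-Monte Carlo approximation that allows it to be optimized efficiently. Simulations with synthetic functions show that optimization performance on noisy, constrained problems outperforms existing methods. We further demonstrate the effectiveness of the method with two real-world experiments conducted at Facebook: optimizing a ranking system, and optimizing server compiler flags.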
In many applications of black-box optimization, one can evaluate multiple points simultaneously, e.g. when evaluating the performances of several different neural networks in a parallel computing environment. In this paper, we develop a novel batch Bayesian optimization algorithm, the parallel knowledge gradient method. By construction, this method provides the one-step Bayes-optimal batch of points to sample. We provide an efficient strategy for computing this Bayes-optimal batch of points, and we demonstrate that the parallel knowledge gradient method finds global optima significantly faster than previous batch Bayesian optimization algorithms, both on synthetic test functions and when tuning hyperparameters of practical machine learning algorithms, especially when function evaluations are noisy.
In this paper, we analyze a generic algorithm scheme for sequential global optimization using Gaussian processes. The upper bounds we derive on the cumulative regret for this generic algorithm improve the previously known bounds for algorithms like GP-UCB by an exponential factor. We also introduce the novel Gaussian Process Mutual Information algorithm (GP-MI), which significantly improves these upper bounds on the cumulative regret even further. We confirm the efficiency of this algorithm on synthetic and real tasks against the natural competitor, GP-UCB, and also the Expected Improvement heuristic.
Bayesian Optimisation (BO) is a technique used in optimising a $D$-dimensional function which is typically expensive to evaluate. While there have been many successes for BO in low dimensions, scaling it to high dimensions has been notoriously difficult. Existing literature on the topic is under very restrictive settings. In this paper, we identify two key challenges in this endeavour. We tackle these challenges by assuming an additive structure for the function. This setting is substantially more expressive and contains a richer class of functions than previous work. We prove that, for additive functions, the regret has only linear dependence on $D$ even though the function depends on all $D$ dimensions. We also demonstrate several other statistical and computational benefits in our framework. Via synthetic examples, a scientific simulation and a face detection problem, we demonstrate that our method outperforms naive BO on additive functions and on several examples where the function is not additive.
We consider Bayesian methods for multi-information source optimization (MISO), in which we seek to optimize an expensive-to-evaluate black-box objective function while also accessing cheaper but biased and noisy approximations ("information sources"). We present a novel algorithm that outperforms the state of the art for this problem by using a Gaussian process covariance kernel better suited to MISO than those used by previous approaches, and an acquisition function based on a one-step optimality analysis supported by efficient parallelization. We also provide a novel technique to guarantee the asymptotic quality of the solution provided by this algorithm. Experimental evaluations demonstrate that this algorithm consistently finds designs of higher value at less cost than previous approaches.
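We apply numerical methods in combination with finite-difference time-domain (FDTD) simulations to optimize the transmission properties of plasmonic mirror color filters, using a multi-objective figure of merit over a five-dimensional parameter space and a novel multi-fidelity Gaussian process approach. We compare these results with conventional derivative-free global search algorithms, such as a (single-fidelity) Gaussian process optimization scheme and particle swarm optimization, a method commonly used in the nanophotonics community that is implemented in the Lumerical commercial photonics software. We demonstrate the performance of various numerical optimization approaches on several pre-collected real-world datasets and show that, by properly trading off expensive information sources against cheap simulations, one can optimize the transmission properties more effectively under a fixed budget.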