Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity surrogate modeling strategy in which the multi-fidelity surrogate is assembled through active learning, with an on-the-fly model adequacy assessment embedded in a subset simulation framework for efficient reliability analysis. The multi-fidelity surrogate is assembled by first applying a Gaussian process correction to each low-fidelity model and assigning a model probability based on the model's local predictive accuracy and cost. Three strategies are proposed to fuse these individual surrogates into an overall surrogate model based on model averaging and deterministic/stochastic model selection. The strategies also dictate which model evaluations are necessary. No assumptions are made about the relationships between low-fidelity models, while the high-fidelity model is assumed to be the most accurate and most computationally expensive model. Through two analytical and two numerical case studies, including a case study evaluating the failure probability of tristructural isotropic (TRISO)-coated nuclear fuels, the algorithm is shown to be highly accurate while drastically reducing the number of high-fidelity model calls (and hence computational cost).
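
The Gaussian process correction applied to each low-fidelity model can be written, in one common additive form, as follows. This is a sketch under assumptions: the symbols $y_{\mathrm{LF},i}$, $\delta_i$, $m_i$, and $k_i$ are illustrative notation and are not taken from the abstract, which does not state the exact form of the correction.

```latex
\hat{y}_{\mathrm{HF},i}(x) = y_{\mathrm{LF},i}(x) + \delta_i(x),
\qquad
\delta_i(x) \sim \mathcal{GP}\big(m_i(x),\, k_i(x, x')\big)
```

Under this reading, each discrepancy $\delta_i$ is trained on the differences between high-fidelity and low-fidelity outputs at the points where both have been evaluated, and its predictive variance gives a local measure of how far the corrected low-fidelity model can be trusted.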
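Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel, and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We used coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO fuels using several 1D and 2D models. With multifidelity modeling, we replaced expensive high-fidelity (HF) model evaluations with information fusion from two low-fidelity (LF) models. For the 1D TRISO models, we considered three multifidelity modeling strategies: Kriging only, Kriging LF prediction plus Kriging correction, and deep neural network (DNN) LF prediction plus Kriging correction. While the results across these multifidelity modeling strategies compared satisfactorily, the strategies employing information fusion from two LF models consistently called the HF model least often. Next, for the 2D TRISO model, we considered two multifidelity modeling strategies: DNN LF prediction plus Kriging correction (data-driven) and 1D TRISO LF prediction plus Kriging correction (physics-based). As expected, the physics-based strategy consistently required the fewest calls to the HF model. However, the data-driven strategy had a lower overall simulation time, since the DNN predictions are instantaneous whereas the 1D TRISO model requires a non-negligible simulation time.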
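This paper presents a new efficient and robust method for rare event probability estimation for computational models of an engineering product or process that return categorical information only, for example, success or failure. For such models, most methods designed for estimating failure probability, which use the numerical value of the outcome to compute gradients or to estimate the proximity to the failure surface, cannot be applied. Even if the performance function provides more than just binary output, the state of the system may be a non-smooth or even discontinuous function defined in the domain of continuous input variables. In these cases, classical gradient-based methods usually fail. We propose a simple yet efficient algorithm that performs a sequential adaptive selection of points from the input domain of random variables to extend and refine a simple distance-based surrogate model. Two different tasks can be accomplished at any stage of the sequential sampling: (i) estimation of the failure probability, and (ii) selection of the best possible candidate for the subsequent model evaluation if further improvement is necessary. The proposed criterion for selecting the next point for model evaluation maximizes the expected probability classified by using the candidate. Therefore, the balance between global exploration and local exploitation is maintained automatically. The method can estimate the probabilities of multiple failure types. Moreover, when the numerical value of the model evaluation can be used to build a smooth surrogate, the algorithm can accommodate this information to increase the accuracy of the estimated probabilities. Lastly, we define a new simple yet general geometrical measure of the global sensitivity of the rare-event probability to individual variables, which is obtained as a by-product of the proposed algorithm.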
We present the GPry algorithm for fast Bayesian inference of general (non-Gaussian) posteriors with a moderate number of parameters. GPry does not need any pre-training or special hardware such as GPUs, and is intended as a drop-in replacement for traditional Monte Carlo methods for Bayesian inference. Our algorithm is based on generating a Gaussian Process surrogate model of the log-posterior, aided by a Support Vector Machine classifier that excludes extreme or non-finite values. An active learning scheme allows us to reduce the number of required posterior evaluations by two orders of magnitude compared to traditional Monte Carlo inference. Our algorithm allows for parallel evaluations of the posterior at optimal locations, further reducing wall-clock times. We significantly improve performance using properties of the posterior in our active learning scheme and for the definition of the GP prior. In particular, we account for the expected dynamical range of the posterior in different dimensionalities. We test our model against a number of synthetic and cosmological examples. GPry outperforms traditional Monte Carlo methods when the evaluation time of the likelihood (or the calculation of theoretical observables) is of the order of seconds; for evaluation times of over a minute it can perform in days inference that would take months using traditional methods. GPry is distributed as an open source Python package (pip install gpry) and can also be found at https://github.com/jonaselgammal/GPry.
This paper presents a surrogate modelling technique based on domain partitioning for Bayesian parameter inference of highly nonlinear engineering models. In order to alleviate the computational burden typically involved in Bayesian inference applications, a multielement Polynomial Chaos Expansion based Kriging metamodel is proposed. The developed surrogate model combines, in a piecewise function, an array of local Polynomial Chaos based Kriging metamodels constructed on a finite set of non-overlapping subdomains of the stochastic input space. As a result, non-smoothness in the response of the forward model (e.g., nonlinearities and sparseness) can be reproduced by the proposed metamodel at minimum computational cost owing to its local adaptation capabilities. The model parameter inference is conducted through a Markov chain Monte Carlo approach comprising adaptive exploration and delayed rejection. The efficiency and accuracy of the proposed approach are validated through two case studies, including an analytical benchmark and a numerical case study. The latter relates to the partial differential equation governing the hydrogen diffusion phenomenon of metallic materials in Thermal Desorption Spectroscopy tests.
Explicitly accounting for uncertainties is paramount to the safety of engineering structures. Optimization, which is often carried out at the early stage of the structural design, offers an ideal framework for this task. When the uncertainties mainly affect the objective function, robust design optimization is traditionally considered. This work further assumes the existence of multiple and competing objective functions that need to be dealt with simultaneously. The optimization problem is formulated by considering quantiles of the objective functions, which allows for the combination of both optimality and robustness in a single metric. By introducing the concept of common random numbers, the resulting nested optimization problem may be solved using a general-purpose solver, herein the non-dominated sorting genetic algorithm (NSGA-II). The computational cost of such an approach is, however, a serious hurdle to its application in real-world problems. We therefore propose a surrogate-assisted approach using Kriging as an inexpensive approximation of the associated computational model. The proposed approach consists of sequentially carrying out NSGA-II while using an adaptively built Kriging model to estimate the quantiles. Finally, the methodology is adapted to account for mixed categorical-continuous parameters, as the applications involve the selection of qualitative design parameters as well. The methodology is first applied to two analytical examples showing its efficiency. The third application relates to the selection of optimal renovation scenarios of a building considering both its life cycle cost and environmental impact. It shows that when it comes to renovation, the heating system replacement should be the priority.
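
The role of common random numbers in the quantile formulation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy objective `cost`, the helper names, the 0.9 quantile level, and the grid search that stands in for NSGA-II are not taken from the paper.

```python
import random


def cost(design, xi):
    """Hypothetical noisy objective: quadratic in the design plus design-dependent noise."""
    return (design - 1.0) ** 2 + (0.5 + 0.1 * design) * xi


def empirical_quantile(values, alpha):
    """Empirical alpha-quantile using a simple order-statistic definition."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(alpha * len(ordered)))
    return ordered[index]


def make_quantile_objective(n_samples, alpha, seed):
    """Draw the noise sample once so that every design is scored on the same realizations."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    def objective(design):
        return empirical_quantile([cost(design, xi) for xi in noise], alpha)

    return objective


q90 = make_quantile_objective(n_samples=2000, alpha=0.9, seed=0)
designs = [0.0, 0.5, 1.0, 1.5, 2.0]
scores = {d: q90(d) for d in designs}
best = min(scores, key=scores.get)  # grid search stands in for the outer optimizer
```

Because the noise sample is fixed inside `make_quantile_objective`, the quantile becomes a deterministic function of the design, so repeated calls agree and an outer optimizer is not misled by sampling noise when it compares two designs.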
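Extreme events in society and nature, such as pandemic spikes, rogue waves, or structural failures, can have catastrophic consequences. Characterizing extremes is difficult because they occur rarely, arise from seemingly benign conditions, and belong to complex and often unknown infinite-dimensional systems. Such challenges render attempts at characterizing them moot. We address these difficulties by combining novel training schemes in Bayesian experimental design (BED) with an ensemble of deep neural operators (DNOs). This model-agnostic framework pairs a BED scheme that actively selects data for quantifying extreme events with an ensemble of DNOs that approximate infinite-dimensional nonlinear operators. We find that not only does this framework clearly beat Gaussian processes (GPs), but also that 1) shallow ensembles of just two members perform best; 2) extremes are uncovered regardless of the state of the initial data (i.e., with or without extremes); 3) our method eliminates "double-descent" phenomena; 4) the use of suboptimal acquisition points, compared with step-by-step global optima, does not hinder BED performance; and 5) Monte Carlo acquisition outperforms standard optimizers in high dimensions. Together, these conclusions form the foundation of an AI-assisted experimental infrastructure that can efficiently infer and pinpoint critical situations across many domains, from physical to societal systems.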
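Multi-fidelity modeling and calibration are data fusion tasks that arise ubiquitously in engineering design. In this paper, we introduce a novel approach based on latent-map Gaussian processes (LMGPs) that enables efficient and accurate data fusion. In our approach, we convert data fusion into a latent space learning problem in which the relations among different data sources are learned automatically. This conversion endows our approach with attractive advantages such as increased accuracy, reduced costs, the flexibility to jointly fuse any number of data sources, and the ability to visualize correlations between data sources. This visualization allows the user to detect model form errors or to determine the optimum strategy for high-fidelity emulation by fitting an LMGP only to the subset of data sources that are well correlated. We also develop a new kernel function that enables LMGPs not only to build a probabilistic multi-fidelity surrogate but also to estimate calibration parameters with high accuracy and consistency. Compared with existing technologies, our approach is considerably simpler to implement and use and is less prone to numerical issues. We demonstrate the benefits of LMGP-based data fusion by comparing its performance against competing approaches on a wide range of examples.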
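In this work, we propose a new Gaussian process regression (GPR) method: physics-informed Kriging (PhIK). In standard data-driven Kriging, the unknown function of interest is usually treated as a Gaussian process with an assumed stationary covariance whose hyperparameters are estimated from data. In PhIK, we compute the mean and covariance function from realizations of available stochastic models, e.g., from realizations of the solutions of governing stochastic partial differential equations. Such a constructed Gaussian process is generally non-stationary and does not assume a specific form of the covariance. Our approach avoids the optimization step that data-driven GPR methods use to identify the hyperparameters. More importantly, we prove that physical constraints in the form of a deterministic linear operator are guaranteed in the resulting prediction. We also provide an error estimate for the preservation of the physical constraints when the stochastic model realizations contain errors. To reduce the computational cost of obtaining stochastic model realizations, we propose a multilevel Monte Carlo estimate of the mean and covariance functions. Further, we present an active learning algorithm that guides the selection of additional observation locations. The efficiency and accuracy of PhIK are demonstrated by reconstructing a partially known modified Branin function, studying a three-dimensional heat transfer problem, and learning a conservative tracer distribution from sparse concentration measurements.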
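
The idea of computing the mean and covariance from model realizations, rather than fitting a parametric kernel, can be sketched as follows. This is an illustration under assumptions: the toy stochastic model `simulate_realization` (a sine with a random amplitude), the grid, and the ensemble size are hypothetical, and plain Monte Carlo is used instead of the multilevel estimator the abstract describes.

```python
import math
import random


def simulate_realization(grid, rng):
    """Hypothetical stochastic model: u(x) = a * sin(x) with a random amplitude a."""
    amplitude = rng.gauss(1.0, 0.2)
    return [amplitude * math.sin(x) for x in grid]


def ensemble_mean_and_covariance(realizations):
    """Sample mean vector and unbiased sample covariance matrix of the ensemble."""
    n = len(realizations)
    m = len(realizations[0])
    mean = [sum(r[j] for r in realizations) / n for j in range(m)]
    cov = [
        [
            sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in realizations) / (n - 1)
            for j in range(m)
        ]
        for i in range(m)
    ]
    return mean, cov


rng = random.Random(0)
grid = [0.5, 1.0, 1.5, 2.0]
ensemble = [simulate_realization(grid, rng) for _ in range(500)]
mean, cov = ensemble_mean_and_covariance(ensemble)
```

In this toy case the variance `cov[i][i]` scales with `sin(grid[i]) ** 2`, so it changes from point to point. The resulting prior is therefore non-stationary without any kernel form having been chosen or any hyperparameter optimized.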
In a fissile material, the inherent multiplicity of neutrons born through induced fissions leads to correlations in their detection statistics. The correlations between neutrons can be used to trace back some characteristics of the fissile material. This technique known as neutron noise analysis has applications in nuclear safeguards or waste identification. It provides a non-destructive examination method for an unknown fissile material. This is an example of an inverse problem where the cause is inferred from observations of the consequences. However, neutron correlation measurements are often noisy because of the stochastic nature of the underlying processes. This makes the resolution of the inverse problem more complex since the measurements are strongly dependent on the material characteristics. A minor change in the material properties can lead to very different outputs. Such an inverse problem is said to be ill-posed. For an ill-posed inverse problem the inverse uncertainty quantification is crucial. Indeed, seemingly low noise in the data can lead to strong uncertainties in the estimation of the material properties. Moreover, the analytical framework commonly used to describe neutron correlations relies on strong physical assumptions and is thus inherently biased. This paper addresses dual goals. Firstly, surrogate models are used to improve neutron correlations predictions and quantify the errors on those predictions. Then, the inverse uncertainty quantification is performed to include the impact of measurement error alongside the residual model bias.
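Bayesian optimization has emerged at the forefront of expensive black-box optimization due to its data efficiency. Recent years have witnessed a proliferation of studies on the development of new Bayesian optimization algorithms and their applications. Hence, this paper attempts to provide a comprehensive and updated survey of recent advances in Bayesian optimization and to identify interesting open problems. We categorize the existing work on Bayesian optimization into nine main groups according to the motivations and focus of the proposed algorithms. For each category, we present the main advances with respect to the construction of surrogate models and the adaptation of acquisition functions. Finally, we discuss open questions and suggest promising future research directions, in particular with regard to heterogeneity, privacy preservation, and fairness in distributed and federated optimization systems.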
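We consider the problem of black-box multi-objective optimization (MOO) using expensive function evaluations (also referred to as experiments), where the goal is to approximate the true Pareto set of solutions while minimizing the total resource cost of experiments. For example, in hardware design optimization, we need to find designs that trade off performance, energy, and area overhead using expensive computational simulations. The key challenge is to select the sequence of experiments that uncovers high-quality solutions using minimal resources. In this paper, we propose a general framework for solving MOO problems based on the principle of output space entropy (OSE) search: select the experiment that maximizes the information gained per unit resource cost about the true Pareto front. We appropriately instantiate the principle of OSE search to derive efficient algorithms for the following four MOO problem settings: 1) the most basic single-fidelity setting, where experiments are expensive and accurate; 2) handling black-box constraints, which cannot be evaluated without performing experiments; 3) the discrete multi-fidelity setting, where experiments can vary in the amount of resources consumed and in their evaluation accuracy; and 4) the continuous-fidelity setting, where continuous function approximations result in a huge space of experiments. Experiments on diverse synthetic and real-world benchmarks show that our OSE search based algorithms improve over state-of-the-art methods in terms of both computational efficiency and accuracy of MOO solutions.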
Surrogate models have been shown to be an extremely efficient aid in solving engineering problems that require repeated evaluations of an expensive computational model. They are built by sparsely evaluating the costly original model and have provided a way to solve otherwise intractable problems. A crucial aspect of surrogate modelling is the assumption that the model to be approximated is smooth and regular. This assumption is, however, not always met in reality. For instance, in civil or mechanical engineering, some models may present discontinuities or non-smoothness, e.g., in the case of instability patterns such as buckling or snap-through. Building a single surrogate model capable of accounting for these fundamentally different behaviors or discontinuities is not an easy task. In this paper, we propose a three-stage approach for the approximation of non-smooth functions which combines clustering, classification and regression. The idea is to split the space following the localized behaviors or regimes of the system and build local surrogates that are eventually assembled. A sequence of well-known machine learning techniques is used: Dirichlet process mixture models (DPMM), support vector machines and Gaussian process modelling. The approach is tested and validated on two analytical functions and a finite element model of a tensile membrane structure.
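Sampling-based inference techniques are central to modern cosmological data analysis; these methods, however, scale poorly with dimensionality and typically require approximate or intractable likelihoods. In this paper, we describe how Truncated Marginal Neural Ratio Estimation (TMNRE), a new approach in so-called simulation-based inference, naturally evades these issues, improving the (i) efficiency, (ii) scalability, and (iii) trustworthiness of the inferred posteriors. Using measurements of the Cosmic Microwave Background (CMB), we show that TMNRE can achieve converged posteriors using orders of magnitude fewer simulator calls than conventional Markov Chain Monte Carlo (MCMC) methods. Remarkably, the required number of samples is effectively independent of the number of nuisance parameters. In addition, a property called local amortization allows the performance of rigorous statistical consistency checks that are not accessible to sampling-based methods. TMNRE promises to become a powerful tool for cosmological data analysis, particularly in the context of extended cosmologies, where the timescale required for conventional sampling-based inference methods to converge can greatly exceed that of simple cosmological models such as $\Lambda$CDM. To perform these computations, we use an implementation of TMNRE via the open-source code swyft.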
The saddle point (SP) calculation is a grand challenge for computationally intensive energy functions in computational chemistry, where the saddle point may represent the transition state (TS). Traditional methods need to evaluate the gradients of the energy function at a very large number of locations. To reduce the number of expensive computations of the true gradients, we propose an active learning framework consisting of a statistical surrogate model, Gaussian process regression (GPR), for the energy function, and a single-walker dynamics method, gentlest ascent dynamics (GAD), for the saddle-type transition states. The SP is detected by applying GAD to the GPR surrogate for the gradient vector and the Hessian matrix. Our key ingredient for efficiency improvements is an active learning method which sequentially designs the most informative locations and evaluates the original model at these locations to train the GPR. We formulate this active learning task as an optimal experimental design problem and propose a very efficient sample-based sub-optimal criterion to construct the optimal locations. We show that the new method significantly decreases the required number of energy or force evaluations of the original model.
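Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and irreproducible manual trial-and-error process for finding well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods can be employed, e.g., methods based on resampling error estimation for supervised machine learning. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband, and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.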
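Bayesian optimization provides an effective method to optimize expensive black-box functions. It has recently been applied to problems in fluid dynamics. This paper studies and compares common Bayesian optimization algorithms empirically on a range of synthetic test functions. It investigates the choice of acquisition function and number of training samples, exact calculation of acquisition functions versus Monte Carlo based approaches, and both single-point and multi-point optimization. The test functions are considered to cover a diverse set of challenges and are therefore an ideal test bed for understanding the performance of Bayesian optimization and for identifying general situations where Bayesian optimization performs well and poorly. This knowledge can be utilized in applications, including those in fluid dynamics, where objective functions are unknown. The results of this investigation show that the choices to be made are less relevant for relatively simple functions, while optimistic acquisition functions such as Upper Confidence Bound should be preferred for more complex objective functions. Furthermore, results from the Monte Carlo approach are comparable to results from analytical acquisition functions. In instances where the objective function allows parallel evaluations, the multi-point approach offers a quicker alternative, yet it may potentially require more objective function evaluations.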
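
The difference between an optimistic acquisition function and a more conservative one can be sketched with the standard analytical formulas for Upper Confidence Bound and Expected Improvement (maximization convention). The candidate predictions, the incumbent value, and `kappa=2.0` are hypothetical values chosen for illustration; they do not come from the study.

```python
import math


def upper_confidence_bound(mean, std, kappa=2.0):
    """Optimistic score: predictive mean plus kappa predictive standard deviations."""
    return mean + kappa * std


def expected_improvement(mean, std, best_so_far):
    """Analytical expected improvement for maximization under a Gaussian predictive."""
    if std <= 0.0:
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


# Hypothetical surrogate predictions (mean, std) at three candidate points.
candidates = {"a": (1.0, 0.1), "b": (0.5, 0.5), "c": (0.2, 0.2)}
best_so_far = 0.9

ucb_choice = max(candidates, key=lambda k: upper_confidence_bound(*candidates[k]))
ei_choice = max(
    candidates, key=lambda k: expected_improvement(*candidates[k], best_so_far)
)
```

With these numbers, Upper Confidence Bound selects the uncertain candidate `b` while Expected Improvement selects the safer candidate `a`, which shows the more exploratory behavior of the optimistic criterion.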
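We propose a deep importance sampling method that is suitable for estimating rare event probabilities in high-dimensional problems. We approximate the optimal importance distribution in a general importance sampling problem as the pushforward of a reference distribution under a composition of order-preserving transformations, in which each transformation is formed by a squared tensor-train decomposition. The squared tensor-train decomposition provides a scalable ansatz for building order-preserving high-dimensional transformations via density approximations. The use of a composition of maps moving along a sequence of bridging densities alleviates the difficulty of directly approximating concentrated density functions. To compute expectations over unnormalized probability distributions, we design a ratio estimator that estimates the normalizing constant using a separate importance distribution, again constructed via a composition of transformations in tensor-train format. This offers better theoretical variance reduction compared with self-normalized importance sampling, and thus opens the door to efficient computation of rare event probabilities in Bayesian inference problems. Numerical experiments on problems constrained by differential equations show little to no increase in computational complexity as the event probability goes to zero, and allow the computation of hitherto unattainable estimates of rare event probabilities for complex, high-dimensional posterior densities.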
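
The basic importance sampling estimator that this line of work builds on can be sketched in one dimension. This is not the tensor-train method of the abstract: the shifted Gaussian proposal, the function name, the threshold, and the sample size are assumptions chosen so that the result can be checked against a closed-form tail probability.

```python
import math
import random


def rare_event_probability(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) with the proposal N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio p(x) / q(x) of the two unit-variance Gaussians.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


estimate = rare_event_probability(threshold=4.0, n_samples=20000)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # closed-form Gaussian tail
relative_error = abs(estimate - exact) / exact
```

Centering the proposal on the threshold places about half of the samples in the failure region. Plain Monte Carlo with the same budget would be expected to see fewer than one failure, since the exact probability is of order 1e-5.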
Bayesian optimization (BO) is increasingly employed in critical applications such as materials design and drug discovery. An increasingly popular strategy in BO is to forgo the sole reliance on high-fidelity data and instead use an ensemble of information sources which provide inexpensive low-fidelity data. The overall premise of this strategy is to reduce the overall sampling costs by querying inexpensive low-fidelity sources whose data are correlated with high-fidelity samples. Here, we propose a multi-fidelity cost-aware BO framework that dramatically outperforms the state-of-the-art technologies in terms of efficiency, consistency, and robustness. We demonstrate the advantages of our framework on analytic and engineering problems and argue that these benefits stem from our two main contributions: (1) we develop a novel acquisition function for multi-fidelity cost-aware BO that safeguards the convergence against the biases of low-fidelity data, and (2) we tailor a newly developed emulator for multi-fidelity BO which enables us to not only simultaneously learn from an ensemble of multi-fidelity datasets, but also identify the severely biased low-fidelity sources that should be excluded from BO.
We propose a novel model-agnostic, data-driven framework for time-dependent reliability analysis. The proposed approach, referred to as MAntRA, combines interpretable machine learning, Bayesian statistics, and the identification of stochastic dynamic equations to evaluate the reliability of stochastically excited dynamical systems for which the governing physics is a priori unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and for aleatoric uncertainty due to environmental effects and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability of failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.