Estimating how uncertain an AI system is in its predictions is important to improve the safety of such systems. Predictive uncertainty can result from uncertainty in model parameters, irreducible data uncertainty, and uncertainty due to distributional mismatch between the test and training data distributions. Different actions might be taken depending on the source of the uncertainty, so it is important to be able to distinguish between them. Recently, baseline tasks and metrics have been defined and several practical methods to estimate uncertainty have been developed. These methods, however, attempt to model uncertainty due to distributional mismatch either implicitly through model uncertainty or as data uncertainty. This work proposes a new framework for modeling predictive uncertainty called Prior Networks (PNs), which explicitly models distributional uncertainty. PNs do this by parameterizing a prior distribution over predictive distributions. This work focuses on uncertainty for classification and evaluates PNs on the tasks of identifying out-of-distribution (OOD) samples and detecting misclassification on the MNIST and CIFAR-10 datasets, where they are found to outperform previous methods. Experiments on synthetic and MNIST data show that, unlike previous non-Bayesian methods, PNs are able to distinguish between data and distributional uncertainty.
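To make the idea of a prior over predictive distributions concrete, here is a minimal sketch. It assumes the prior is a Dirichlet with concentration parameters `alpha`; the abstract does not name the distribution family, so this choice, the function names, and the series-based `digamma` helper are all illustrative assumptions. The sketch splits the entropy of the expected prediction into an expected data-uncertainty term and a mutual-information term that can be read as distributional uncertainty.

```python
import math

def digamma(x):
    """Approximate digamma for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:               # shift x upward so the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def dirichlet_uncertainties(alpha):
    """Decompose uncertainty for an assumed Dirichlet(alpha) over class probabilities."""
    alpha0 = sum(alpha)
    mean = [a / alpha0 for a in alpha]                   # expected categorical
    total = -sum(m * math.log(m) for m in mean)          # entropy of the mean
    expected_data = -sum(
        m * (digamma(a + 1.0) - digamma(alpha0 + 1.0))   # expected entropy
        for m, a in zip(mean, alpha)
    )
    return {"total": total, "data": expected_data,
            "distributional": total - expected_data}     # mutual information

# Same expected prediction (uniform), very different concentration.
flat = dirichlet_uncertainties([1.0, 1.0, 1.0])          # low precision
sharp = dirichlet_uncertainties([100.0, 100.0, 100.0])   # high precision
```

Both inputs give the same uniform expected prediction, so their total entropy matches; only the mutual-information term separates the flat case from the sharply concentrated one, which is the kind of distinction the abstract describes.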
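Popular approaches for quantifying predictive uncertainty in deep neural networks often involve a set of weights or models, for instance via ensembling or Monte Carlo dropout. These techniques usually incur overhead because multiple model instances must be trained, or they fail to produce very diverse predictions. This survey aims to familiarize the reader with an alternative class of models based on the concept of Evidential Deep Learning: for unfamiliar data, they admit "what they don't know" and fall back on a prior belief. Furthermore, they allow uncertainty estimation in a single model and a single forward pass by parameterizing distributions over distributions. This survey recapitulates existing work, focusing on implementation in the classification setting. Finally, we survey applications of the same paradigm to regression problems. We also reflect on these approaches in comparison with existing methods and present the most central theoretical results in order to inform future research.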
It is known that neural networks have the problem of being over-confident when directly using the output label distribution to generate uncertainty measures. Existing methods mainly resolve this issue by retraining the entire model to impose the uncertainty quantification capability so that the learned model can achieve desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive and may not be feasible in many situations. In this work, we consider a more practical post-hoc uncertainty learning setting, where a well-trained base model is given, and we focus on the uncertainty quantification task at the second stage of training. We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities, which is effective and computationally efficient. Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties and easily adapt to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications over multiple representative image classification benchmarks.
We introduce ensembles of stochastic neural networks to approximate the Bayesian posterior, combining stochastic methods such as dropout with deep ensembles. The stochastic ensembles are formulated as families of distributions and trained to approximate the Bayesian posterior with variational inference. We implement stochastic ensembles based on Monte Carlo dropout, DropConnect and a novel non-parametric version of dropout and evaluate them on a toy problem and CIFAR image classification. For CIFAR, the stochastic ensembles are quantitatively compared to published Hamiltonian Monte Carlo results for a ResNet-20 architecture. We also test the quality of the posteriors directly against Hamiltonian Monte Carlo simulations in a simplified toy model. Our results show that in a number of settings, stochastic ensembles provide more accurate posterior estimates than regular deep ensembles.
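As we move away from the data, predictive uncertainty should increase, since a wide variety of explanations are consistent with the little available information. We introduce Distance-Aware Prior (DAP) calibration, a method for correcting the overconfidence of Bayesian deep learning models outside the training domain. We define DAPs as prior distributions over the model parameters that depend on the inputs through a measure of their distance from the training set. DAP calibration is agnostic to the posterior inference method and can be performed as a post-processing step. We demonstrate its effectiveness against several baselines on a variety of classification and regression problems, including benchmarks designed to test the quality of predictive distributions away from the data.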
We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). While most works in the literature that use GANs to generate out-of-distribution (OoD) examples focus only on the evaluation of OoD detection, we present a GAN-based approach to learn a classifier that produces proper uncertainties for OoD examples as well as for false positives (FPs). Instead of shielding the entire in-distribution data with GAN-generated OoD examples, which is the state of the art, we shield each class separately with out-of-class examples generated by a conditional GAN and complement this with a one-vs-all image classifier. In our experiments, in particular on CIFAR10, CIFAR100 and Tiny ImageNet, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training-based classifiers. Furthermore, we find that the generated GAN examples do not significantly affect the calibration error of our classifier and result in a significant gain in model accuracy.
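Quantification of uncertainty is critical to the adoption of machine learning, especially for rejecting out-of-distribution (OOD) data back to human experts for review. Yet progress has been slow, as a balance must be struck between computational efficiency and the quality of uncertainty estimates. For this reason, many use deep ensembles of neural networks or Monte Carlo dropout to obtain reasonable uncertainty estimates at relatively minimal compute and memory cost. Surprisingly, when we focus on the real-world constraint of a $\leq 1\%$ false positive rate (FPR), prior methods fail to reliably detect OOD samples. Notably, even Gaussian random noise fails to trigger these popular OOD techniques. We help alleviate this problem by devising a simple adversarial training scheme that incorporates an attack on the epistemic uncertainty predicted by the dropout ensemble. We demonstrate that this method improves OOD detection performance on standard data (i.e., data that is not adversarially crafted) and improves the standardized partial AUC from near-random guessing performance to $\geq 0.75$.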
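The low-FPR operating point mentioned above can be illustrated with a short sketch. It assumes that a higher score means more uncertainty, that an input is flagged as OOD when its score exceeds a threshold, and that a false positive is an in-distribution input flagged as OOD; the function name and the toy scores are hypothetical and are not taken from the paper.

```python
def detection_rate_at_fpr(id_scores, ood_scores, max_fpr=0.01):
    """Fraction of OOD inputs flagged while flagging at most max_fpr of ID inputs.

    Assumes higher score = more uncertain; flagged means score > threshold.
    """
    ranked = sorted(id_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))   # ID inputs we are allowed to flag
    threshold = ranked[allowed]            # (allowed + 1)-th largest ID score
    flagged = sum(1 for s in ood_scores if s > threshold)
    return flagged / len(ood_scores)

# Toy scores: 100 in-distribution inputs and 4 OOD inputs.
id_scores = [i / 100 for i in range(100)]
ood_scores = [0.5, 0.985, 0.995, 1.2]
rate = detection_rate_at_fpr(id_scores, ood_scores)
```

Because the threshold is set from the in-distribution scores alone, an OOD input with a moderate score (0.5 in the toy data) is missed, which is the failure mode the abstract points to at strict FPR budgets.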
Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high-confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimating predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles. However, their practicality in real-time, industrial-scale applications is limited due to their high memory and computational cost. Furthermore, ensembles and BNNs do not necessarily fix all the issues with the underlying member networks. In this work, we study principled approaches to improving the uncertainty properties of a single network, based on a single, deterministic representation. By formalizing uncertainty quantification as a minimax learning problem, we first identify distance awareness, i.e., the model's ability to quantify the distance of a testing example from the training data, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs with two simple changes: (1) applying spectral normalization to hidden weights to enforce bi-Lipschitz smoothness in representations and (2) replacing the last output layer with a Gaussian process layer. On a suite of vision and language understanding benchmarks, SNGP outperforms other single-model approaches in prediction, calibration and out-of-domain detection. Furthermore, SNGP provides complementary benefits to popular techniques such as deep ensembles and data augmentation, making it a simple and scalable building block for probabilistic deep learning. Code is open-sourced at https://github.com/google/uncertainty-baselines
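The first of the two changes, spectral normalization of a weight matrix, can be sketched with plain power iteration. This is an illustrative sketch for a small dense matrix, not the SNGP code from the linked repository; the helper names, the fixed start vector, and the `bound` parameter are assumptions made here.

```python
import math

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def spectral_norm(W, iters=100):
    """Estimate the largest singular value of W by power iteration."""
    rows, cols = len(W), len(W[0])
    # Fixed non-uniform start vector; a random start is the usual choice.
    v = _normalize([1.0 + j for j in range(cols)])
    for _ in range(iters):
        u = _normalize([sum(w * x for w, x in zip(row, v)) for row in W])       # W v
        v = _normalize([sum(W[i][j] * u[i] for i in range(rows))                # W^T u
                        for j in range(cols)])
    Wv = [sum(w * x for w, x in zip(row, v)) for row in W]
    return math.sqrt(sum(x * x for x in Wv))

def spectral_normalize(W, bound=1.0):
    """Rescale W so its spectral norm does not exceed `bound` (assumed hyperparameter)."""
    sigma = spectral_norm(W)
    scale = bound / sigma if sigma > bound else 1.0
    return [[w * scale for w in row] for row in W]

W = [[2.0, 0.0], [0.0, 0.5]]
W_sn = spectral_normalize(W)   # largest singular value is scaled from 2 down to 1
```

Bounding the largest singular value caps how much a layer can stretch distances between inputs, which is one half of the bi-Lipschitz condition named in the abstract; the lower bound on distances is not addressed by this rescaling.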
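Ensembles of independently trained neural networks are a state-of-the-art approach to estimating predictive uncertainty in deep learning, and can be interpreted as an approximation of the posterior distribution via a mixture of delta functions. The training of ensembles relies on the non-convexity of the loss landscape and the random initialization of their individual members, which leaves the resulting posterior approximation uncontrolled. This paper proposes a novel and principled method to address this limitation by minimizing an $f$-divergence between the true posterior and a kernel density estimator (KDE) in function space. We analyze this objective from a combinatorial point of view and show that it is submodular with respect to the mixture components for any $f$. Subsequently, we consider the problem of greedy ensemble construction. From the marginal gain in the negative $f$-divergence, which quantifies the improvement in posterior approximation obtained by adding a new component to the KDE, we derive a novel diversity term for ensemble methods. The performance of our approach is demonstrated on computer vision out-of-distribution detection benchmarks across a range of architectures trained on multiple datasets. The source code of our method is publicly available at https://github.com/oulu-imeds/greedy_ensembles_training.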
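In image classification, there has been much development in detecting out-of-distribution (OOD) data. However, most OOD detection methods are evaluated on a standard set of datasets that differ arbitrarily from the training data. There is no clear definition of what makes a "good" OOD dataset. Furthermore, state-of-the-art OOD detection methods already achieve near-perfect results on these standard benchmarks. In this paper, we define two categories of OOD data using the subtly different notions of perceptual/visual and semantic similarity to in-distribution (ID) data. We define near-OOD samples as perceptually similar to but semantically different from ID samples, and shifted samples as points that are visually different from but semantically similar to ID data. We then propose a GAN-based framework for generating OOD samples from each of these two categories, given an ID dataset. Through extensive experiments on MNIST, CIFAR-10/100 and ImageNet, we show that a) state-of-the-art OOD detection methods that perform very well on conventional benchmarks are significantly less robust on our proposed benchmark, and b) [...] benchmarks, and vice versa, thereby indicating that a separate OOD set may not even be needed to reliably evaluate performance in OOD detection.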
The problem of detecting the Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks. It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable and often tend to make over-confident predictions for OoDs, assigning to them a higher density than to the in-distribution data. This over-confidence in a single model can be potentially mitigated with Bayesian inference over the model parameters that take into account epistemic uncertainty. This paper investigates three approaches to Bayesian inference: stochastic gradient Markov chain Monte Carlo, Bayes by Backpropagation, and Stochastic Weight Averaging-Gaussian. The inference is implemented over the weights of the deep neural networks that parameterize the likelihood of the Variational Autoencoder. We empirically evaluate the approaches against several benchmarks that are often used for OoD detection: estimation of the marginal likelihood utilizing sampled model ensemble, typicality test, disagreement score, and Watanabe-Akaike Information Criterion. Finally, we introduce two simple scores that demonstrate the state-of-the-art performance.
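Uncertainty estimation in deep learning has recently emerged as a crucial area for advancing reliability and robustness in safety-critical applications. While many proposed methods focus either on distance-aware model uncertainty for out-of-distribution detection or on input-dependent label uncertainty for in-distribution calibration, both types of uncertainty are often necessary. In this work, we propose the HetSNGP method for jointly modeling model and data uncertainty. We show that our proposed model affords a favorable combination of these two types of uncertainty and thus outperforms the baseline methods on some challenging out-of-distribution datasets, including CIFAR-100C, ImageNet-C, and ImageNet-A. Moreover, we propose HetSNGP Ensemble, an ensembled version of our method that additionally models uncertainty over the network parameters and outperforms other ensemble baselines.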
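A new family of methods for estimating epistemic uncertainty in deep neural networks with a single forward pass has recently emerged as a valid alternative to Bayesian neural networks. On the premise of informative representations, these deterministic uncertainty methods (DUMs) achieve strong performance in detecting out-of-distribution (OOD) data while adding negligible computational cost at inference time. However, it remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications, both of which are prerequisites for their practical deployment. To this end, we first provide a taxonomy of DUMs and evaluate their calibration under continuous distributional shifts. We then extend them to semantic segmentation. We find that, while DUMs scale to realistic vision tasks and perform well on OOD detection, the practicality of current methods is undermined by poor calibration under distributional shifts.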
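Machine learning models deployed on medical imaging tasks must be equipped with out-of-distribution detection capabilities in order to avoid erroneous predictions. It is unclear whether out-of-distribution detection models that rely on deep neural networks are suitable for detecting domain shifts in medical imaging. Gaussian processes can, by their mathematical construction, reliably separate in-distribution data points from out-of-distribution data points. Hence, we propose a parameter-efficient Bayesian layer for hierarchical convolutional Gaussian processes that incorporates Gaussian processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue segmentation show that the resulting architecture approaches the performance of a well-established deterministic segmentation algorithm (U-Net), which previous hierarchical Gaussian processes have not achieved. Moreover, by applying the same segmentation model to out-of-distribution data (i.e., images with pathology such as brain tumors), we show that our uncertainty estimates yield out-of-distribution detection that outperforms previous Bayesian networks and reconstruction-based approaches that learn normative distributions. To facilitate future work, our code is publicly available.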
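Machine learning models often encounter samples that diverge from the training distribution. Failing to recognize an out-of-distribution (OOD) sample, and consequently assigning that sample to an in-class label, significantly compromises the reliability of a model. The problem has gained significant attention due to its importance for safely deploying models in open-world settings. Detecting OOD samples is challenging because modeling all possible unknown distributions is intractable. To date, several research domains have tackled the problem of detecting unfamiliar samples, including anomaly detection, novelty detection, one-class learning, open-set recognition, and out-of-distribution detection. Despite having similar and shared concepts, out-of-distribution, open-set, and anomaly detection have been investigated independently. Accordingly, these research avenues have not cross-pollinated, creating research barriers. While some surveys intend to provide an overview of these approaches, they seem to focus only on a specific domain without examining the relationships between different domains. This survey aims to provide a cross-domain and comprehensive review of numerous eminent works in the respective areas while identifying their commonalities. Researchers can benefit from the overview of research advances in different fields and develop future methodology synergistically. Furthermore, to the best of our knowledge, while there are surveys on anomaly detection or one-class learning, there is no comprehensive or up-to-date survey on out-of-distribution detection, which our survey covers extensively. Finally, with a unified cross-domain perspective, we discuss and shed light on future lines of research, intending to bring these fields closer together.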
Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and as yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state of the art for estimating predictive uncertainty; however, these require significant modifications to the training procedure and are computationally expensive compared to standard (non-Bayesian) NNs. We propose an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high-quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are as good as or better than those of approximate Bayesian NNs. To assess robustness to dataset shift, we evaluate the predictive uncertainty on test examples from known and unknown distributions, and show that our method is able to express higher uncertainty on out-of-distribution examples. We demonstrate the scalability of our method by evaluating predictive uncertainty estimates on ImageNet.
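Variational autoencoders (VAEs) are influential generative models whose rich representational capabilities come from deep neural network architectures and Bayesian methods. However, VAE models have the weakness of assigning higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs. To address this problem, reliable uncertainty estimation is considered critical for an in-depth understanding of OOD inputs. In this study, we propose an improved noise contrastive prior (INCP) that can be integrated into the encoder of a VAE; we call the resulting model INCPVAE. INCP is scalable, trainable and compatible with VAEs, and it also inherits the merits of noise contrastive priors for uncertainty estimation. Experiments on various datasets demonstrate that, compared to standard VAEs, our model is superior in uncertainty estimation for OOD data and is robust in anomaly detection tasks. The INCPVAE model obtains reliable uncertainty estimates for OOD inputs and solves the OOD problem in VAE models.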
Modern machine learning methods, including deep learning, have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a variety of factors including sample bias and non-stationarity. In such settings, well-calibrated uncertainty estimates convey information about when a model's output should (or should not) be trusted. Many probabilistic deep learning methods, including Bayesian and non-Bayesian methods, have been proposed in the literature for quantifying predictive uncertainty, but to our knowledge there has not previously been a rigorous large-scale empirical comparison of these methods under dataset shift. We present a large-scale benchmark of existing state-of-the-art methods on classification problems and investigate the effect of dataset shift on accuracy and calibration. We find that traditional post-hoc calibration does indeed fall short, as do several other previous methods. However, some methods that marginalize over models give surprisingly strong results across a broad spectrum of tasks.
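As a rough illustration of what marginalizing over models can look like in classification, the sketch below averages the softmax outputs of several models and reports the entropy of the averaged prediction. The helper names and the toy logits are hypothetical; this is one common way to combine models, not necessarily the exact procedure used in the benchmark.

```python
import math

def softmax(logits):
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0.0)

def ensemble_predict(member_logits):
    """Average the members' predictive distributions and return (mean, entropy)."""
    probs = [softmax(logits) for logits in member_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return mean, entropy(mean)

# Three models that agree, and three that each prefer a different class.
agree = [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
disagree = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
mean_agree, h_agree = ensemble_predict(agree)
mean_dis, h_dis = ensemble_predict(disagree)
```

When the individual models are each confident but disagree, the averaged distribution flattens and its entropy rises, so disagreement under dataset shift shows up as lower confidence even though no single model reports it.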
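Deep neural networks deliver impressive performance, but they cannot reliably estimate their predictive confidence, which limits their applicability in high-risk domains. We show that applying a multi-label one-vs-all loss reveals classification ambiguity and reduces model overconfidence. The introduced SLOVA (Single Label One-Vs-All) model redefines the typical one-vs-all predictive probabilities for the single-label situation, where only one class is the correct answer. The proposed classifier is confident only if a single class has a high probability and the other probabilities are negligible. Unlike the typical softmax function, SLOVA naturally detects out-of-distribution samples if the probabilities of all other classes are small. The model is additionally fine-tuned with exponential calibration, which allows us to precisely align the confidence score with model accuracy. We verify our approach on three tasks. First, we demonstrate that SLOVA is competitive with the state of the art in in-distribution calibration. Second, the performance of SLOVA is robust under dataset shift. Finally, our approach performs extremely well in the detection of out-of-distribution samples. Consequently, SLOVA is a tool that can be used in various applications where uncertainty modeling is required.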
Acquiring labeled data is challenging in many machine learning applications with limited budgets. Active learning gives a procedure to select the most informative data points and improve data efficiency by reducing the cost of labeling. The info-max learning principle maximizing mutual information such as BALD has been successful and widely adapted in various active learning applications. However, this pool-based specific objective inherently introduces a redundant selection and further requires a high computational cost for batch selection. In this paper, we design and propose a new uncertainty measure, Balanced Entropy Acquisition (BalEntAcq), which captures the information balance between the uncertainty of underlying softmax probability and the label variable. To do this, we approximate each marginal distribution by Beta distribution. Beta approximation enables us to formulate BalEntAcq as a ratio between an augmented entropy and the marginalized joint entropy. The closed-form expression of BalEntAcq facilitates parallelization by estimating two parameters in each marginal Beta distribution. BalEntAcq is a purely standalone measure without requiring any relational computations with other data points. Nevertheless, BalEntAcq captures a well-diversified selection near the decision boundary with a margin, unlike other existing uncertainty measures such as BALD, Entropy, or Mean Standard Deviation (MeanSD). Finally, we demonstrate that our balanced entropy learning principle with BalEntAcq consistently outperforms well-known linearly scalable active learning methods, including a recently proposed PowerBALD, a simple but diversified version of BALD, by showing experimental results obtained from MNIST, CIFAR-100, SVHN, and TinyImageNet datasets.