Explaining the decisions made by deep neural networks is a rapidly advancing research topic. In recent years, several approaches have attempted to provide visual explanations of decisions made by neural networks designed for structured 2D image input data. In this paper, we propose a novel approach to generate coarse visual explanations of networks designed to classify unstructured 3D data, namely point clouds. Our method uses gradients flowing back to the final feature map layers and maps these values as contributions of the corresponding points in the input point cloud. Due to dimensionality disagreement and the lack of spatial consistency between input points, our approach combines gradients with point dropping to compute explanations for different parts of the point cloud. The generality of our approach is tested on various point cloud classification networks, including the 'single object' networks PointNet, PointNet++, and DGCNN, and the 'scene' network VoteNet. Our method generates symmetric explanation maps that highlight important regions and provide insight into the decision-making process of the network architectures. We perform an exhaustive evaluation of the trust and interpretability of our explanation method against comparative approaches using quantitative, quantifiable metrics and human studies. All our code is implemented in PyTorch and will be made publicly available.
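The abstract's core recipe, per-point gradients combined with point dropping, can be pictured with a short PyTorch sketch. This is a minimal illustration of the idea, not the authors' released code; `model` is a placeholder for any point cloud classifier taking a (1, N, 3) tensor.

```python
import torch

def point_saliency(model, points, target_class):
    """points: (1, N, 3) tensor; returns one saliency value per point."""
    points = points.clone().detach().requires_grad_(True)
    score = model(points)[0, target_class]        # class logit
    score.backward()
    # L2-norm of the gradient over xyz gives a scalar contribution per point
    return points.grad.norm(dim=-1).squeeze(0)    # (N,)

def drop_high_saliency_points(points, saliency, drop_ratio=0.1):
    """Remove the most salient points to test how much the score relies on them."""
    n_drop = int(points.shape[1] * drop_ratio)
    keep = saliency.argsort()[:points.shape[1] - n_drop]  # low-saliency indices
    return points[:, keep, :]
```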
Deep Neural Networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multi-layer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels wrt the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012 and MIT Places data sets. Our main result is that the recently proposed Layer-wise Relevance Propagation (LRP) algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of neural network performance.
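The region-perturbation protocol described here is easy to picture in code. Below is a minimal sketch, assuming a classifier `model`, an input `image`, and a precomputed 2D `heatmap`; it perturbs patches in order of decreasing relevance (most relevant first) and records the score decay, where a faster drop indicates a more faithful heatmap.

```python
import torch
import torch.nn.functional as F

def morf_curve(model, image, heatmap, target_class, patch=8, steps=50):
    img = image.clone()                                   # (1, C, H, W)
    rel = F.avg_pool2d(heatmap[None, None], patch)        # mean relevance per patch
    order = rel.flatten().argsort(descending=True)[:steps]
    per_row = img.shape[-1] // patch                      # patches per image row
    scores = []
    for idx in order.tolist():
        r, c = (idx // per_row) * patch, (idx % per_row) * patch
        img[..., r:r + patch, c:c + patch] = torch.rand_like(
            img[..., r:r + patch, c:c + patch])           # replace region with noise
        with torch.no_grad():
            scores.append(model(img)[0, target_class].item())
    return scores   # area over this curve gives an AOPC-style faithfulness measure
```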
The prevalence of attention mechanisms has raised concerns about the interpretability of attention distributions. Although attention provides insights into how a model works, using it as an explanation of model predictions remains highly questionable. The community is still seeking more interpretable strategies to better identify the local active regions that contribute most to the final decision. To improve the interpretability of existing attention models, we propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures task-relevant, human-interpretable information. The target model is first distilled to have higher-resolution intermediate feature maps. Representative features are then grouped based on local pairwise feature similarity to produce finer-grained, more precise attention maps that highlight the task-relevant parts of the input. The obtained attention maps are ranked according to the activity level of the compound features, which provides information about the importance level of the highlighted regions. The proposed model can be easily adapted to a wide variety of modern deep models in which classification is involved. Extensive quantitative and qualitative experiments show more comprehensive and accurate visual explanations compared with state-of-the-art attention models and visualization methods across multiple tasks, including fine-grained image classification, few-shot classification, and person re-identification, without compromising classification accuracy. The proposed visualization model sheds light on how neural networks 'pay their attention' differently in different tasks.
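The grouping step can be illustrated with a toy sketch. This is only a simplified reading of the pairwise-similarity idea, not the official BR-NPA implementation; `feats` stands for a high-resolution intermediate feature map.

```python
import torch
import torch.nn.functional as F

def similarity_attention_maps(feats, n_maps=3):
    """feats: (C, H, W) feature map -> n_maps ranked attention maps of size (H, W)."""
    C, H, W = feats.shape
    flat = feats.reshape(C, -1).t()                      # (H*W, C): one feature per location
    activity = flat.norm(dim=1)                          # activity level of each location
    seeds = activity.argsort(descending=True)[:n_maps]   # most active locations as representatives
    sim = F.cosine_similarity(flat[None, :, :], flat[seeds][:, None, :], dim=-1)
    return sim.reshape(n_maps, H, W)                     # one attention map per representative
```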
We propose a technique for producing 'visual explanations' for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say 'dog' in a classification network or a sequence of words in a captioning network) flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), (3) CNNs used in tasks with multimodal inputs (e.g. visual question answering) or reinforcement learning, all without architectural changes or re-training. We combine Grad-CAM with existing fine-grained visualizations to create a high-resolution class-discriminative visualization.
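Grad-CAM's computation is compact enough to sketch directly. The following PyTorch snippet, with `model` and its final conv `layer` as placeholders, global-average-pools the backpropagated gradients to weight the activation maps.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, layer, image, target_class):
    acts, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    model(image)[0, target_class].backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
    cam = F.relu((weights * acts[0]).sum(dim=1))        # weighted sum of activation maps
    return cam / cam.max().clamp(min=1e-8)              # (1, h, w), normalized to [0, 1]
```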
Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.
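The Spectral Relevance Analysis pipeline can be outlined in a few lines. The sketch below is an assumption-laden outline, not the reference implementation; `explain` stands in for any attribution method (e.g. LRP) that returns a 2D heatmap per image.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from skimage.transform import resize

def spray(images, explain, n_clusters=4, size=(16, 16)):
    # downsized, flattened heatmaps act as compact strategy descriptors
    maps = [resize(explain(img), size).ravel() for img in images]
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity="nearest_neighbors").fit_predict(np.stack(maps))
    return labels  # cluster id per image; inspect clusters for 'Clever Hans' strategies
```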
Explainable AI transforms opaque decision strategies of ML models into explanations that are interpretable by the user, for example, identifying the contribution of each input feature to the prediction at hand. Such explanations, however, entangle the potentially multiple factors that enter into the overall complex decision strategy. We propose to disentangle explanations by finding relevant subspaces in activation space that can be mapped to more abstract human-understandable concepts and enable a joint attribution on concepts and input features. To automatically extract the desired representation, we propose new subspace analysis formulations that extend the principle of PCA and subspace analysis to explanations. These novel analyses, which we call principal relevant component analysis (PRCA) and disentangled relevant subspace analysis (DRSA), optimize relevance of projected activations rather than the more traditional variance or kurtosis. This enables a much stronger focus on subspaces that are truly relevant for the prediction and the explanation, in particular, ignoring activations or concepts to which the prediction model is invariant. Our approach is general enough to work alongside common attribution techniques such as Shapley Value, Integrated Gradients, or LRP. Our proposed methods prove to be practically useful and compare favorably to the state of the art as demonstrated on benchmarks and three use cases.
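A hedged sketch of a PRCA-style computation, based on a simplified reading of the formulation rather than the paper's exact definitions: with activation vectors `A` and matching 'context' vectors `C` carrying the backpropagated relevance, the most relevant direction solves an eigenproblem on the symmetrized cross-covariance, analogous to PCA with relevance in place of variance.

```python
import torch

def prca_direction(A, C):
    """A, C: (n_samples, d) tensors; returns the most relevant unit direction (d,)."""
    S = A.t() @ C                        # cross-covariance of activations and context
    S = 0.5 * (S + S.t())                # symmetrize so the eigenproblem is well-posed
    _, eigvecs = torch.linalg.eigh(S)    # eigenvalues in ascending order
    return eigvecs[:, -1]                # direction of maximal projected relevance
```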
Convolutional neural networks (CNNs) are known for their excellent feature extraction ability, learning models from data, yet they are used as black boxes. Interpreting the convolutional filters and the associated features can help build an understanding of how a CNN discriminates between classes. In this work, we are concerned with the interpretability of a CNN model, called CNNexplain, used for COVID-19 versus non-COVID-19 classification, with a focus on the explainability of the features captured by the convolutional filters and how these features contribute to the classification. Specifically, we use various explainable artificial intelligence (XAI) methods, such as visualization, SmoothGrad, Grad-CAM, and LIME, to provide explanations of the convolutional filters and the associated features and their role in the classification. We analyze the explanations produced by these methods using dry cough spectrograms. The explanation results obtained from LIME, SmoothGrad, and Grad-CAM highlight important features of the different spectrograms and their relevance to the classification.
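Of the methods listed, SmoothGrad is the simplest to sketch: average the input gradients over several noisy copies of the input, which typically yields visually cleaner saliency on spectrograms or images. A minimal version with placeholder `model` and input `x` follows.

```python
import torch

def smoothgrad(model, x, target_class, n=25, sigma=0.1):
    grads = torch.zeros_like(x)
    for _ in range(n):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        model(noisy)[0, target_class].backward()
        grads += noisy.grad                 # accumulate gradients over noisy samples
    return (grads / n).abs().sum(dim=1)     # (1, H, W) averaged saliency map
```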
Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks and the reasons why a network makes specific decisions. In this paper, we develop a novel post-hoc visual explanation method called Score-CAM based on class activation mapping. Unlike previous class activation mapping based approaches, Score-CAM gets rid of the dependence on gradients by obtaining the weight of each activation map through its forward passing score on the target class; the final result is obtained by a linear combination of weights and activation maps. We demonstrate that Score-CAM achieves better visual performance and fairness for interpreting the decision-making process. Our approach outperforms previous methods on both recognition and localization tasks and also passes the sanity check. We further indicate its application as a debugging tool. The implementation is available.
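The gradient-free weighting scheme can be sketched as follows; `get_activations` is an assumed helper returning the final conv feature maps, not part of any fixed API.

```python
import torch
import torch.nn.functional as F

def score_cam(model, image, target_class, get_activations):
    acts = get_activations(image)                        # (1, K, h, w)
    masks = F.interpolate(acts, size=image.shape[-2:], mode="bilinear")
    mn = masks.amin(dim=(2, 3), keepdim=True)
    mx = masks.amax(dim=(2, 3), keepdim=True)
    masks = (masks - mn) / (mx - mn + 1e-8)              # normalize each map to [0, 1]
    with torch.no_grad():
        scores = torch.stack([model(image * masks[:, k:k + 1])[0, target_class]
                              for k in range(masks.shape[1])])  # forward pass per mask
    weights = scores.softmax(dim=0)                      # weight per activation map
    cam = F.relu((weights[None, :, None, None] * acts).sum(dim=1))
    return cam                                           # (1, h, w) coarse explanation
```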
This survey reviews explainability methods for vision-based self-driving systems trained with behavior cloning. The concept of explainability has several facets, and the need for explainability is strong in driving, a safety-critical application. Gathering contributions from several research fields, namely computer vision, deep learning, autonomous driving, and explainable AI (X-AI), this survey addresses several points. First, it discusses definitions, context, and motivation for gaining more interpretability and explainability from self-driving systems, as well as the challenges specific to this application. Second, methods that provide explanations for a black-box self-driving system in a post-hoc fashion are comprehensively organized and detailed. Third, approaches that aim to build more interpretable self-driving systems by design are presented and discussed in detail. Finally, remaining open challenges and potential future research directions are identified and examined.
A recent trend in machine learning has been to enrich learned models with the ability to explain their own predictions. The emerging field of explainable AI (XAI) has so far mainly focused on supervised learning, in particular deep neural network classifiers. In many practical problems, however, label information is not given, and the goal is instead to discover the underlying structure of the data, for example its clusters. While powerful methods exist for extracting the cluster structure in data, they typically do not answer the question of why a certain data point has been assigned to a given cluster. We propose a new framework that, for the first time, can explain cluster assignments in terms of input features in an efficient and reliable manner. It is based on the novel insight that clustering models can be rewritten as neural networks, or 'neuralized'. Cluster predictions of the obtained networks can then be quickly and accurately attributed to the input features. Several showcases demonstrate the ability of our method to assess the quality of learned clusters and to extract novel insights from the analyzed data and representations.
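The 'neuralization' insight has a neat closed form for k-means: the evidence for cluster c can be written as f_c(x) = min over k != c of 0.5 * (||x - mu_k||^2 - ||x - mu_c||^2), i.e. a linear layer followed by a min-pooling, after which standard neural-network attribution applies. A small sketch, with the algebra done in the comments:

```python
import torch

def neuralized_kmeans_logit(x, centroids, c):
    """x: (d,), centroids: (K, d); positive output means x belongs to cluster c."""
    mu_c = centroids[c]
    others = torch.cat([centroids[:c], centroids[c + 1:]])     # (K-1, d) competitors
    # expanding the squared distances gives a linear form w_k . x + b_k
    w = mu_c - others                                          # (K-1, d) linear weights
    b = 0.5 * ((others ** 2).sum(dim=1) - (mu_c ** 2).sum())   # (K-1,) biases
    return (w @ x + b).min()                                   # min-pool over competitors
```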
The emerging field of explainable artificial intelligence (XAI) aims to bring transparency to today's powerful but opaque deep learning models. While local XAI methods explain individual predictions in the form of attribution maps, thereby identifying where important features occur (but not providing information about what they represent), global explanation techniques visualize what concepts a model has generally learned to encode. Both types of methods thus only provide partial insights and leave the burden of interpreting the model's reasoning to the user. Only a few contemporary techniques aim at combining the principles behind local and global XAI to obtain more informative explanations. Those methods, however, are often limited to specific model architectures or impose additional requirements on the training regime or on data and label availability, which effectively renders post-hoc application to arbitrary pre-trained models impractical. In this work, we introduce the Concept Relevance Propagation (CRP) approach, which combines the local and global perspectives of XAI and thus allows answering both the 'where' and 'what' questions, without additional constraints. We further introduce the principle of Relevance Maximization for finding representative examples of encoded concepts based on their usefulness to the model. We thereby lift the dependency on the common practice of Activation Maximization and its limitations. We demonstrate the capabilities of our method in various settings, showcasing that Concept Relevance Propagation and Relevance Maximization lead to more human-interpretable explanations and provide deep insights into the model's representations and reasoning through concept atlases, concept composition analyses, and quantitative investigations of concept subspaces and their role in fine-grained decision making.
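A heavily simplified sketch of the conditional idea behind CRP: restrict the backward pass at a chosen layer to a single 'concept' channel, so the resulting input attribution shows where that concept mattered. Real CRP conditions a full LRP pass; the toy version below conditions plain gradient-times-input instead, with `model` and `layer` as placeholders.

```python
import torch

def channel_conditional_attribution(model, layer, image, target_class, channel):
    image = image.clone().requires_grad_(True)

    def mask_grad(grad):                      # tensor hook on the layer output
        g = torch.zeros_like(grad)
        g[:, channel] = grad[:, channel]      # keep only the chosen concept channel
        return g

    def attach(module, inputs, output):
        output.register_hook(mask_grad)       # no return: forward output unchanged

    handle = layer.register_forward_hook(attach)
    model(image)[0, target_class].backward()
    handle.remove()
    return (image * image.grad).sum(dim=1)    # (1, H, W) concept-conditional heatmap
```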
State-of-the-art object detectors are treated as black boxes due to their highly non-linear internal computations. Even with unprecedented advancements in detector performance, the inability to explain how their outputs are generated limits their use in safety-critical applications. Previous work fails to produce explanations for both bounding box and classification decisions, and generally makes individual explanations for various detectors. In this paper, we propose an open-source Detector Explanation Toolkit (DExT), which implements the proposed approach to generate a holistic explanation for all detector decisions using certain gradient-based explanation methods. We suggest various multi-object visualization methods to merge the explanations of multiple objects detected in an image, as well as the corresponding detections, in a single image. The quantitative evaluation shows that the Single Shot MultiBox Detector (SSD) is explained more faithfully than other detectors, regardless of the explanation method. Both quantitative and human-centric evaluations identify that SmoothGrad with Guided Backpropagation (GBP) provides more trustworthy explanations among the selected methods across all detectors. We expect that DExT will motivate practitioners to evaluate object detectors from the interpretability perspective by explaining both bounding box and classification decisions.
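The per-decision explanation idea can be sketched as below; the `detector` interface returning (boxes, class_logits) is an assumption for illustration, not DExT's actual API.

```python
import torch

def explain_detection(detector, image, det_idx, output="class", coord=0):
    """Backpropagate from one detector decision: a class score or one box coordinate."""
    image = image.clone().requires_grad_(True)
    boxes, logits = detector(image)              # assumed: (n_det, 4), (n_det, n_classes)
    target = logits[det_idx].max() if output == "class" else boxes[det_idx, coord]
    target.backward()                            # one explanation per individual decision
    return image.grad.abs().amax(dim=1)          # (1, H, W) saliency for this decision
```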
Deep neural networks are being used increasingly to automate data analysis and decision making, yet their decision-making process is largely unclear and is difficult to explain to the end users. In this paper, we address the problem of Explainable AI for deep neural networks that take images as input and output a class probability. We propose an approach called RISE that generates an importance map indicating how salient each pixel is for the model's prediction. In contrast to white-box approaches that estimate pixel importance using gradients or other internal network state, RISE works on black-box models. It estimates importance empirically by probing the model with randomly masked versions of the input image and obtaining the corresponding outputs. We compare our approach to state-of-the-art importance extraction methods using both an automatic deletion/insertion metric and a pointing metric based on human-annotated object segments. Extensive experiments on several benchmark datasets show that our approach matches or exceeds the performance of other methods, including white-box approaches.
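RISE's probing loop fits in a few lines. A minimal sketch follows (the published method also shifts the upsampled masks randomly, which is omitted here for brevity); `model` is any black-box image classifier.

```python
import torch
import torch.nn.functional as F

def rise(model, image, target_class, n_masks=500, grid=7, p=0.5):
    _, _, H, W = image.shape
    cells = (torch.rand(n_masks, 1, grid, grid) < p).float()     # random binary grids
    masks = F.interpolate(cells, size=(H, W), mode="bilinear")   # smooth (n_masks, 1, H, W)
    with torch.no_grad():
        scores = torch.cat([model(image * m[None])[:, target_class]
                            for m in masks])                     # score per masked input
    saliency = (scores[:, None, None, None] * masks).mean(dim=0)
    return saliency / (p + 1e-8)                                 # (1, H, W) importance map
```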
Deep neural networks are well known for their superb handling of various machine learning and artificial intelligence tasks. However, due to their over-parameterized, black-box nature, it is often difficult to understand the predictions of deep models. In recent years, many interpretation tools have been proposed to explain or reveal how deep models make decisions. In this paper, we review this line of research and attempt a comprehensive survey. Specifically, we first introduce and clarify two basic concepts that people usually confuse: interpretations and interpretability. To address the research efforts in interpretations, we elaborate the designs of a number of interpretation algorithms by proposing a new taxonomy. Then, to understand interpretation results, we also survey the performance metrics for evaluating interpretation algorithms. Furthermore, we summarize current work on evaluating the interpretability of models using 'trustworthy' interpretation algorithms. Finally, we review and discuss the connections between the interpretations of deep models and other factors, such as adversarial robustness and learning from interpretations, and we introduce several open-source libraries for interpretation algorithms and evaluation approaches.
In recent years, explainable artificial intelligence (XAI) has become a well-suited framework for generating human-understandable explanations of 'black box' models. In this paper, a novel XAI visual explanation algorithm called the Similarity Difference and Uniqueness (SIDU) method, which can effectively localize the entire object regions responsible for a prediction, is presented. The robustness and effectiveness of the SIDU algorithm are analyzed through various computational and human-subject experiments. In particular, the SIDU algorithm is assessed using three different types of evaluations (application-grounded, human-grounded, and functionally-grounded) to demonstrate its superior performance. The robustness of SIDU is further studied in the presence of adversarial attacks on the 'black box' model to better understand its performance. Our code is available at: https://github.com/satyamahesh84/sidu_xai_code.
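A condensed reading of the SIDU scoring scheme, for illustration only (the repository above is authoritative): masks derived from the last conv feature maps are weighted by a 'similarity difference' term, how close the masked prediction stays to the original, and a 'uniqueness' term, how different it is from the other masked predictions.

```python
import torch

def sidu_explanation(model, image, masks, tau=0.25):
    """masks: (K, 1, H, W) in [0, 1], e.g. upsampled, thresholded feature maps."""
    with torch.no_grad():
        p0 = model(image).softmax(dim=1)                         # original prediction
        P = torch.cat([model(image * m[None]).softmax(dim=1) for m in masks])
    sim_diff = torch.exp(-(P - p0).norm(dim=1) / tau)            # closeness to original
    uniq = torch.cdist(P, P).sum(dim=1)                          # dissimilarity to other masks
    w = sim_diff * uniq                                          # (K,) combined weights
    return (w[:, None, None, None] * masks).sum(dim=0)           # (1, H, W) explanation map
```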
Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective. Empirically, it shows strong performance on par or even better than state of the art. Theoretically, we provide analysis towards understanding of what the network has learnt and why the network is robust with respect to input perturbation and corruption.
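The permutation-invariance argument is visible in a bare-bones sketch: a shared per-point MLP followed by a symmetric max-pool (the T-Nets and other details of the full architecture are omitted here).

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 256), nn.ReLU())   # shared across points
        self.head = nn.Linear(256, n_classes)

    def forward(self, pts):                    # pts: (B, N, 3), in any point order
        feats = self.mlp(pts)                  # (B, N, 256) per-point features
        global_feat = feats.max(dim=1).values  # symmetric function -> permutation invariant
        return self.head(global_feat)          # (B, n_classes) logits
```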
As deep learning becomes widely used in the field of radiology, the explainability of such models is increasingly essential to gain clinicians' trust when the models are used for diagnosis. In this study, three experiment sets were conducted with a U-Net architecture to improve classification performance while enhancing the heatmaps corresponding to the model's focus by incorporating a heatmap generator during training. All experiments used a dataset containing chest radiographs, associated labels from one of three conditions ('normal', 'congestive heart failure (CHF)', and 'pneumonia'), and numerical information regarding radiologists' eye-gaze coordinates on the images. The paper that introduced this dataset (A. Karargyris and Moradi, 2021) developed a U-Net model, which is treated as the baseline model for this study, to show how eye-gaze data can be used in multi-modal training for explainability improvement. To compare classification performance, the 95% confidence intervals (CI) of the area under the receiver operating characteristic curve (AUC) were measured. The best method achieved an AUC of 0.913 (CI: 0.860-0.966). The greatest improvements were for the 'pneumonia' and 'CHF' classes, which the baseline model struggled most to classify, resulting in AUCs of 0.859 (CI: 0.732-0.957) and 0.962 (CI: 0.933-0.989), respectively. The decoder of the proposed method was also able to produce probability masks that highlight the determining image parts in the model's classification, similar to the radiologists' eye-gaze data. This work therefore shows that incorporating heatmap generators and eye-gaze information into training can simultaneously improve disease classification and provide explainable visuals that align well with how radiologists look at chest radiographs when making a diagnosis.
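A rough sketch of the kind of training objective described, stated as an assumption rather than the paper's exact code: the usual classification loss is combined with a term pulling the generated heatmap toward the radiologist's eye-gaze map. `logits`, `heatmap`, and `gaze_map` are placeholders.

```python
import torch
import torch.nn.functional as F

def multimodal_loss(logits, labels, heatmap, gaze_map, alpha=0.5):
    cls_loss = F.cross_entropy(logits, labels)
    # treat the normalized eye-gaze density as a soft target for the heatmap
    gaze_loss = F.mse_loss(heatmap, gaze_map)
    return cls_loss + alpha * gaze_loss          # alpha trades accuracy vs. explainability
```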
How would you repair a physical object with some of its parts missing? You might imagine its original shape from previously captured images, first recovering its overall (global) but coarse shape, and then refining its local details. We are motivated to imitate this physical repair procedure to address point cloud completion. To this end, we propose a cross-modal shape-transfer dual-refinement network (termed CSDN), a coarse-to-fine paradigm with full-cycle participation of images, for quality point cloud completion. CSDN mainly consists of 'shape fusion' and 'dual-refinement' modules to tackle the cross-modal challenge. The first module transfers intrinsic shape characteristics from single images to guide the geometry generation of the missing regions of point clouds, in which we propose IPAdaIN to embed the global features of both the image and the partial point cloud into the completion. The second module refines the coarse output by adjusting the positions of the generated points, where the local refinement unit exploits the geometric relation between the novel points and the input points via graph convolution, and the global constraint unit uses the input image to fine-tune the generated offsets. Different from most existing approaches, CSDN not only explores the complementary information from images but also effectively exploits cross-modal data throughout the entire coarse-to-fine completion procedure. Experimental results indicate that CSDN performs favorably against ten competitors on the cross-modal benchmark.
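As an illustrative assumption about the kind of operation IPAdaIN builds on (not the paper's exact module), an AdaIN-style layer would renormalize point features with a scale and shift derived from the image's global feature:

```python
import torch

def adain_style_transfer(point_feats, img_feat):
    """point_feats: (B, N, C); img_feat: (B, 2*C) predicting scale and shift."""
    scale, shift = img_feat.chunk(2, dim=1)              # (B, C) each
    mu = point_feats.mean(dim=1, keepdim=True)
    sigma = point_feats.std(dim=1, keepdim=True) + 1e-8
    normed = (point_feats - mu) / sigma                  # instance-normalized point features
    return normed * scale[:, None, :] + shift[:, None, :]  # image statistics injected
```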
Post-hoc analysis is a popular category in eXplainable artificial intelligence (XAI) research. In particular, methods that generate heatmaps have been used to explain deep neural networks (DNNs), which are black-box models. Heatmaps can be appealing due to the intuitive and visual ways of understanding them, but assessing their quality might not be straightforward. Different ways of assessing heatmaps' quality have their own merits and shortcomings. This paper introduces a synthetic dataset that can be generated ad hoc, along with ground-truth heatmaps, for more objective quantitative assessment. Each data sample is an image of a cell with easily recognized features that are distinguished from a localization ground-truth mask, hence facilitating a more transparent assessment of different XAI methods. Comparisons and recommendations are made, and shortcomings are clarified along with suggestions for future research directions to handle the finer details of selected post-hoc analysis methods. Furthermore, mabCAM is introduced as a heatmap generation method compatible with our ground-truth heatmaps. The framework is easily generalizable and uses only standard deep learning components.
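The evaluation such ground truth enables can be as simple as a thresholded IoU between heatmap and mask; a small sketch, with the shapes and threshold as illustrative assumptions:

```python
import numpy as np

def heatmap_iou(heatmap, gt_mask, thresh=0.5):
    """heatmap: (H, W) in [0, 1]; gt_mask: (H, W) boolean localization ground truth."""
    pred = heatmap >= thresh
    inter = np.logical_and(pred, gt_mask).sum()
    union = np.logical_or(pred, gt_mask).sum()
    return inter / max(union, 1)   # 1.0 means perfect agreement with the ground truth
```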
With the rise of deep neural networks, the challenge of explaining the predictions of these networks has become increasingly recognized. While many methods exist for explaining the decisions of deep neural networks, there is currently no consensus on how to evaluate them. On the other hand, robustness is a popular topic in deep learning research; however, until recently it has hardly been discussed in connection with explainability. In this tutorial, we first present gradient-based interpretability methods. These techniques use the gradient signal to assign the burden of the decision to the input features. Later, we discuss how gradient-based methods can be evaluated for their robustness and the role that adversarial robustness plays in obtaining meaningful explanations. We also discuss the limitations of gradient-based methods. Finally, we present best practices and attributes that should be examined before choosing an explainability method. We conclude with directions for future research in the area where robustness and explainability converge.
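The tutorial's starting point, vanilla gradient saliency, fits in a few lines: attribute the decision to the input features via the gradient of the class score with respect to the input (`model` and `x` are placeholders).

```python
import torch

def gradient_saliency(model, x, target_class):
    x = x.clone().requires_grad_(True)
    model(x)[0, target_class].backward()
    return x.grad.abs().amax(dim=1)   # (1, H, W): max absolute gradient over channels
```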