As one of the prevalent methods to achieve automation systems, Imitation Learning (IL) presents a promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, the corresponding research on the explainability of IL models is still limited. Inspired by the recent approaches in explainable artificial intelligence methods, we proposed a model-agnostic explaining framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model from the randomized masked demonstrations and uses the conventional evaluation outcome environment returns as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The result shows that R2RISE successfully distinguishes important frames from the demonstrations.
translated by 谷歌翻译
With the rapid deployment of graph neural networks (GNNs) based techniques into a wide range of applications such as link prediction, node classification, and graph classification the explainability of GNNs has become an indispensable component for predictive and trustworthy decision-making. Thus, it is critical to explain why graph neural network (GNN) makes particular predictions for them to be believed in many applications. Some GNNs explainers have been proposed recently. However, they lack to generate accurate and real explanations. To mitigate these limitations, we propose GANExplainer, based on Generative Adversarial Network (GAN) architecture. GANExplainer is composed of a generator to create explanations and a discriminator to assist with the Generator development. We investigate the explanation accuracy of our models by comparing the performance of GANExplainer with other state-of-the-art methods. Our empirical results on synthetic datasets indicate that GANExplainer improves explanation accuracy by up to 35\% compared to its alternatives.
translated by 谷歌翻译
图形神经网络(GNN)已证明图形数据的预测性能显着提高。同时,这些模型的预测通常很难解释。在这方面,已经做出了许多努力来从gnnexplainer,XGNN和PGEXPlainer等角度解释这些模型的预测机制。尽管这样的作品呈现出系统的框架来解释GNN,但对于可解释的GNN的整体评论是不可用的。在这项调查中,我们介绍了针对GNN开发的解释性技术的全面综述。我们专注于可解释的图形神经网络,并根据可解释方法的使用对它们进行分类。我们进一步为GNNS解释提供了共同的性能指标,并指出了几个未来的研究指标。
translated by 谷歌翻译
基于深度学习的模型占主导地位的生产推荐系统的当前景观。此外,近年来目睹了模型规模的指数增长 - 从谷歌的2016年模型,最新的Facebook的型号有10亿个参数,具有12万亿参数。型号容量的每次跳跃都有显着的质量增强,这使我们相信100万亿参数的时代即将来临。然而,即使在工业规模数据中心内,这些模型的培训也在挑战。这种困难是从训练计算的惊人的异质性继承 - 模型的嵌入层可以包括总模型尺寸的99.99%,这是极其内存密集的;虽然其余的神经网络越来越多地计算密集型。为支持培训此类巨大模式,迫切需要有效的分布式培训系统。在本文中,我们通过仔细共同设计优化算法和分布式系统架构来解决这一挑战。具体而言,为了确保培训效率和训练精度,我们设计一种新型混合训练算法,其中嵌入层和密集的神经网络由不同的同步机制处理;然后,我们构建一个名为Persia的系统(短暂的并行推荐培训系统,其中包含混合加速),以支持这种混合培训算法。理论上的示范和实证研究均达到100万亿参数,以证明了波斯的系统设计和实施。我们将Pensia公开使用(在https://github.com/persiamml/persia),以便任何人都能够以100万亿参数的规模轻松培训推荐模型。
translated by 谷歌翻译
Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention of the learning on growing snapshots of a KG without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, with an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.
translated by 谷歌翻译
多年来,旨在从已知事实中推断出新结论的知识图(KGS)的推理主要集中在静态KG上。现实生活中知识的不断增长提出了使能够扩大KGS的归纳推理能力的必要性。现有的归纳工作假设新实体都在批处理中一次出现,这过度简化了新实体不断出现的实际情况。这项研究探讨了一个更现实,更具挑战性的环境,新实体分为多批次。我们提出了一个基于步行的归纳推理模型来解决新环境。具体而言,具有自适应关系聚合的图形卷积网络旨在使用其邻近关系编码和更新实体。为了捕捉不同的邻居的重要性,我们在聚合过程中采用了一种查询反馈注意机制。此外,为了减轻新实体的稀疏链接问题,我们提出了一种链接增强策略,以将可信赖的事实添加到KGS中。我们构建了三个新数据集,用于模拟此多批次出现方案。实验结果表明,我们所提出的模型优于基于最先进的基于嵌入的,基于步行的基于步行和基于规则的模型。
translated by 谷歌翻译
实体对齐是知识图(kg)集成中的基本且至关重要的技术。多年来,对实体一致性的研究一直存在于KG是静态的假设,该假设忽略了现实世界KG的生长本质。随着KG的成长,先前的一致性结果面临需要重新审视的,而新实体对齐等待被发现。在本文中,我们建议并深入研究现实但未开发的设置,称为持续实体对齐。为了避免在新实体和三元组来时对整个KGS进行整个模型,我们为此任务提供了一种持续的对齐方法。它基于实体邻接,重建实体的表示,使其能够使用其现有邻居快速而有归纳的新实体生成嵌入。它选择并重播部分预先对准的实体对,仅训练一部分KG,同时提取可信赖的知识对准知识增强。由于不可避免地要包含与以前的作品不同的不可匹配的实体,因此所提出的方法采用双向最近的邻居匹配来找到新的实体对齐并更新旧的对齐。此外,我们还通过模拟多语言dbpedia的增长来构建新数据集。广泛的实验表明,我们的持续比对方法比基于再培训或归纳学习的基准更有效。
translated by 谷歌翻译
Continually learning to segment more and more types of image regions is a desired capability for many intelligent systems. However, such continual semantic segmentation suffers from the same catastrophic forgetting issue as in continual classification learning. While multiple knowledge distillation strategies originally for continual classification have been well adapted to continual semantic segmentation, they only consider transferring old knowledge based on the outputs from one or more layers of deep fully convolutional networks. Different from existing solutions, this study proposes to transfer a new type of information relevant to knowledge, i.e. the relationships between elements (Eg. pixels or small local regions) within each image which can capture both within-class and between-class knowledge. The relationship information can be effectively obtained from the self-attention maps in a Transformer-style segmentation model. Considering that pixels belonging to the same class in each image often share similar visual properties, a class-specific region pooling is applied to provide more efficient relationship information for knowledge transfer. Extensive evaluations on multiple public benchmarks support that the proposed self-attention transfer method can further effectively alleviate the catastrophic forgetting issue, and its flexible combination with one or more widely adopted strategies significantly outperforms state-of-the-art solutions.
translated by 谷歌翻译
这些年来跨域情绪分类一直是一个热点,旨在使用来自源域的标记数据来学习可靠的分类器,并在目标域上进行评估。在此静脉中,大多数方法利用域适应,将数据从不同域映射到共同的特征空间。为了进一步提高模型性能,提出了针对挖掘域特定信息的几种方法。但是,其中大多数仅利用有限的域特定信息。在这项研究中,我们首先通过基于主题信息制定一种提取特定域的单词的方法。然后,我们提出了一个主题驱动的自适应网络(TDAN),用于跨域情绪分类。该网络由两个子网络组成:语义注意网络和域特定的单词注意网络,其结构基于变压器。这些子网采用不同的输入形式,并且它们的输出被融合为特征向量。实验验证了我们TDAN对跨域情绪分类的有效性。
translated by 谷歌翻译
在负面的感知问题中,我们给出了$ n $数据点$({\ boldsymbol x} _i,y_i)$,其中$ {\ boldsymbol x} _i $是$ d $ -densional vector和$ y_i \ in \ { + 1,-1 \} $是二进制标签。数据不是线性可分离的,因此我们满足自己的内容,以找到最大的线性分类器,具有最大的\ emph {否定}余量。换句话说,我们想找到一个单位常规矢量$ {\ boldsymbol \ theta} $,最大化$ \ min_ {i \ le n} y_i \ langle {\ boldsymbol \ theta},{\ boldsymbol x} _i \ rangle $ 。这是一个非凸优化问题(它相当于在Polytope中找到最大标准矢量),我们在两个随机模型下研究其典型属性。我们考虑比例渐近,其中$ n,d \ to \ idty $以$ n / d \ to \ delta $,并在最大边缘$ \ kappa _ {\ text {s}}(\ delta)上证明了上限和下限)$或 - 等效 - 在其逆函数$ \ delta _ {\ text {s}}(\ kappa)$。换句话说,$ \ delta _ {\ text {s}}(\ kappa)$是overparametization阈值:以$ n / d \ le \ delta _ {\ text {s}}(\ kappa) - \ varepsilon $一个分类器实现了消失的训练错误,具有高概率,而以$ n / d \ ge \ delta _ {\ text {s}}(\ kappa)+ \ varepsilon $。我们在$ \ delta _ {\ text {s}}(\ kappa)$匹配,以$ \ kappa \ to - \ idty $匹配。然后,我们分析了线性编程算法来查找解决方案,并表征相应的阈值$ \ delta _ {\ text {lin}}(\ kappa)$。我们观察插值阈值$ \ delta _ {\ text {s}}(\ kappa)$和线性编程阈值$ \ delta _ {\ text {lin {lin}}(\ kappa)$之间的差距,提出了行为的问题其他算法。
translated by 谷歌翻译