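Current-day predictive models are hard to interpret because their deep nets exploit numerous complex relations between input elements. This work proposes a theoretical framework for model interpretability by measuring the contribution of relevant features to the functional entropy of the network with respect to the input. We rely on a log-Sobolev inequality that bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure the information contribution of a subset of features to the decision function. Through extensive experiments, we show that our method surpasses existing sampling-based interpretability methods on various data signals such as image, text, and audio.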
translated by Google Translate
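As a reading aid for the first abstract above, the quantities it names can be written out as follows. The abstract itself does not state the inequality, so this is a sketch under the assumption that the classical Gaussian log-Sobolev inequality is the one intended: $f$ is a nonnegative decision function, $\mu = \mathcal{N}(m, \Sigma)$ is a Gaussian input distribution with covariance $\Sigma$, and all symbols are introduced here for illustration.

```latex
% Functional entropy of a nonnegative function f under a probability measure \mu
\mathrm{Ent}_{\mu}(f) \;=\; \int f \log f \, d\mu \;-\; \Big(\int f \, d\mu\Big) \log\Big(\int f \, d\mu\Big)

% Gaussian log-Sobolev inequality for \mu = \mathcal{N}(m, \Sigma):
% the entropy is bounded by a covariance-weighted functional Fisher information
\mathrm{Ent}_{\mu}(f) \;\le\; \frac{1}{2} \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu
```

Under this assumed form, restricting the gradient $\nabla f$ to a subset of input coordinates gives one natural way to attribute part of the bound to that subset of features.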
Large speech emotion recognition datasets are hard to obtain, and small datasets may contain biases. Deep-net-based classifiers, in turn, are prone to exploit those biases and find shortcuts such as speaker characteristics. These shortcuts usually harm a model's ability to generalize. To address this challenge, we propose a gradient-based adversary learning framework that learns a speech emotion recognition task while normalizing speaker characteristics from the feature representation. We demonstrate the efficacy of our method on both speaker-independent and speaker-dependent settings and obtain new state-of-the-art results on the challenging IEMOCAP dataset.
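The success of deep neural networks relies heavily on their ability to encode complex relations between their input and their output. While this property serves to fit the training data well, it also obscures the mechanism that drives prediction. This study aims to reveal hidden concepts by employing an intervention mechanism, based on discrete variational autoencoders, that shifts the predicted class. An explaining model then visualizes the encoded information from any hidden layer and its corresponding intervened representation. By assessing the differences between the original representation and the intervened representation, one can determine the concepts that can alter the class, which provides interpretability. We demonstrate the effectiveness of our approach on CelebA, where we show various visualizations of bias in the data and suggest different interventions to reveal and change that bias.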
We present 3D Highlighter, a technique for localizing semantic regions on a mesh using text as input. A key feature of our system is the ability to interpret "out-of-domain" localizations. Our system demonstrates the ability to reason about where to place non-obviously related concepts on an input 3D shape, such as adding clothing to a bare 3D animal model. Our method contextualizes the text description using a neural field and colors the corresponding region of the shape using a probability-weighted blend. Our neural optimization is guided by a pre-trained CLIP encoder, which bypasses the need for any 3D datasets or 3D annotations. Thus, 3D Highlighter is highly flexible, general, and capable of producing localizations on a myriad of input shapes. Our code is publicly available at https://github.com/threedle/3DHighlighter.
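In recent years, multi-class prediction has gained popularity. Measuring goodness of fit has therefore become a cardinal question that researchers often have to deal with. Several metrics are commonly used for this task. However, when choosing the right measure, one must consider that different use cases impose different constraints that govern this decision. A leading constraint, at least in real-world multi-class problems, is imbalanced data: multi-class problems hardly ever provide symmetrical data. Hence, when we observe common KPIs (key performance indicators) such as precision, sensitivity, or accuracy, the obtained numbers can seldom be interpreted in terms of the model's actual needs. We suggest generalizing Matthews' correlation coefficient to multiple dimensions. This generalization is based on a geometrical interpretation of the generalized confusion matrix.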
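To illustrate why accuracy can mislead on imbalanced multi-class data, here is a minimal sketch of a widely used multi-class extension of the Matthews correlation coefficient (the R_K statistic computed from a confusion matrix). This is an assumption chosen for illustration; the abstract does not give the authors' geometric generalization, and the function name `multiclass_mcc` is hypothetical.

```python
import math


def multiclass_mcc(confusion):
    """Multi-class MCC (R_K form) from a square confusion matrix.

    confusion[i][j] = number of samples whose true class is i and
    whose predicted class is j. Returns 0.0 when the score is
    undefined (for example, when only one class is ever predicted).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(confusion[i]) for i in range(k)]               # row sums
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # column sums

    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    return numerator / denominator if denominator else 0.0


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total


# Imbalanced toy data: a model that always predicts the majority class.
always_majority = [
    [90, 0, 0],
    [5, 0, 0],
    [5, 0, 0],
]
print(accuracy(always_majority))        # high accuracy
print(multiclass_mcc(always_majority))  # no correlation with the true labels
```

In this toy case the accuracy is 0.9 while the coefficient is 0.0, which is the kind of gap between a headline KPI and actual model quality that the abstract describes.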
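Videos obtained by rolling-shutter (RS) cameras contain spatially distorted frames. These distortions become significant under fast camera or scene motion. Undoing the effects of RS is sometimes treated as a spatial problem, in which objects need to be rectified or displaced in order to generate their correct global-shutter (GS) frame. However, the cause of the RS effect is inherently temporal, not spatial. In this paper, we propose a space-time solution to the RS problem. We observe that, despite the severe differences between their xy frames, an RS video and its corresponding GS video tend to share the exact same xt slices, up to a known sub-frame temporal shift. Moreover, they share the same distribution of small 2D xt-patches, despite the strong temporal aliasing within each video. This makes it possible to constrain the GS output video using video-specific constraints imposed by the RS input video. Our algorithm is composed of three main components: (i) dense temporal upsampling between consecutive RS frames using an off-the-shelf method (trained on regular video sequences), from which we extract GS "proposals"; (ii) learning to correctly merge an ensemble of such GS "proposals" using a dedicated MergeNet; (iii) a video-specific zero-shot optimization that imposes similarity of xt-patches between the GS output video and the RS input video. Our method obtains state-of-the-art results on benchmark datasets, both numerically and visually, despite being trained on a small synthetic RS/GS dataset. Moreover, it generalizes well to new, complex RS videos with motion types outside the distribution of the training set (e.g., complex non-rigid motions), videos that competing methods trained on much more data cannot handle well. We attribute these generalization capabilities to the combination of external and internal constraints.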
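The xt-slice observation can be checked on a toy example. The sketch below is not the authors' algorithm; it assumes an idealized camera whose rows are read out linearly over one full frame period (row `y` of a frame is captured `y` sub-frame ticks after row 0), and the synthetic `scene` function and all other names are hypothetical.

```python
H, W, FRAMES = 4, 6, 5  # rows, columns, number of frames (toy sizes)


def scene(tick, y, x):
    """Hypothetical scene: a bright vertical bar moving right one pixel per tick."""
    return 1 if x == tick % W else 0


def gs_frame(start_tick):
    """Global shutter: every row is sampled at the same instant."""
    return [[scene(start_tick, y, x) for x in range(W)] for y in range(H)]


def rs_frame(start_tick):
    """Rolling shutter: row y is sampled y ticks after the frame starts."""
    return [[scene(start_tick + y, y, x) for x in range(W)] for y in range(H)]


def xt_slice(video, y):
    """Fix a row y and stack it over time, giving a (time, x) image."""
    return [frame[y] for frame in video]


# One frame period lasts H ticks, so frame k starts at tick k * H.
rs_video = [rs_frame(k * H) for k in range(FRAMES)]
gs_video = [gs_frame(k * H) for k in range(FRAMES)]

# The xy frames differ: the vertical bar appears slanted in the RS frame.
frames_differ = rs_video[0] != gs_video[0]

# Each RS xt slice equals the xt slice of a GS video delayed by y ticks,
# that is, by a known sub-frame shift of y / H frame periods.
slices_match = all(
    xt_slice(rs_video, y)
    == xt_slice([gs_frame(k * H + y) for k in range(FRAMES)], y)
    for y in range(H)
)
print(frames_differ, slices_match)
```

Under these assumptions the match is exact by construction; with real footage the abstract only claims that the two videos "tend to" share their xt slices.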
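EC-KitY is a comprehensive Python library for doing evolutionary computation (EC), licensed under the GNU General Public License v3.0 and compatible with scikit-learn. Designed with modern software engineering and machine learning integration in mind, EC-KitY can support all popular EC paradigms, including genetic algorithms, genetic programming, coevolution, evolutionary multi-objective optimization, and more. This paper provides an overview of the package, including the ease of setting up an EC experiment, the architecture, the main features, and a comparison with other libraries.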
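The concept of DNA storage was first suggested in 1959 by Richard Feynman, who shared his vision of nanotechnology in the talk "There is plenty of room at the bottom". Later, towards the end of the 20th century, interest in storage solutions based on DNA molecules increased as a result of the Human Genome Project, which in turn led to significant progress in sequencing and assembly methods. DNA storage enjoys major advantages over the well-established magnetic and optical storage solutions. As opposed to magnetic solutions, DNA storage does not require an electrical supply to maintain data integrity, and it is superior to other storage solutions in both density and durability. Given the trend of decreasing costs for DNA synthesis and sequencing, it is now acknowledged that within the next 10 to 15 years DNA storage may become a highly competitive archiving technology, and probably later the main such technology. That said, current implementations of DNA-based storage systems are very limited and are not fully optimized to address the unique pattern of errors that characterizes the synthesis and sequencing processes. In this work, we propose a robust, efficient, and scalable solution for implementing DNA-based storage systems. Our method deploys deep neural networks (DNNs) that reconstruct a sequence of letters from an imperfect cluster of copies generated by the synthesis and sequencing processes. A tailor-made error-correcting code (ECC) is used to combat the patterns of errors that occur during this process. Since our reconstruction method is adapted to imperfect clusters, it overcomes the time bottleneck of clustering noisy DNA copies by allowing the use of a rapid and scalable pseudo-clustering instead. Our architecture combines convolutions and transformer blocks and is trained on synthetic data modelled after real data statistics.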
Graph Neural Networks (GNNs) are a family of graph networks inspired by mechanisms existing between nodes on a graph. In recent years there has been increased interest in GNNs and their derivatives, i.e., Graph Attention Networks (GAT), Graph Convolutional Networks (GCN), and Graph Recurrent Networks (GRN). An increase in their use in computer vision is also observed. The number of GNN applications in this field continues to expand; it includes video analysis and understanding, action and behavior recognition, computational photography, image and video synthesis from zero or few shots, and many more. This contribution aims to collect papers published about GNN-based approaches to computer vision. They are described and summarized from three perspectives. Firstly, we investigate the architectures of Graph Neural Networks and their derivatives used in this area to provide accurate and explainable recommendations for subsequent investigations. Secondly, we present the datasets used in these works. Finally, using graph analysis, we examine relations between GNN-based studies in computer vision and potential sources of inspiration identified outside of this field.
Anomaly analytics is a popular and vital task in various research contexts and has been studied for several decades. At the same time, deep learning has shown its capacity for solving many graph-based tasks such as node classification, link prediction, and graph classification. Recently, many studies have extended graph learning models to anomaly analytics problems, resulting in beneficial advances in graph-based anomaly analytics techniques. In this survey, we provide a comprehensive overview of graph learning methods for anomaly analytics tasks. We classify them into four categories based on their model architectures, namely graph convolutional network (GCN), graph attention network (GAT), graph autoencoder (GAE), and other graph learning models. The differences between these methods are also compared in a systematic manner. Furthermore, we outline several graph-based anomaly analytics applications across various real-world domains. Finally, we discuss five potential future research directions in this rapidly growing field.