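Graph convolutional networks (GCNs) aim to extend deep learning to arbitrary irregular domains, namely graphs. Their success depends heavily on how the topology of the input graphs is defined, and most existing GCN architectures rely on predefined or handcrafted graph structures. In this paper, we introduce a novel method that learns the topology (or connectivity) of input graphs as part of GCN design. The main contribution of our method resides in building an orthogonal connectivity basis that optimally aggregates nodes, through their neighborhoods, before convolution is applied. Our method also considers a stochasticity criterion that acts as a regularizer and makes the learned basis and the underlying GCNs lightweight while remaining highly effective. Experiments conducted on the challenging task of skeleton-based hand-gesture recognition show the high effectiveness of the learned GCNs w.r.t. the related work.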
translated by Google Translate
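Learning graph convolutional networks (GCNs) is an emerging field that aims to generalize convolutional operations to arbitrary non-regular domains. In particular, GCNs operating in the spatial domain show superior performance compared to spectral ones, but their success depends heavily on how the topology of the input graphs is defined. In this paper, we introduce a novel framework for graph convolutional networks that learns the topological properties of graphs. The design principle of our method is based on the optimization of a constrained objective function that learns not only the usual convolutional parameters of GCNs but also a transformation basis that conveys the most relevant topological relationships in these graphs. Experiments conducted on the challenging task of skeleton-based action recognition show the superiority of the proposed method compared to handcrafted graph design as well as the related work.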
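Spectral graph convolutional networks (GCNs) are particular deep models that aim to extend neural networks to arbitrary irregular domains. The principle of these networks consists in projecting graph signals using the eigendecomposition of their Laplacians, then filtering in the spectral domain, before back-projecting the resulting filtered signals onto the input graph domain. However, the success of these operations depends heavily on the relevance of the Laplacians used, which are mostly handcrafted, and this makes GCNs clearly suboptimal. In this paper, we introduce a novel spectral GCN that learns not only the usual convolutional parameters but also the Laplacian operators. The latter are designed "end-to-end" as part of a recursive Chebyshev decomposition, with the particularity of conveying both the differential and the non-differential properties of the learned representations (with increasing order and discrimination power) without overparametrizing the trained GCNs. Extensive experiments conducted on the challenging task of skeleton-based action recognition show the generalization ability and the outperformance of our proposed Laplacian design w.r.t. different baselines (built upon handcrafted and other learned Laplacians) as well as the related work.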
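The project, filter, back-project pipeline described in this abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses a fixed combinatorial Laplacian L = D - A rather than the learned Laplacians the paper proposes, and the names `spectral_filter` and `low_pass` are hypothetical.

```python
import numpy as np

def spectral_filter(adjacency, signal, filter_response):
    """Filter a graph signal in the eigenbasis of a fixed (not learned) Laplacian."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency                  # combinatorial Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(laplacian)    # L = U diag(eigvals) U^T
    projected = eigvecs.T @ signal                  # project onto the spectral domain
    filtered = filter_response(eigvals)[:, None] * projected
    return eigvecs @ filtered                       # back-project to the graph domain

# Toy example: a path graph with 4 nodes and a one-dimensional signal.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.array([[1.0], [0.0], [0.0], [0.0]])

low_pass = lambda lam: np.exp(-0.5 * lam)           # assumed heat-kernel style response
y = spectral_filter(A, x, low_pass)
```

The filter response here is hand-picked; in a trainable spectral GCN it would be parameterized and learned.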
In this paper, we design lightweight graph convolutional networks (GCNs) using a particular class of regularizers, dubbed as phase-field models (PFMs). PFMs exhibit a bi-phase behavior using a particular ultra-local term that allows training both the topology and the weight parameters of GCNs as a part of a single "end-to-end" optimization problem. Our proposed solution also relies on a reparametrization that pushes the mask of the topology towards binary values leading to effective topology selection and high generalization while implementing any targeted pruning rate. Both masks and weights share the same set of latent variables and this further enhances the generalization power of the resulting lightweight GCNs. Extensive experiments conducted on the challenging task of skeleton-based recognition show the outperformance of PFMs against other staple regularizers as well as related lightweight design methods.
Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module leverages dense cross-spacetime edges as skip connections for direct information propagation across the spatial-temporal graph. By coupling these proposals, we develop a powerful feature extractor named MS-G3D, based on which our model outperforms previous state-of-the-art methods on three large-scale datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400.
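Deep neural networks (DNNs) have recently achieved great success in computer vision and several related fields. Despite this progress, current neural architectures still suffer from catastrophic interference (a.k.a. forgetting), which prevents DNNs from learning continually. While several state-of-the-art methods have been proposed to mitigate forgetting, these existing solutions are either highly rigid (as with regularization) or time/memory demanding (as with replay). An intermediate class of methods, based on dynamic networks, has been proposed in the literature and provides a reasonable balance between task memorization and computational footprint. In this paper, we devise a dynamic network architecture for continual learning based on a novel forgetting-free neural block (FFNB). FFNB features are trained on new tasks using a novel procedure that constrains the underlying parameters to the null space of the previous tasks, while training the classifier parameters is equivalent to Fisher discriminant analysis. The latter provides an effective incremental process that is also optimal from a Bayesian perspective. The trained features and classifiers are further enhanced using incremental "end-to-end" fine-tuning. Extensive experiments conducted on different challenging classification problems show the high effectiveness of the proposed method.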
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
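Graph convolutional networks (GCNs) have proven to be a powerful concept that, over the past few years, has been successfully applied to a wide variety of tasks across many domains. In this work, we study the theory that paved the way for the definition of GCNs, including the relevant parts of classical graph theory. We also discuss and experimentally demonstrate key properties and limitations of GCNs, such as those caused by the statistical dependency of samples introduced by the edges of the graph, which biases the estimates of the full gradient. Another limitation we discuss is the negative impact of minibatch sampling on model performance. As a consequence, during parameter updates, gradients are computed on the whole dataset, which undermines scalability to large graphs. To address this, we investigate alternative methods that allow good parameters to be learned safely while sampling only a subset of the data at each iteration. We reproduce the results reported in the work of Kipf et al. and propose an implementation inspired by SIGN, which is a sampling-free minibatch method. Finally, we compare the two implementations on a benchmark dataset, showing that they are comparable in terms of prediction accuracy on the semi-supervised node classification task.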
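For readers unfamiliar with the propagation rule that the Kipf et al. model is built on, the following is a minimal NumPy sketch of a single layer, assuming the commonly cited form ReLU(D^{-1/2} (A + I) D^{-1/2} H W). It is not the implementation from this work; the function name `gcn_layer` and the toy inputs are illustrative assumptions.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph convolution layer with symmetric normalization and ReLU."""
    a_hat = adjacency + np.eye(adjacency.shape[0])       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))        # degrees are >= 1 after self-loops
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU activation

# Toy example: two connected nodes, identity features and weights.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
H = np.eye(2)
W = np.eye(2)
out = gcn_layer(A, H, W)
```

Because every output row depends on the neighbors of the corresponding node, splitting nodes into minibatches requires either sampling neighbors or precomputing the propagated features, which is the trade-off the abstract discusses.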
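Graph neural networks (GNNs) have gained increasing attention for representation learning in various machine learning tasks. However, most existing GNNs that apply neighborhood aggregation usually perform poorly on graphs with heterophily, where adjacent nodes belong to different classes. In this paper, we show that in typical heterophilous graphs the edges may be directed, and that whether the edges are treated as they are or simply made undirected greatly affects the performance of GNN models. Furthermore, because of heterophily, it is highly beneficial for nodes to receive messages from similar nodes beyond their local neighborhood. These observations motivate us to develop a model that adaptively learns the directionality of the graph and exploits the underlying long-distance correlations between nodes. We first generalize the graph Laplacian to digraphs based on the proposed feature-aware PageRank algorithm, which simultaneously considers the graph directionality and the long-distance feature similarity between nodes. The digraph Laplacian then defines a graph propagation matrix that leads to a model called DiglacianGCN. Based on this, we further leverage the node proximity measured by commute times between nodes in order to preserve the long-distance correlations of nodes at the topology level. Extensive experiments on ten datasets with different levels of homophily demonstrate the effectiveness of our method over existing solutions on the node classification task.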
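Graph convolutional networks (GCNs) outperform previous methods in the field of skeleton-based human action recognition, including the human-human interaction recognition task. However, when dealing with interaction sequences, current GCN-based methods simply split the two-person skeleton into two discrete sequences and perform graph convolution on each separately, in the manner of single-person action classification. Such an operation ignores rich interactive information and hinders effective spatial relationship modeling for semantic pattern learning. To overcome this shortcoming, we introduce a novel unified two-person graph that represents the spatial interaction correlations between joints. In addition, a properly designed graph labeling strategy is proposed to let our GCN model learn discriminative spatial-temporal interactive features. Experiments show accuracy improvements for both interactions and individual actions when using the proposed two-person graph topology. Finally, we propose a two-person graph convolutional network (2P-GCN). The proposed 2P-GCN achieves state-of-the-art results on four benchmarks of three interaction datasets (SBU, NTU-RGB+D, and NTU-RGB+D 120).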
Dynamics of human body skeletons convey significant information for human action recognition. Conventional approaches for modeling skeletons usually rely on hand-crafted parts or traversal rules, thus resulting in limited expressive power and difficulties of generalization. In this work, we propose a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data. This formulation not only leads to greater expressive power but also stronger generalization capability. On two large datasets, Kinetics and NTU-RGBD, it achieves substantial improvements over mainstream methods.
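We consider the problem of estimating the topology of multiple networks from nodal observations, where these networks are assumed to be drawn from the same (unknown) random graph model. We adopt a graphon as our random graph model, which is a nonparametric model from which graphs of potentially different sizes can be drawn. The versatility of graphons allows us to tackle the joint inference problem even when the graphs to be recovered contain different numbers of nodes and lack a precise alignment across graphs. Our solution is based on combining a maximum likelihood penalty with graphon estimation schemes and can be used to augment existing network inference methods. The proposed joint network and graphon estimation is further enhanced by introducing a robust method for noisy graph sampling information. We validate our proposed approach by comparing its performance against competing methods on synthetic and real-world datasets.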
Pre-publication draft of a book to be published by Morgan & Claypool publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
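Large, high-performing neural networks are often overparameterized and can be drastically reduced in size and complexity through pruning. Pruning is a group of methods that seek to remove redundant or unnecessary weights, or groups of weights, from a network. These techniques allow the creation of lightweight networks, which are particularly important for embedded or mobile applications. In this paper, we devise an alternative pruning method that allows effective subnetworks to be extracted from larger untrained ones. Our method is stochastic and extracts subnetworks by exploring different topologies that are sampled using Gumbel-Softmax. The latter is also used to train probability distributions that measure the relevance of weights in the sampled topologies. The resulting subnetworks are further enhanced using a highly efficient rescaling mechanism that reduces training time and improves performance. Extensive experiments conducted on CIFAR show that our subnetwork extraction method outperforms the related work.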
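As a rough illustration of how Gumbel-Softmax sampling can produce a relaxed keep/prune mask over weights, here is a minimal NumPy sketch. It is not the authors' procedure: the two-way [keep, prune] logit layout, the temperature value, and the function name `gumbel_softmax` are assumptions made for the example.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample per row of `logits`."""
    uniform = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(uniform))                # standard Gumbel noise
    scores = (logits + gumbel) / temperature
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = np.zeros((5, 2))          # hypothetical: 5 weights, columns = [keep, prune]
sample = gumbel_softmax(logits, temperature=0.5, rng=rng)
hard_mask = sample.argmax(axis=-1) == 0   # True where the weight is kept
```

Lower temperatures push each row of `sample` closer to a one-hot vector, so the relaxed mask approaches a discrete topology while staying differentiable with respect to the logits in an autodiff framework.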
Multi-view data containing complementary and consensus information can facilitate representation learning by exploiting the intact integration of multi-view features. Because most objects in the real world have underlying connections, organizing multi-view data as heterogeneous graphs helps extract latent information among different objects. Given its powerful capability to gather information from neighborhood nodes, in this paper we apply the Graph Convolutional Network (GCN) to heterogeneous-graph data originating from multi-view data, which is still under-explored in the GCN field. To improve the quality of the network topology and alleviate the interference of noise introduced by graph fusion, some methods perform sorting operations before the graph convolution procedure. These GCN-based methods generally sort and select the most confident neighborhood nodes for each vertex, such as picking the top-k nodes according to pre-defined confidence values. Nonetheless, this is problematic because of the non-differentiable sorting operators and inflexible graph embedding learning, which may result in blocked gradient computations and undesired performance. To cope with these issues, we propose a joint framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS), which consists of an adaptive graph fusion layer, a graph learning module and a differentiable node selection schema. MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network. The effectiveness of the proposed method is verified by rigorous comparisons with a considerable number of state-of-the-art approaches on multi-view semi-supervised classification tasks.
Deep learning has been shown to be successful in a number of domains, ranging from acoustics, images, to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, substantial research efforts have been devoted to applying deep learning methods to graphs, resulting in beneficial advances in graph analysis techniques. In this survey, we comprehensively review the different types of deep learning methods on graphs. We divide the existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods. We then provide a comprehensive overview of these methods in a systematic manner mainly by following their development history. We also analyze the differences and compositions of different methods. Finally, we briefly outline the applications in which they have been used and discuss potential future research directions.
In skeleton-based action recognition, graph convolutional networks (GCNs), which model the human body skeletons as spatiotemporal graphs, have achieved remarkable performance. However, in existing GCN-based methods, the topology of the graph is set manually, and it is fixed over all layers and input samples. This may not be optimal for the hierarchical GCN and diverse samples in action recognition tasks. In addition, the second-order information (the lengths and directions of bones) of the skeleton data, which is naturally more informative and discriminative for action recognition, is rarely investigated in existing methods. In this work, we propose a novel two-stream adaptive graph convolutional network (2s-AGCN) for skeleton-based action recognition. The topology of the graph in our model can be either uniformly or individually learned by the BP algorithm in an end-to-end manner. This data-driven method increases the flexibility of the model for graph construction and brings more generality to adapt to various data samples. Moreover, a two-stream framework is proposed to model both the first-order and the second-order information simultaneously, which shows notable improvement in recognition accuracy. Extensive experiments on the two large-scale datasets, NTU-RGBD and Kinetics-Skeleton, demonstrate that the performance of our model exceeds the state-of-the-art by a significant margin.
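The second-order information mentioned in this abstract (bone lengths and directions) can be derived from joint coordinates by differencing each joint with its parent. The NumPy sketch below assumes a hypothetical three-joint chain and a `parents` index array; the actual skeleton layout depends on the dataset and is not specified here.

```python
import numpy as np

def bone_features(joints, parents):
    """Return bone vectors (child minus parent) and their lengths."""
    bones = joints - joints[parents]          # the root, being its own parent, gets a zero bone
    lengths = np.linalg.norm(bones, axis=-1)
    return bones, lengths

# Hypothetical skeleton: joint 0 is the root, joint 1 hangs off 0, joint 2 off 1.
joints = np.array([[0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 1.0, 2.0]])
parents = np.array([0, 0, 1])
bones, lengths = bone_features(joints, parents)
```

In a two-stream setup, the `joints` array would feed one stream and the `bones` array the other, with the two predictions fused afterwards.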
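This paper proposes a new graph convolutional operator, called central difference graph convolution (CDGC), for skeleton-based action recognition. It is able to aggregate not only node information, as a vanilla graph convolution does, but also gradient information. Without introducing any additional parameters, CDGC can replace the vanilla graph convolution in any existing graph convolutional network (GCN). In addition, an accelerated version of CDGC is developed that greatly improves training speed. Experiments on two popular large-scale datasets, NTU RGB+D 60 and 120, demonstrate the efficacy of the proposed CDGC. Code is available at https://github.com/iesymiao/cd-gcn.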
Outstanding achievements of graph neural networks for spatiotemporal time series analysis show that relational constraints introduce an effective inductive bias into neural forecasting architectures. Often, however, the relational information characterizing the underlying data-generating process is unavailable and the practitioner is left with the problem of inferring from data which relational graph to use in the subsequent processing stages. We propose novel, principled - yet practical - probabilistic score-based methods that learn the relational dependencies as distributions over graphs while maximizing end-to-end the performance at task. The proposed graph learning framework is based on consolidated variance reduction techniques for Monte Carlo score-based gradient estimation, is theoretically grounded, and, as we show, effective in practice. In this paper, we focus on the time series forecasting problem and show that, by tailoring the gradient estimators to the graph learning problem, we are able to achieve state-of-the-art performance while controlling the sparsity of the learned graph and the computational scalability. We empirically assess the effectiveness of the proposed method on synthetic and real-world benchmarks, showing that the proposed solution can be used as a stand-alone graph identification procedure as well as a graph learning component of an end-to-end forecasting architecture.
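Spectral-based graph neural networks (SGNNs) have been attracting increasing attention in graph representation learning. However, existing SGNNs are limited to implementing graph filters with rigid transforms (e.g., graph Fourier or predefined graph wavelet transforms) and cannot adapt to the signals residing on graphs or to the task at hand. In this paper, we propose a novel class of graph neural networks that realizes graph filters with adaptive graph wavelets. Specifically, the adaptive graph wavelets are learned with neural-network-parameterized lifting structures, in which structure-aware lifting operations (i.e., prediction and update operations) are developed to jointly consider graph structures and node features. We propose to lift based on diffusion wavelets to alleviate the loss of structural information caused by partitioning non-bipartite graphs. By design, the locality and sparsity of the resulting wavelet transform, as well as the scalability of the lifting structure, are guaranteed. We further derive a soft-thresholding filtering operation by learning sparse graph representations in terms of the learned wavelets, which yields localized, efficient, and scalable wavelet-based graph filters. To ensure that the learned graph representations are invariant to node permutations, a layer is employed at the input of the networks to reorder the nodes according to their local topology information. We evaluate the proposed networks on both node-level and graph-level representation learning tasks on benchmark citation and bioinformatics graph datasets. Extensive experiments demonstrate the superiority of the proposed networks over existing SGNNs in terms of accuracy, efficiency, and scalability.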
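The soft-thresholding operation referred to above is, in its generic form, the shrinkage operator sign(c) * max(|c| - t, 0) applied to transform coefficients. The sketch below shows only this generic operator with a fixed, hand-chosen threshold; how the thresholds are learned and applied to wavelet coefficients in the proposed network is not specified in the abstract, and the names used are illustrative.

```python
import numpy as np

def soft_threshold(coefficients, threshold):
    """Shrink coefficients toward zero; magnitudes at or below `threshold` become zero."""
    return np.sign(coefficients) * np.maximum(np.abs(coefficients) - threshold, 0.0)

# Hypothetical wavelet coefficients and an assumed threshold of 0.5.
coeffs = np.array([-2.0, -0.3, 0.0, 0.5, 1.5])
sparse = soft_threshold(coeffs, 0.5)
```

Small coefficients are zeroed out and large ones are shrunk by the threshold, which is what makes the resulting representation sparse.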