Existing popular methods for semi-supervised learning with Graph Neural Networks (such as the Graph Convolutional Network) provably cannot learn a general class of neighborhood mixing relationships. To address this weakness, we propose a new model, MixHop, that can learn these relationships, including difference operators, by repeatedly mixing feature representations of neighbors at various distances. MixHop requires no additional memory or computational complexity, and outperforms on challenging baselines. In addition, we propose sparsity regularization that allows us to visualize how the network prioritizes neighborhood information across different graph datasets. Our analysis of the learned architectures reveals that neighborhood mixing varies per datasets. 1 We use "like", as graph edges are not axis-aligned.
translated by 谷歌翻译
Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data. Recently, GCNs and subsequent variants have shown superior performance in various application areas on real-world datasets. Despite their success, most of the current GCN models are shallow, due to the over-smoothing problem.In this paper, we study the problem of designing and analyzing deep graph convolutional networks. We propose the GCNII, an extension of the vanilla GCN model with two simple yet effective techniques: Initial residual and Identity mapping. We provide theoretical and empirical evidence that the two techniques effectively relieves the problem of over-smoothing. Our experiments show that the deep GCNII model outperforms the state-of-the-art methods on various semi-and fullsupervised tasks. Code is available at https: //github.com/chennnM/GCNII.
translated by 谷歌翻译
图形卷积是一种最近可扩展的方法,用于通过在多个层上汇总本地节点信息来对属性图进行深度特征学习。这样的层仅考虑向前模型中节点邻居的属性信息,并且不将全球网络结构的知识纳入学习任务。特别是,模块化功能提供了有关网络社区结构的方便信息。在这项工作中,我们通过将网络的社区结构保存目标纳入图卷积模型中,调查了对学习表示的质量的影响。我们通过在输出层中的成本函数中的明确正规化项和通过辅助层计算的附加损失项中通过两种方式结合目标。我们报告了在图形卷积体系结构中保存术语的社区结构的效果。对两个归因的分布图网络进行的实验评估表明,社区保护目标的合并提高了稀疏标签制度中的半监督节点分类精度。
translated by 谷歌翻译
We investigate the representation power of graph neural networks in the semisupervised node classification task under heterophily or low homophily, i.e., in networks where connected nodes may have different class labels and dissimilar features. Many popular GNNs fail to generalize to this setting, and are even outperformed by models that ignore the graph structure (e.g., multilayer perceptrons). Motivated by this limitation, we identify a set of key designs-ego-and neighbor-embedding separation, higher-order neighborhoods, and combination of intermediate representations-that boost learning from the graph structure under heterophily. We combine them into a graph neural network, H 2 GCN, which we use as the base method to empirically evaluate the effectiveness of the identified designs. Going beyond the traditional benchmarks with strong homophily, our empirical analysis shows that the identified designs increase the accuracy of GNNs by up to 40% and 27% over models without them on synthetic and real networks with heterophily, respectively, and yield competitive performance under homophily.
translated by 谷歌翻译
图形神经网络已成为从图形结构数据学习的不可缺少的工具之一,并且它们的实用性已在各种各样的任务中显示。近年来,建筑设计的巨大改进,导致各种预测任务的性能更好。通常,这些神经架构在同一层中使用可知的权重矩阵组合节点特征聚合和特征转换。这使得分析从各种跳过的节点特征和神经网络层的富有效力来挑战。由于不同的图形数据集显示在特征和类标签分布中的不同级别和异常级别,因此必须了解哪些特征对于没有任何先前信息的预测任务是重要的。在这项工作中,我们将节点特征聚合步骤和深度与图形神经网络分离,并经验分析了不同的聚合特征在预测性能中发挥作用。我们表明,并非通过聚合步骤生成的所有功能都很有用,并且通常使用这些较少的信息特征可能对GNN模型的性能有害。通过我们的实验,我们表明学习这些功能的某些子集可能会导致各种数据集的性能更好。我们建议使用Softmax作为常规器,并从不同跳距的邻居聚合的功能的“软选择器”;和L2 - GNN层的标准化。结合这些技术,我们呈现了一个简单浅的模型,特征选择图神经网络(FSGNN),并经验展示所提出的模型比九个基准数据集中的最先进的GNN模型实现了可比或甚至更高的准确性节点分类任务,具有显着的改进,可达51.1%。
translated by 谷歌翻译
图表学习目的旨在将节点内容与图形结构集成以学习节点/图表示。然而,发现许多现有的图形学习方法在具有高异性级别的数据上不能很好地工作,这是不同类标签之间很大比例的边缘。解决这个问题的最新努力集中在改善消息传递机制上。但是,尚不清楚异质性是否确实会损害图神经网络(GNNS)的性能。关键是要展现一个节点与其直接邻居之间的关系,例如它们是异性还是同质性?从这个角度来看,我们在这里研究了杂质表示在披露连接节点之间的关系之前/之后的杂音表示的作用。特别是,我们提出了一个端到端框架,该框架既学习边缘的类型(即异性/同质性),并利用边缘类型的信息来提高图形神经网络的表现力。我们以两种不同的方式实施此框架。具体而言,为了避免通过异质边缘传递的消息,我们可以通过删除边缘分类器鉴定的异性边缘来优化图形结构。另外,可以利用有关异性邻居的存在的信息进行特征学习,因此,设计了一种混合消息传递方法来汇总同质性邻居,并根据边缘分类使异性邻居多样化。广泛的实验表明,在整个同质级别的多个数据集上,通过在多个数据集上提出的框架对GNN的绩效提高了显着提高。
translated by 谷歌翻译
我们研究了深GCN模型中的自适应层图形卷积。我们建议ADAGPR在GCNII网络的每一层中学习通用的Pageranks,以诱导适应性卷积。我们表明,ADAGPR结合的概括是由归一化邻接矩阵的特征值谱的多项式按概括性Pagerank系数数量的顺序界定的。通过分析概括范围,我们表明过度厚度取决于汇总的较高阶段矩阵矩阵和模型深度。我们使用基准真实数据对节点分类进行了评估,并表明ADAGPR与现有的图形卷积网络相比提供了改进的精确度,同时证明了针对超平面的稳健性。此外,我们证明了对层概括的PageRanks系数的分析使我们能够在每个层上定性地了解模型解释的卷积。
translated by 谷歌翻译
Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, may inherit unnecessary complexity and redundant computation. In this paper, we reduce this excess complexity through successively removing nonlinearities and collapsing weight matrices between consecutive layers. We theoretically analyze the resulting linear model and show that it corresponds to a fixed low-pass filter followed by a linear classifier. Notably, our experimental evaluation demonstrates that these simplifications do not negatively impact accuracy in many downstream applications. Moreover, the resulting model scales to larger datasets, is naturally interpretable, and yields up to two orders of magnitude speedup over FastGCN.
translated by 谷歌翻译
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
translated by 谷歌翻译
图表可以模拟实体之间的复杂交互,它在许多重要的应用程序中自然出现。这些应用程序通常可以投入到标准图形学习任务中,其中关键步骤是学习低维图表示。图形神经网络(GNN)目前是嵌入方法中最受欢迎的模型。然而,邻域聚合范例中的标准GNN患有区分\ EMPH {高阶}图形结构的有限辨别力,而不是\ EMPH {低位}结构。为了捕获高阶结构,研究人员求助于主题和开发的基于主题的GNN。然而,现有的基于主基的GNN仍然仍然遭受较少的辨别力的高阶结构。为了克服上述局限性,我们提出了一个新颖的框架,以更好地捕获高阶结构的新框架,铰接于我们所提出的主题冗余最小化操作员和注射主题组合的新颖框架。首先,MGNN生成一组节点表示W.R.T.每个主题。下一阶段是我们在图案中提出的冗余最小化,该主题在彼此相互比较并蒸馏出每个主题的特征。最后,MGNN通过组合来自不同图案的多个表示来执行节点表示的更新。特别地,为了增强鉴别的功率,MGNN利用重新注射功能来组合表示的函数w.r.t.不同的主题。我们进一步表明,我们的拟议体系结构增加了GNN的表现力,具有理论分析。我们展示了MGNN在节点分类和图形分类任务上的七个公共基准上表现出最先进的方法。
translated by 谷歌翻译
Graph neural networks (GNNs) have been widely used under semi-supervised settings. Prior studies have mainly focused on finding appropriate graph filters (e.g., aggregation schemes) to generalize well for both homophilic and heterophilic graphs. Even though these approaches are essential and effective, they still suffer from the sparsity in initial node features inherent in the bag-of-words representation. Common in semi-supervised learning where the training samples often fail to cover the entire dimensions of graph filters (hyperplanes), this can precipitate over-fitting of specific dimensions in the first projection matrix. To deal with this problem, we suggest a simple and novel strategy; create additional space by flipping the initial features and hyperplane simultaneously. Training in both the original and in the flip space can provide precise updates of learnable parameters. To the best of our knowledge, this is the first attempt that effectively moderates the overfitting problem in GNN. Extensive experiments on real-world datasets demonstrate that the proposed technique improves the node classification accuracy up to 40.2 %
translated by 谷歌翻译
A prominent paradigm for graph neural networks is based on the message passing framework. In this framework, information communication is realized only between neighboring nodes. The challenge of approaches that use this paradigm is to ensure efficient and accurate \textit{long distance communication} between nodes, as deep convolutional networks are prone to over-smoothing. In this paper, we present a novel method based on time derivative graph diffusion (TIDE), with a learnable time parameter. Our approach allows to adapt the spatial extent of diffusion across different tasks and network channels, thus enabling medium and long-distance communication efficiently. Furthermore, we show that our architecture directly enables local message passing and thus inherits from the expressive power of local message passing approaches. We show that on widely used graph benchmarks we achieve comparable performance and on a synthetic mesh dataset we outperform state-of-the-art methods like GCN or GRAND by a significant margin.
translated by 谷歌翻译
Pre-publication draft of a book to be published byMorgan & Claypool publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
translated by 谷歌翻译
近年来,图表表示学习越来越多地引起了越来越长的关注,特别是为了在节点和图表水平上学习对分类和建议任务的低维嵌入。为了能够在现实世界中的大规模图形数据上学习表示,许多研究专注于开发不同的抽样策略,以方便培训过程。这里,我们提出了一种自适应图策略驱动的采样模型(GPS),其中通过自适应相关计算实现了本地邻域中每个节点的影响。具体地,邻居的选择是由自适应策略算法指导的,直接贡献到消息聚合,节点嵌入更新和图级读出步骤。然后,我们从各种角度对图表分类任务进行全面的实验。我们所提出的模型在几个重要的基准测试中优于现有的3%-8%,实现了现实世界数据集的最先进的性能。
translated by 谷歌翻译
Graph neural networks have shown significant success in the field of graph representation learning. Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations. Nevertheless, one layer of these neighborhood aggregation methods only consider immediate neighbors, and the performance decreases when going deeper to enable larger receptive fields. Several recent studies attribute this performance deterioration to the over-smoothing issue, which states that repeated propagation makes node representations of different classes indistinguishable. In this work, we study this observation systematically and develop new insights towards deeper graph neural networks. First, we provide a systematical analysis on this issue and argue that the key factor compromising the performance significantly is the entanglement of representation transformation and propagation in current graph convolution operations. After decoupling these two operations, deeper graph neural networks can be used to learn graph node representations from larger receptive fields. We further provide a theoretical analysis of the above observation when building very deep models, which can serve as a rigorous and gentle description of the over-smoothing issue. Based on our theoretical and empirical analysis, we propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields. A set of experiments on citation, coauthorship, and co-purchase datasets have confirmed our analysis and insights and demonstrated the superiority of our proposed methods. CCS CONCEPTS• Mathematics of computing → Graph algorithms; • Computing methodologies → Artificial intelligence; Neural networks.
translated by 谷歌翻译
我们研究了从高阶图卷积中的有效学习,并直接从邻接矩阵进行节点分类学习。我们重新访问缩放的图形残留网络,并从残留层中删除Relu激活,并在每个残留层上应用一个重量矩阵。我们表明,所得模型导致新的图卷积模型作为归一化邻接矩阵,残留权重矩阵和残差缩放参数的多项式。此外,我们提出了直接绘制多项式卷积模型和直接从邻接矩阵学习的自适应学习。此外,我们提出了完全自适应模型,以学习每个残留层的缩放参数。我们表明,所提出的方法的概括界限是特征值谱,缩放参数和残留权重的上限的多项式。通过理论分析,我们认为所提出的模型可以通过限制卷积的更高端口和直接从邻接矩阵学习来获得改进的概括界限。我们使用一套真实数据,我们证明所提出的方法获得了提高的非全粒图淋巴结分类的精度。
translated by 谷歌翻译
Advanced methods of applying deep learning to structured data such as graphs have been proposed in recent years. In particular, studies have focused on generalizing convolutional neural networks to graph data, which includes redefining the convolution and the downsampling (pooling) operations for graphs. The method of generalizing the convolution operation to graphs has been proven to improve performance and is widely used. However, the method of applying downsampling to graphs is still difficult to perform and has room for improvement. In this paper, we propose a graph pooling method based on selfattention. Self-attention using graph convolution allows our pooling method to consider both node features and graph topology. To ensure a fair comparison, the same training procedures and model architectures were used for the existing pooling methods and our method. The experimental results demonstrate that our method achieves superior graph classification performance on the benchmark datasets using a reasonable number of parameters.
translated by 谷歌翻译
图形卷积网络对于从图形结构数据进行深入学习而变得必不可少。大多数现有的图形卷积网络都有两个大缺点。首先,它们本质上是低通滤波器,因此忽略了图形信号的潜在有用的中和高频带。其次,固定了现有图卷积过滤器的带宽。图形卷积过滤器的参数仅转换图输入而不更改图形卷积滤波器函数的曲率。实际上,除非我们有专家领域知识,否则我们不确定是否应该在某个点保留或切断频率。在本文中,我们建议自动图形卷积网络(AUTOGCN)捕获图形信号的完整范围,并自动更新图形卷积过滤器的带宽。虽然它基于图谱理论,但我们的自动环境也位于空间中,并具有空间形式。实验结果表明,AutoGCN比仅充当低通滤波器的基线方法实现了显着改善。
translated by 谷歌翻译
消息传递已作为设计图形神经网络(GNN)的有效工具的发展。但是,消息传递的大多数现有方法简单地简单或平均所有相邻的功能更新节点表示。它们受到两个问题的限制,即(i)缺乏可解释性来识别对GNN的预测重要的节点特征,以及(ii)特征过度混合,导致捕获长期依赖和无能为力的过度平滑问题在异质或低同质的下方处理图。在本文中,我们提出了一个节点级胶囊图神经网络(NCGNN),以通过改进的消息传递方案来解决这些问题。具体而言,NCGNN表示节点为节点级胶囊组,其中每个胶囊都提取其相应节点的独特特征。对于每个节点级胶囊,开发了一个新颖的动态路由过程,以适应适当的胶囊,以从设计的图形滤波器确定的子图中聚集。 NCGNN聚集仅有利的胶囊并限制无关的消息,以避免交互节点的过度混合特征。因此,它可以缓解过度平滑的问题,并通过同粒或异质的图表学习有效的节点表示。此外,我们提出的消息传递方案本质上是可解释的,并免于复杂的事后解释,因为图形过滤器和动态路由过程确定了节点特征的子集,这对于从提取的子分类中的模型预测最为重要。关于合成和现实图形的广泛实验表明,NCGNN可以很好地解决过度光滑的问题,并为半监视的节点分类产生更好的节点表示。它的表现优于同质和异质的艺术状态。
translated by 谷歌翻译
Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. Design of strong baselines is an indispensable element of such works. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which emerged as a de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design fair evaluation experimental protocol and choose proper datasets collection. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive to many conclusions, which shed new light on performance and quality of novel algorithms. We investigate application of Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
translated by 谷歌翻译