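We introduce De Bruijn Graph Neural Networks (DBGNNs), a novel time-aware graph neural network architecture for time-resolved data on dynamic graphs. Our approach accounts for temporal-topological patterns that unfold in the causal topology of dynamic graphs, which is determined by causal walks, i.e. temporally ordered sequences of links by which nodes can influence each other over time. Our architecture builds on multiple layers of higher-order De Bruijn graphs, an iterative line graph construction in which nodes in a De Bruijn graph of order k represent walks of length k-1, while edges represent walks of length k. We develop a graph neural network architecture that uses De Bruijn graphs to implement a message passing scheme following non-Markovian dynamics, which enables us to learn patterns in the causal topology of a dynamic graph. Addressing the issue that De Bruijn graphs of different orders k can be used to model the same data set, we further apply statistical model selection to determine the optimal graph topology for message passing. An evaluation on synthetic and empirical data sets suggests that DBGNNs can leverage temporal patterns in dynamic graphs, which substantially improves performance in a supervised node classification task.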
Translated by Google Translate
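The relationship between walks and higher-order De Bruijn graphs described in the DBGNN abstract can be illustrated with a minimal sketch. It assumes the causal walks have already been extracted and are given as plain node sequences; the function name `de_bruijn_edges` and the toy walks are hypothetical and are not taken from the paper, which also covers walk extraction from time-stamped links, model selection, and message passing.

```python
from collections import Counter


def de_bruijn_edges(walks, k):
    """Count the edges of an order-k De Bruijn graph built from node sequences.

    Each higher-order node is a tuple of k original nodes (a walk of
    length k-1). Each edge joins two tuples that overlap in k-1 nodes,
    so it corresponds to a walk of length k in the original graph.
    """
    if k < 1:
        raise ValueError("order k must be at least 1")
    edges = Counter()
    for walk in walks:
        # Slide a window of k+1 nodes (a walk of length k) along the sequence.
        for i in range(len(walk) - k):
            window = tuple(walk[i:i + k + 1])
            edges[(window[:-1], window[1:])] += 1
    return edges


# Hypothetical causal walks: a reaches d via c, and b reaches e via c.
walks = [("a", "c", "d"), ("b", "c", "e"), ("a", "c", "d")]

first = de_bruijn_edges(walks, 1)   # ordinary first-order graph
second = de_bruijn_edges(walks, 2)  # second-order De Bruijn graph
```

In this toy input the first-order graph contains both c→d and c→e, so it cannot tell whether a can reach e through c. The second-order graph only contains the edges (a,c)→(c,d) and (b,c)→(c,e), which keeps the information that walks starting at a never continue to e.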
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
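Temporal graphs represent dynamic relationships among entities and occur in many real-life applications such as social networks, e-commerce, communication, road networks, and biological systems. They call for research on generative modeling and representation learning that goes beyond the work on static graphs. In this survey, we comprehensively review recently proposed neural approaches to time-dependent graph representation learning and generative modeling for temporal graphs. Finally, we identify the weaknesses of existing approaches and discuss the research proposal of our recently published paper TIGGER [24].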
Graphs are ubiquitous in nature and can therefore serve as models for many practical as well as theoretical problems. For this purpose, they can be defined as many different types that suitably reflect the individual contexts of the represented problem. To address cutting-edge problems based on graph data, the research field of Graph Neural Networks (GNNs) has emerged. Despite the field's youth and the speed at which new models are developed, many recent surveys have been published to keep track of them. Nevertheless, no survey has yet compiled which GNNs can process which types of graphs. In this survey, we give a detailed overview of existing GNNs and, unlike previous surveys, categorize them according to their ability to handle different graph types and properties. We consider GNNs operating on static and dynamic graphs of different structural constitutions, with or without node or edge attributes. Moreover, we distinguish between GNN models for discrete-time and continuous-time dynamic graphs and group the models according to their architecture. We find that there are still graph types that are not covered, or only rarely covered, by existing GNN models. We point out where models are missing and give potential reasons for their absence.
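Heterogeneous graph convolutional networks have gained great popularity in tackling various network analysis tasks on heterogeneous network data, ranging from link prediction to node classification. However, most existing works ignore the relation heterogeneity of multiplex networks between multi-typed nodes, as well as the differing importance of relations in meta-paths for node embedding, and therefore can hardly capture the heterogeneous structural signals across different relations. To tackle this challenge, this work proposes a Multiplex Heterogeneous Graph Convolutional Network (MHGCN) for heterogeneous network embedding. Our MHGCN can automatically learn useful heterogeneous meta-path interactions of different lengths in multiplex heterogeneous networks through multi-layer convolution aggregation. In addition, we effectively integrate both multi-relation structural signals and attribute semantics into the learned node embeddings under both unsupervised and semi-supervised learning paradigms. Extensive experiments on five real-world datasets with various network analysis tasks demonstrate the superiority of MHGCN over state-of-the-art embedding baselines on all evaluation metrics.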
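Biomedical networks are universal descriptors of systems of interacting elements, from protein interactions and disease networks all the way to healthcare systems and scientific knowledge. With the remarkable success of representation learning in providing powerful predictions and insights, we have witnessed a rapid expansion of representation learning techniques into modeling, analyzing, and learning with such networks. In this review, we put forward the observation that long-standing principles of networks in biology and medicine, while often unspoken in machine learning research, can provide the conceptual grounding for representation learning, explain its current successes and limitations, and inform future advances. We synthesize a spectrum of algorithmic approaches that, at their core, leverage graph topology to embed networks into compact vector spaces, and we capture the breadth of ways in which representation learning is proving useful. Areas of profound impact include identifying variants underlying complex traits, disentangling the behaviors of single cells and their effects on health, assisting in the diagnosis and treatment of patients, and developing safe and effective medicines.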
Pre-publication draft of a book to be published by Morgan & Claypool Publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
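Heterogeneous graphs have multiple node and edge types and are semantically richer than homogeneous graphs. To learn such complex semantics, many graph neural network approaches for heterogeneous graphs use metapaths to capture multi-hop interactions between nodes. Typically, features from non-target nodes are not incorporated into the learning procedure. However, there can be nonlinear, high-order interactions involving multiple nodes or edges. In this paper, we present the Simplicial Graph Attention Network (SGAT), a simplicial complex approach that represents such high-order interactions by placing features from non-target nodes on the simplices. We then use attention mechanisms and upper adjacencies to generate representations. We empirically demonstrate the efficacy of our approach with node classification tasks on heterogeneous graph datasets and further show SGAT's ability to extract structural information by employing random node features. Numerical experiments indicate that SGAT performs better than other current state-of-the-art heterogeneous graph learning methods.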
Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work.
Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. The design of strong baselines is an indispensable element of such work. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as a de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose a suitable collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions that shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
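Graph representation learning is a fast-growing field in which one of the primary goals is to generate meaningful representations of graphs in lower-dimensional spaces. The learned embeddings have been successfully applied to various prediction tasks, such as link prediction, node classification, clustering, and visualization. The collective effort of the graph learning community has delivered hundreds of methods, but no single method excels under all evaluation metrics such as prediction accuracy, running time, and scalability. This survey aims to evaluate all major classes of graph embedding methods by considering algorithmic variations, parameter selections, scalability, hardware and software platforms, downstream ML tasks, and diverse datasets. We organize graph embedding techniques using a taxonomy that includes methods from manual feature engineering, matrix factorization, shallow neural networks, and deep graph convolutional networks. We evaluate these classes of algorithms on node classification, link prediction, clustering, and visualization tasks using widely used benchmark graphs. We design our experiments on top of the PyTorch Geometric and DGL libraries and run them on different multicore CPU and GPU platforms. We rigorously scrutinize the performance of embedding methods under various performance metrics and summarize the results. This paper may therefore serve as a comparative guide to help users choose the methods most suitable for their tasks.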
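Graph neural networks (GNNs) have advanced representation learning across a variety of machine learning tasks. However, most existing GNNs that apply neighborhood aggregation usually perform poorly on graphs with heterophily, where adjacent nodes belong to different classes. In this paper, we show that in typical heterophilous graphs the edges may be directed, and that whether the edges are treated as they are or simply made undirected greatly affects the performance of GNN models. Furthermore, due to the limitations of heterophily, it is highly beneficial for nodes to receive messages from similar nodes beyond their local neighborhood. These observations motivate us to develop a model that adaptively learns the directionality of the graph and exploits the underlying long-distance correlations between nodes. We first generalize the graph Laplacian to digraphs based on the proposed Feature-Aware PageRank algorithm, which simultaneously considers graph directionality and long-distance feature similarity between nodes. The digraph Laplacian then defines a graph propagation matrix that leads to a model called DiglacianGCN. Building on this, we further leverage node proximity measured by commute times between nodes in order to preserve the nodes' long-distance correlations at the topology level. Extensive experiments on ten datasets with different levels of homophily demonstrate the effectiveness of our method over existing solutions on the node classification task.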
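Graph Neural Networks (GNNs) rely on the graph structure to define an aggregation strategy in which each node updates its representation by combining information from its neighbours. A known limitation of GNNs is that, as the number of layers increases, information gets smoothed and squashed and node embeddings become indistinguishable, which negatively affects performance. Practical GNN models therefore employ few layers and only leverage the graph structure within limited, small neighbourhoods around each node. Inevitably, practical GNNs do not capture information that depends on the global structure of the graph. While several works have studied the limitations and expressivity of GNNs, the question of whether practical applications on graph-structured data require global structural knowledge remains unanswered. In this work, we address this question empirically by giving several GNN models access to global information and observing the impact on downstream performance. Our results show that global information can in fact provide significant benefits for common graph-related tasks. We further identify a novel regularization strategy that leads to an average accuracy improvement of more than 5% on all considered tasks.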
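Recent advances in complex network analysis have opened up a wide range of possibilities for applications in diverse fields. The power of network analysis depends on node features. Topology-based node features are realizations of local and global spatial relations and of the node connectivity structure. Collecting correct information about node characteristics and the connectivity structure of neighboring nodes therefore plays the most prominent role in node classification and link prediction in complex network analysis. The present work introduces a new feature abstraction method, the Transition Probabilities Matrix (TPM), based on embedding anonymous random walks into feature vectors. The node feature vectors consist of transition probabilities obtained from sets of walks within a predefined radius. The transition probabilities are directly related to the local connectivity structure and are hence correctly embedded into the feature vectors. The proposed embedding method is tested on node identification/classification and link prediction on three commonly used real-world networks. In real-world networks, nodes with similar connectivity structures are common. Obtaining information from similar networks for predictions on new networks is therefore the distinguishing characteristic that makes the proposed algorithm superior to state-of-the-art algorithms on cross-network generalization tasks.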
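The anonymous random walks that the TPM abstract builds on can be illustrated with a short sketch. It shows only the commonly used definition of an anonymous walk, in which each node is replaced by the index of its first appearance in the walk; how the paper turns such walks into a transition probabilities matrix is not stated in the abstract and is not reproduced here. The function name `anonymize` and the example walks are hypothetical.

```python
def anonymize(walk):
    """Map a walk to its anonymous pattern of first-occurrence indices.

    Node identities are discarded: two walks with the same revisit
    structure yield the same pattern, even on different graphs.
    """
    first_seen = {}
    pattern = []
    for node in walk:
        if node not in first_seen:
            # Assign the next unused index to a node seen for the first time.
            first_seen[node] = len(first_seen)
        pattern.append(first_seen[node])
    return tuple(pattern)


# Two walks over different node names but with the same structure.
pattern_one = anonymize(["a", "b", "a", "c"])
pattern_two = anonymize(["x", "y", "x", "z"])
```

Because the pattern depends only on revisit structure and not on node labels, features derived from it can in principle be compared across networks, which fits the cross-network generalization setting the abstract describes.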
Node classification for graph-structured data aims to classify nodes whose labels are unknown. While studies on static graphs are prevalent, few studies have focused on dynamic graph node classification. Node classification on dynamic graphs is challenging for two reasons. First, the model needs to capture both structural and temporal information, particularly on dynamic graphs that have a long history and require large receptive fields. Second, model scalability becomes a significant concern as the size of the dynamic graph increases. To address these problems, we propose the Time Augmented Dynamic Graph Neural Network (TADGNN) framework. TADGNN consists of two modules: 1) a time augmentation module that captures the temporal evolution of nodes across time structurally, creating a time-augmented spatio-temporal graph, and 2) an information propagation module that learns the dynamic representations for each node across time using the constructed time-augmented graph. We perform node classification experiments on four dynamic graph benchmarks. Experimental results demonstrate that the TADGNN framework outperforms several static and dynamic state-of-the-art (SOTA) GNN models while demonstrating superior scalability. We also conduct theoretical and empirical analyses to validate the efficiency of the proposed method. Our code is available at https://sites.google.com/view/tadgnn.
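Simplicial neural networks (SNNs) have recently emerged as the newest direction in graph learning, expanding the idea of convolutional architectures from the node space to simplicial complexes on graphs. Instead of predominantly assessing pairwise relations among nodes, as in current practice, simplicial complexes allow us to describe higher-order interactions and multi-node graph structures. By building on the connection between the convolution operation and the new block Hodge-Laplacian, we propose the first SNN for link prediction. Our new Block Simplicial Complex Neural Networks (BScNets) model generalizes existing graph convolutional network (GCN) frameworks by systematically incorporating salient interactions among multiple higher-order graph structures of different dimensions. We discuss the theoretical foundations behind BScNets and illustrate its utility for link prediction on eight real-world and synthetic datasets. Our experiments indicate that BScNets outperforms state-of-the-art models by a significant margin while maintaining low computational cost. Finally, we show the utility of BScNets as a promising new alternative for tracking the spread of infectious diseases such as COVID-19 and for measuring the effectiveness of healthcare risk mitigation strategies.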
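Over the last two decades we have witnessed a huge increase in valuable big data structured in the form of graphs or networks. To apply traditional machine learning and data analysis techniques to such data, it is necessary to transform graphs into vector-based representations that preserve their most essential structural properties. For this purpose, a large number of graph embedding methods have been proposed in the literature. Most of them produce general-purpose embeddings suitable for a variety of applications such as node clustering, node classification, graph visualization, and link prediction. In this paper, we propose two novel graph embedding algorithms based on random walks that are specifically designed for the node classification problem. The random walk sampling strategies of the proposed algorithms are designed to pay special attention to hubs, the high-degree nodes that play the most critical role in large-scale graphs. The proposed methods are experimentally evaluated by analyzing the classification performance of three classification algorithms trained on embeddings of real-world networks. The results indicate that our methods considerably improve the predictive power of the examined classifiers compared with the currently most popular random walk method (node2vec).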
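Graphs can model complicated interactions between entities and arise naturally in many important applications. These applications can often be cast as standard graph learning tasks, in which a crucial step is to learn low-dimensional graph representations. Graph neural networks (GNNs) are currently the most popular model among graph embedding approaches. However, standard GNNs in the neighborhood aggregation paradigm have limited discriminative power in distinguishing high-order graph structures as opposed to low-order structures. To capture high-order structures, researchers have resorted to motifs and developed motif-based GNNs. However, existing motif-based GNNs still often suffer from limited discriminative power on high-order structures. To overcome these limitations, we propose MGNN, a novel framework that better captures high-order structures and hinges on our proposed motif redundancy minimization operator and injective motif combination. First, MGNN produces a set of node representations with respect to each motif. The next phase is our proposed redundancy minimization among motifs, which compares the motifs with each other and distills the features unique to each motif. Finally, MGNN updates the node representations by combining multiple representations from different motifs. In particular, to enhance discriminative power, MGNN uses an injective function to combine the representations with respect to different motifs. We further show, through a theoretical analysis, that our proposed architecture increases the expressive power of GNNs. We demonstrate that MGNN outperforms state-of-the-art methods on seven public benchmarks on both node classification and graph classification tasks.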
Clustering is a fundamental problem in network analysis that finds closely connected groups of nodes and separates them from other nodes in the graph, while link prediction aims to predict whether two nodes in a network are likely to have a link. The definitions of both naturally imply that clustering should play a positive role in obtaining accurate link predictions. Yet researchers have long ignored this positive relationship or undermined it with inappropriate methods. In this article, we construct a simple but efficient clustering-driven link prediction framework (ClusterLP), with the goal of directly exploiting cluster structures to predict connections between nodes as accurately as possible in both undirected and directed graphs. Specifically, we propose that in undirected graphs it is easier to establish links between nodes with similar representation vectors and cluster tendencies, while in directed graphs nodes more easily point to nodes that are similar to their own representation vectors and have greater influence in their own cluster. We customize the implementation of ClusterLP for undirected and directed graphs, respectively, and experimental results on the link prediction task using multiple real-world networks show that our models are highly competitive with existing baseline models. The code for ClusterLP and the baselines we use is available at https://github.com/ZINUX1998/ClusterLP.
Recently, graph neural networks (GNNs) have revolutionized the field of graph representation learning through effectively learned node embeddings, and achieved state-of-the-art results in tasks such as node classification and link prediction. However, current GNN methods are inherently flat and do not learn hierarchical representations of graphs, a limitation that is especially problematic for the task of graph classification, where the goal is to predict the label associated with an entire graph. Here we propose DIFFPOOL, a differentiable graph pooling module that can generate hierarchical representations of graphs and can be combined with various graph neural network architectures in an end-to-end fashion. DIFFPOOL learns a differentiable soft cluster assignment for nodes at each layer of a deep GNN, mapping nodes to a set of clusters, which then form the coarsened input for the next GNN layer. Our experimental results show that combining existing GNN methods with DIFFPOOL yields an average improvement of 5-10% accuracy on graph classification benchmarks, compared to all existing pooling approaches, achieving a new state-of-the-art on four out of five benchmark data sets.
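The soft cluster assignment mentioned in the DIFFPOOL abstract can be sketched in a few lines. The sketch assumes the usual coarsening form, in which a row-wise softmax over per-node cluster scores gives an assignment matrix S, pooled features are S^T X, and the pooled adjacency is S^T A S. In the actual method the scores and node features come from GNN layers and are learned; here they are fixed, hypothetical numbers, and the helper names are illustrative only.

```python
import math


def softmax_rows(scores):
    """Turn each row of raw scores into a probability distribution."""
    result = []
    for row in scores:
        shift = max(row)  # subtract the max for numerical stability
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(left, right):
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*right)]
            for row in left]


def coarsen(adj, feats, scores):
    """Pool a graph with a soft assignment: A' = S^T A S and X' = S^T X."""
    assign = softmax_rows(scores)          # n x c soft cluster assignment
    assign_t = transpose(assign)           # c x n
    pooled_feats = matmul(assign_t, feats)
    pooled_adj = matmul(matmul(assign_t, adj), assign)
    return pooled_adj, pooled_feats


# Hypothetical 4-node path graph pooled into 2 clusters.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
scores = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]

pooled_adj, pooled_feats = coarsen(adj, feats, scores)
```

Because every row of the assignment matrix sums to one, the total edge weight and the per-column feature sums are preserved by the pooling step, which is a convenient sanity check for an implementation.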