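Although transactions in cryptocurrencies such as Ethereum are becoming increasingly common, fraud and other criminal transactions are not rare. Graph analysis algorithms and machine learning techniques detect suspicious transactions that lead to phishing in large transaction networks. Many graph neural network (GNN) models have been proposed to apply deep learning techniques to graph structures. Although phishing detection with GNN models has been studied on the Ethereum transaction network, models that address the scale of the vertex and edge counts and the label imbalance have not yet been studied. In this paper, we compare the performance of GNN models on an actual Ethereum transaction network dataset with phishing-reported label data, to exhaustively compare and verify which GNN models and hyperparameters produce the best accuracy. Specifically, we evaluate the performance of representative homogeneous GNN models, which consider single-type nodes and edges, and heterogeneous GNN models, which support different types of nodes and edges. We show that heterogeneous models achieve better performance than homogeneous models. In particular, the RGCN model achieves the best performance in the overall metrics.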
translated by 谷歌翻译
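Heterogeneous graph convolutional networks have gained great popularity in tackling various network analysis tasks on heterogeneous network data, ranging from link prediction to node classification. However, most existing works ignore the relation heterogeneity of multiplex networks between multi-typed nodes, as well as the differing importance of relations within meta-paths for node embedding, and so can hardly capture the heterogeneous structural signals across different relations. To tackle this challenge, this work proposes a Multiplex Heterogeneous Graph Convolutional Network (MHGCN) for heterogeneous network embedding. Our MHGCN can automatically learn useful heterogeneous meta-path interactions of different lengths in multiplex heterogeneous networks through multi-layer convolution aggregation. In addition, we effectively integrate both multi-relation structural signals and attribute semantics into the learned node embeddings, under both unsupervised and semi-supervised learning paradigms. Extensive experiments on five real-world datasets with various network analysis tasks demonstrate the superiority of MHGCN over state-of-the-art embedding baselines on all evaluation metrics.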
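Heterogeneous graph neural networks (HGNNs) have been blossoming in recent years, but the unique data processing and evaluation setups used by each work obstruct a full understanding of their advancements. In this work, we present a systematic reproduction of 12 recent HGNNs using their official code, datasets, settings, and hyperparameters, revealing surprising findings about the progress of HGNNs. We find that simple homogeneous GNNs, such as GCN and GAT, are largely underestimated due to improper settings. GAT with proper inputs can generally match or outperform all existing HGNNs across various scenarios. To facilitate robust and reproducible HGNN research, we construct the Heterogeneous Graph Benchmark (HGB), consisting of 11 diverse datasets with three tasks. HGB standardizes the process of heterogeneous graph data splits, feature processing, and performance evaluation. Finally, we introduce a simple but very strong baseline, Simple-HGN, which significantly outperforms all previous models on HGB, to accelerate the advancement of HGNNs in the future.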
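Heterogeneous graphs have multiple node and edge types and are semantically richer than homogeneous graphs. To learn such complex semantics, many graph neural network approaches for heterogeneous graphs use metapaths to capture multi-hop interactions between nodes. Typically, features from non-target nodes are not incorporated into the learning procedure. However, there can be nonlinear, high-order interactions involving multiple nodes or edges. In this paper, we present the Simplicial Graph Attention Network (SGAT), a simplicial complex approach that represents such high-order interactions by placing features from non-target nodes on the simplices. We then use attention mechanisms and upper adjacencies to generate representations. We empirically demonstrate the efficacy of our approach with node classification tasks on heterogeneous graph datasets, and further show SGAT's ability to extract structural information by employing random node features. Numerical experiments indicate that SGAT performs better than other current state-of-the-art heterogeneous graph learning methods.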
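Many real-world graphs (networks) are heterogeneous, with different types of nodes and edges. Heterogeneous graph embedding, which aims to learn low-dimensional node representations of heterogeneous graphs, is vital for various downstream applications. Many meta-path-based embedding methods have been proposed in recent years to learn the semantic information of heterogeneous graphs. However, most existing techniques overlook graph structure information when learning heterogeneous graph embeddings. This paper proposes a novel Structure-Aware Heterogeneous Graph Neural Network (SHGNN) to address this limitation. In detail, we first use a feature propagation module to capture the local structure information of intermediate nodes in the meta-path. Next, we use a tree-attention aggregator to incorporate the graph structure information into the aggregation module on the meta-path. Finally, we use a meta-path aggregator to fuse the information aggregated from different meta-paths. We conduct experiments on node classification and clustering tasks and achieve state-of-the-art results on benchmark datasets, which shows the effectiveness of our proposed method.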
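At online retail platforms, it is crucial to actively detect transaction risks in order to improve customer experience and minimize financial loss. In this work, we propose xFraud, an explainable fraud transaction prediction framework that is mainly composed of a detector and an explainer. The xFraud detector can effectively and efficiently predict the legitimacy of incoming transactions. Specifically, it uses a heterogeneous graph neural network to learn expressive representations from the informative, heterogeneously typed entities in the transaction logs. The explainer in xFraud can generate meaningful and human-understandable explanations from graphs to facilitate further processes in the business unit. In our experiments with xFraud on real transaction networks with up to 1.1 billion nodes and 3.7 billion edges, xFraud outperforms various baseline models on many evaluation metrics while remaining scalable in distributed settings. In addition, we show through both quantitative and qualitative evaluations that the xFraud explainer can generate reasonable explanations that significantly assist business analysis.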
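Owing to the success of graph neural networks (GNNs) and the wide application of heterogeneous information networks, heterogeneous graph learning has attracted significant attention in recent years. Various heterogeneous graph neural networks have been proposed to generalize GNNs to heterogeneous graphs. Unfortunately, these approaches model heterogeneity through various complicated modules. This paper aims to propose a simple yet effective framework that gives homogeneous GNNs adequate ability to handle heterogeneous graphs. Specifically, we propose Relation Embedding based Graph Neural Networks (RE-GNNs), which use only one parameter per relation to embed the importance of edge-type relations and self-loop connections. To optimize these relation embeddings and the other parameters simultaneously, a gradient scaling factor is proposed to constrain the embeddings so that they converge to suitable values. In addition, we theoretically demonstrate that our RE-GNNs have more expressive power than meta-path-based heterogeneous GNNs. Extensive experiments on node classification tasks validate the effectiveness of our proposed method.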
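The idea of weighting each edge type by a single scalar can be illustrated with a minimal sketch. This is not the RE-GNN implementation: the function name `relation_weighted_aggregate`, the toy graph, and the fixed weights are all hypothetical, and the learned feature transform, normalization, and gradient scaling factor mentioned in the abstract are omitted.

```python
def relation_weighted_aggregate(features, edges, relation_weight, self_loop_weight):
    """One message-passing step in which every edge type has one scalar weight.

    features: dict mapping node -> list of floats
    edges: list of (source, target, relation) triples
    relation_weight: dict mapping relation -> scalar (learned in practice, fixed here)
    self_loop_weight: scalar applied to a node's own features
    """
    # Start from the self-loop contribution of every node.
    out = {node: [self_loop_weight * x for x in h] for node, h in features.items()}
    # Add each neighbor's features, scaled by the weight of the edge's relation.
    for source, target, relation in edges:
        w = relation_weight[relation]
        for i, x in enumerate(features[source]):
            out[target][i] += w * x
    return out


# Hypothetical two-node graph with a single "cites" relation.
features = {"a": [1.0, 0.0], "b": [0.0, 2.0]}
edges = [("a", "b", "cites")]
updated = relation_weighted_aggregate(features, edges, {"cites": 0.5}, 1.0)
```

In this sketch a homogeneous aggregation rule becomes relation-aware only through the per-relation scalars, which is the design point the abstract emphasizes.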
Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges belong to the same types, which makes them unable to represent heterogeneous structures. In this paper, we present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs. To model heterogeneity, we design node- and edge-type dependent parameters to characterize the heterogeneous attention over each edge, empowering HGT to maintain dedicated representations for different types of nodes and edges. To handle dynamic heterogeneous graphs, we introduce the relative temporal encoding technique into HGT, which is able to capture the dynamic structural dependency with arbitrary durations. To handle Web-scale graph data, we design the heterogeneous mini-batch graph sampling algorithm, HGSampling, for efficient and scalable training. Extensive experiments on the Open Academic Graph of 179 million nodes and 2 billion edges show that the proposed HGT model consistently outperforms all the state-of-the-art GNN baselines by 9%-21% on various downstream tasks. The dataset and source code of HGT are publicly available at https://github.com/acbull/pyHGT.
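Because heterogeneous graphs are ubiquitous in both academia and industry, researchers have recently proposed many heterogeneous graph neural networks (HGNNs). In this paper, instead of pursuing a more powerful HGNN model, we are interested in devising a versatile plug-and-play module that distills relational knowledge from pre-trained HGNNs. To the best of our knowledge, we are the first to propose a HIgh-order RElational (HIRE) knowledge distillation framework on heterogeneous graphs, which can significantly boost prediction performance regardless of the model architecture of the HGNN. Concretely, our HIRE framework first performs first-order node-level knowledge distillation, which encodes the semantics of the teacher HGNN through its prediction logits. Meanwhile, the second-order relation-level knowledge distillation imitates the relational correlation between node embeddings of different types generated by the teacher HGNN. Extensive experiments on various popular HGNN models and three real-world heterogeneous graphs demonstrate that our method obtains consistent and considerable performance gains, proving its effectiveness and generalization ability.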
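Node-level distillation from prediction logits can be sketched as follows. The abstract does not state the loss, so this sketch assumes the commonly used temperature-softened KL divergence between teacher and student distributions; the function names and the temperature value are illustrative choices, not taken from HIRE.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, computed stably by subtracting the maximum."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def node_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened class distributions for one node."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# A student that matches the teacher incurs zero loss; a mismatched one does not.
matched = node_distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatched = node_distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

In training, a term like this would typically be averaged over nodes and added to the student's supervised loss.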
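Polypharmacy, defined as the use of multiple drugs together, is a standard treatment method, especially for severe and chronic diseases. However, using multiple drugs together may cause interactions between drugs. A drug-drug interaction (DDI) is the activity that occurs when the effect of one drug changes when it is combined with another. DDIs may obstruct, increase, or decrease the intended effect of a drug or, in the worst case, create adverse side effects. While it is critical to detect DDIs in time, identifying them in clinical trials is time-consuming and expensive because of the trials' short duration and the many possible drug pairs that must be considered for testing. As a result, computational methods are needed to predict DDIs. In this paper, we present a novel heterogeneous graph attention model, HAN-DDI, to predict drug-drug interactions. We build a network of drugs together with different biological entities. We then develop a heterogeneous graph attention network that learns DDIs from the relations between drugs and other entities. It consists of an attention-based heterogeneous graph node encoder for obtaining drug node representations and a decoder for predicting drug-drug interactions. Furthermore, we conduct comprehensive experiments to evaluate our model and compare it with state-of-the-art models. Experimental results show that our proposed method, HAN-DDI, significantly outperforms the compared models and accurately predicts DDIs, even for new drugs.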
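The last three years have seen rising research interest in heterogeneous graph neural networks (HGNNs). Most existing HGNNs fall into two classes. One class is meta-path-based HGNNs, which either require domain knowledge to handcraft meta-paths or consume large amounts of time and memory to construct meta-paths automatically. The other class does not rely on meta-path construction. It takes homogeneous convolutional graph neural networks (Conv-GNNs) as backbones and extends them to heterogeneous graphs by introducing node-type- and edge-type-dependent parameters. Regardless of meta-path dependency, most existing HGNNs employ shallow Conv-GNNs such as GCN and GAT to aggregate neighborhood information, and may have limited capability to capture information from high-order neighborhoods. In this work, we propose two heterogeneous graph tree network models, the Heterogeneous Graph Tree Convolutional Network (HetGTCN) and the Heterogeneous Graph Tree Attention Network (HetGTAN), which do not rely on meta-paths to encode heterogeneity in both node features and graph structure. Extensive experiments on three real-world heterogeneous graph datasets demonstrate that the proposed HetGTCN and HetGTAN are efficient, consistently outperform all state-of-the-art HGNN baselines on semi-supervised node classification tasks, and can go deep without compromising performance.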
Graph Neural Networks (GNNs), originally proposed for node classification, have also motivated many recent works on edge prediction (a.k.a. link prediction). However, existing methods lack elaborate design regarding the distinctions between the two tasks, which have been frequently overlooked: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervision (i.e., labels) in the edge prediction task; (ii) node classification makes a prediction for each individual node, while edge prediction is determined by each pair of nodes. To this end, we propose a novel edge prediction paradigm named Edge-aware Message PassIng neuRal nEtworks (EMPIRE). Concretely, we first introduce an edge splitting technique to specify the use of each edge, where each edge is solely used as either the topology or the supervision (named topology edge or supervision edge). We then develop a new message passing mechanism that generates the messages to source nodes (through topology edges) while being aware of target nodes (through supervision edges). In order to emphasize the differences between pairs connected by supervision edges and unconnected pairs, we further weight the messages to highlight the relative ones that can reflect the differences. In addition, we design a novel negative node-pair sampling trick that efficiently samples 'hard' negative instances in the supervision instances, and can significantly improve the performance. Experimental results verify that the proposed method can significantly outperform existing state-of-the-art models on the edge prediction task on multiple homogeneous and heterogeneous graph datasets.
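The edge splitting step, in which every edge serves either as topology or as supervision but never both, can be sketched as a random partition. The abstract does not say how the split is chosen, so the uniform shuffle, the `supervision_ratio` parameter, and the function name `split_edges` are assumptions made for illustration.

```python
import random


def split_edges(edges, supervision_ratio=0.5, seed=0):
    """Partition edges into disjoint topology and supervision sets.

    Topology edges would carry messages; supervision edges would act as labels.
    """
    rng = random.Random(seed)  # local generator keeps the split reproducible
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_supervision = int(len(shuffled) * supervision_ratio)
    supervision = shuffled[:n_supervision]
    topology = shuffled[n_supervision:]
    return topology, supervision


# Hypothetical toy graph with ten undirected edges given as node pairs.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
             (0, 2), (1, 3), (2, 4), (3, 0), (4, 1)]
topology_edges, supervision_edges = split_edges(toy_edges, supervision_ratio=0.5)
```

Keeping the two sets disjoint means a supervision edge is never visible to message passing, so the model cannot read a label off the graph structure.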
Graphs are ubiquitous in nature and can therefore serve as models for many practical but also theoretical problems. For this purpose, they can be defined as many different types which suitably reflect the individual contexts of the represented problem. To address cutting-edge problems based on graph data, the research field of Graph Neural Networks (GNNs) has emerged. Despite the field's youth and the speed at which new models are developed, many recent surveys have been published to keep track of them. Nevertheless, no survey has yet gathered which GNNs can process which graph types. In this survey, we give a detailed overview of existing GNNs and, unlike previous surveys, categorize them according to their ability to handle different graph types and properties. We consider GNNs operating on static and dynamic graphs of different structural constitutions, with or without node or edge attributes. Moreover, we distinguish between GNN models for discrete-time or continuous-time dynamic graphs and group the models according to their architecture. We find that there are still graph types that are not or only rarely covered by existing GNN models. We point out where models are missing and give potential reasons for their absence.
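Because they preserve individual features and complicated relations, graph data are widely used and studied. Graph neural network (GNN) models, which can capture structural information by updating and aggregating node representations, are gaining popularity. In the financial context, graphs are constructed from real-world data, which leads to complex graph structures and thus requires sophisticated methods. In this work, we provide a comprehensive review of GNN models in recent financial contexts. We first categorize the commonly used financial graphs and summarize the feature processing steps for each node. We then summarize the GNN methodology for each graph type and the applications in each area, and propose some potential research areas.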
Graph neural networks, as a powerful graph representation technique based on deep learning, have shown superior performance and attracted considerable research interest. However, they have not been fully explored for heterogeneous graphs, which contain different types of nodes and links. The heterogeneity and rich semantic information bring great challenges for designing a graph neural network for heterogeneous graphs. Recently, one of the most exciting advancements in deep learning is the attention mechanism, whose great potential has been well demonstrated in various areas. In this paper, we first propose a novel heterogeneous graph neural network based on hierarchical attention, including node-level and semantic-level attention. Specifically, the node-level attention aims to learn the importance between a node and its meta-path-based neighbors, while the semantic-level attention is able to learn the importance of different meta-paths. With the learned importance from both node-level and semantic-level attention, the importance of nodes and meta-paths can be fully considered. Then the proposed model can generate node embeddings by aggregating features from meta-path-based neighbors in a hierarchical manner. Extensive experimental results on three real-world heterogeneous graphs not only show the superior performance of our proposed model over the state of the art, but also demonstrate its potentially good interpretability for graph analysis.
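The semantic-level step, which weighs meta-paths against each other and fuses their embeddings, can be sketched with a softmax over per-meta-path importance scores. In the model these scores would be produced by a learned attention component; here they are passed in directly, and the meta-path names `APA` and `APCPA` as well as the function name are hypothetical.

```python
import math


def fuse_metapath_embeddings(metapath_embeddings, metapath_scores):
    """Fuse one node's per-meta-path embeddings using softmax attention weights.

    metapath_embeddings: dict mapping meta-path name -> list of floats
    metapath_scores: dict mapping meta-path name -> unnormalized importance
    """
    names = list(metapath_embeddings)
    peak = max(metapath_scores[name] for name in names)
    exps = {name: math.exp(metapath_scores[name] - peak) for name in names}
    total = sum(exps.values())
    weights = {name: exps[name] / total for name in names}
    dim = len(metapath_embeddings[names[0]])
    # Weighted sum of the meta-path-specific embeddings, one dimension at a time.
    fused = [sum(weights[name] * metapath_embeddings[name][i] for name in names)
             for i in range(dim)]
    return weights, fused


embeddings = {"APA": [1.0, 0.0], "APCPA": [0.0, 1.0]}
weights, fused = fuse_metapath_embeddings(embeddings, {"APA": 0.0, "APCPA": 0.0})
```

With equal scores the fusion reduces to a plain average; raising one meta-path's score shifts the fused embedding toward that meta-path, and those weights are what would be inspected for interpretability.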
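This paper aims to provide a novel design of a multiscale framelet convolution for spectral graph neural networks. In the spectral paradigm, spectral GNNs improve performance on graph learning tasks by proposing various spectral filters in the spectral domain to capture both global and local graph structure information. Although existing spectral approaches show superior performance on some graphs, they lack flexibility and are fragile when graph information is incomplete or perturbed. Our new framelet convolution incorporates filtering functions designed directly in the spectral domain to overcome these limitations. The proposed convolution shows great flexibility in cutting off spectral information and effectively mitigates the negative effect of noisy graph signals. In addition, to exploit the heterogeneity in real-world graph data, a heterogeneous graph neural network with our new framelet convolution provides a solution for embedding the intrinsic topological information of meta-paths with a multi-level graph analysis. Extensive experiments have been conducted on real-world heterogeneous and homogeneous graphs under settings with noisy node features, with superior performance results.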
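Social bots are automated accounts on social networks that attempt to behave like humans. Although graph neural networks (GNNs) have been widely applied to social bot detection, state-of-the-art approaches rely heavily on domain expertise and prior knowledge to design a dedicated neural network architecture for a specific classification task. Involving oversized nodes and network layers in the model design, however, usually causes the over-smoothing problem and a lack of embedding discrimination. In this paper, we propose RoSGAS, a novel Reinforced and Self-supervised GNN Architecture Search framework that adaptively pinpoints the most suitable multi-hop neighborhood and the number of layers in the GNN architecture. More specifically, we treat social bot detection as a user-centric subgraph embedding and classification task. We exploit a heterogeneous information network to represent user connectivity by leveraging account metadata, relationships, behavioral features, and content features. RoSGAS uses a multi-agent deep reinforcement learning (RL) mechanism to navigate the search for the optimal neighborhood and network layers, learning the subgraph embedding for each target user individually. A nearest-neighbor mechanism is developed to accelerate the RL training process, and RoSGAS can learn more discriminative subgraph embeddings with the aid of self-supervised learning. Experiments on 5 Twitter datasets show that RoSGAS outperforms state-of-the-art approaches in terms of accuracy, training efficiency, and stability, and generalizes better when handling unseen samples.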
In recent years, semi-supervised graph learning with data augmentation (DA) has become the most commonly used and best-performing method to enhance model robustness in sparse scenarios with few labeled samples. Compared with homogeneous graphs, DA in heterogeneous graphs faces greater challenges: the heterogeneity of information requires DA strategies to effectively handle heterogeneous relations, which means considering the information contribution of different types of neighbors and edges to the target nodes. Furthermore, over-squashing of information is caused by the negative curvature formed by the non-uniform distribution and strong clustering in complex graphs. To address these challenges, this paper presents a novel method named Semi-Supervised Heterogeneous Graph Learning with Multi-level Data Augmentation (HG-MDA). For the problem of heterogeneity of information in DA, node and topology augmentation strategies are proposed for the characteristics of heterogeneous graphs, and meta-relation-based attention is applied as one of the indexes for selecting augmented nodes and edges. For the problem of over-squashing of information, triangle-based edge adding and removing are designed to alleviate the negative curvature and bring the gain of topology. Finally, the loss function consists of the cross-entropy loss for labeled data and the consistency regularization for unlabeled data. In order to effectively fuse the prediction results of the various DA strategies, sharpening is used. Experiments on public datasets, i.e., ACM, DBLP, and OGB, and on the industry dataset MB show that HG-MDA outperforms current SOTA models. Additionally, HG-MDA is applied to user identification in internet finance scenarios, helping the business to add 30% key users, and increase loans and balances by 3.6%, 11.1%, and 9.8%.
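Fusing predictions from several augmentations and then sharpening them can be sketched as below. The abstract does not give the formula, so this assumes the temperature sharpening commonly used in semi-supervised learning (raise each probability to the power 1/T and renormalize) applied to the average prediction; the function names and the temperature are illustrative.

```python
def average_predictions(predictions):
    """Average class-probability vectors produced under different augmentations."""
    count = len(predictions)
    return [sum(column) / count for column in zip(*predictions)]


def sharpen(probabilities, temperature=0.5):
    """Lower the entropy of a distribution; temperature < 1 favors the top class."""
    powered = [p ** (1.0 / temperature) for p in probabilities]
    total = sum(powered)
    return [p / total for p in powered]


# Two hypothetical augmented views of the same unlabeled node.
views = [[0.7, 0.3], [0.5, 0.5]]
fused = average_predictions(views)   # roughly [0.6, 0.4]
target = sharpen(fused)              # more confident than the plain average
```

A sharpened vector like `target` would then serve as the reference distribution in the consistency regularization term for unlabeled data.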
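The high volume of increasingly sophisticated cyber threats is drawing growing attention to cybersecurity, where many challenges remain unresolved. Namely, intrusion detection needs new algorithms that are more robust, more effective, and able to use more information. Moreover, the intrusion detection task faces a serious challenge associated with the extreme class imbalance between normal and malicious traffic. Recently, graph neural networks (GNNs) have achieved state-of-the-art performance in modeling network topology for cybersecurity tasks. However, only a few works use GNNs to tackle the intrusion detection problem. Besides, other promising avenues, such as applying attention mechanisms, remain under-explored. This paper presents two graph-based solutions for intrusion detection, the modified E-GraphSAGE and E-ResGAT algorithms, which rely on the established GraphSAGE and graph attention network (GAT), respectively. The key idea is to integrate residual learning into GNNs that leverage the available graph information. Residual connections are added as a strategy to deal with the high class imbalance, aiming to retain the original information and improve performance on the minority classes. An extensive experimental evaluation on four recent intrusion detection datasets shows the excellent performance of our approaches, especially when predicting minority classes.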
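The role of a residual connection, retaining the original information alongside the aggregated neighborhood, can be sketched with a single layer. This is not E-GraphSAGE or E-ResGAT: plain mean aggregation stands in for the GraphSAGE and GAT aggregators, learned weights are omitted, and the function name and toy graph are hypothetical.

```python
def residual_mean_layer(features, neighbors):
    """Mean-aggregate neighbor features, then add the node's own input back.

    features: dict mapping node -> list of floats
    neighbors: dict mapping node -> list of neighbor nodes (may omit isolated nodes)
    """
    out = {}
    for node, own in features.items():
        linked = neighbors.get(node, [])
        if linked:
            aggregated = [sum(features[n][i] for n in linked) / len(linked)
                          for i in range(len(own))]
        else:
            aggregated = [0.0] * len(own)
        # Residual connection: the original features bypass the aggregation.
        out[node] = [a + x for a, x in zip(aggregated, own)]
    return out


toy_features = {"a": [1.0], "b": [3.0], "c": [5.0]}
layer_output = residual_mean_layer(toy_features, {"a": ["b", "c"]})
```

Because the input is added back unchanged, a node whose neighborhood is dominated by the majority class still keeps its own signal, which matches the motivation given for minority classes.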
Nowadays, Multi-purpose Messaging Mobile Apps (MMMAs) have become increasingly prevalent. MMMAs attract fraudsters, and some cybercriminals provide support for frauds via black market accounts (BMAs). Compared to fraudsters, BMAs are not directly involved in frauds and are more difficult to detect. This paper illustrates our BMA detection system SGRL (Self-supervised Graph Representation Learning) used in WeChat, a representative MMMA with over a billion users. We tailor Graph Neural Network and Graph Self-supervised Learning in SGRL for BMA detection. The workflow of SGRL contains a pretraining phase that utilizes structural information, node attribute information and available human knowledge, and a lightweight detection phase. In offline experiments, SGRL outperforms state-of-the-art methods by 16.06%-58.17% on offline evaluation measures. We deploy SGRL in the online environment to detect BMAs on the billion-scale WeChat graph, and it exceeds the alternative by 7.27% on the online evaluation measure. In conclusion, SGRL can alleviate label reliance, generalize well to unseen data, and effectively detect BMAs in WeChat.