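Graph contrastive learning (GCL) improves graph representation learning, leading to state-of-the-art results on various downstream tasks. The graph augmentation step is a vital but scarcely studied part of GCL. In this paper, we show that the node embeddings obtained via graph augmentation are highly biased, which somewhat limits contrastive models from learning discriminative features for downstream tasks. We therefore perform augmentation on the hidden features instead (feature augmentation). Inspired by so-called matrix sketching, we propose Costa, a novel covariance-preserving feature-space augmentation framework for GCL, which generates augmented features by maintaining a "good sketch" of the original features. To highlight the advantages of feature augmentation with Costa, we also investigate a single-view setting (in addition to the multi-view one) that saves memory and computation. We show that feature augmentation with Costa achieves results comparable to or better than those of graph-augmentation-based models.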
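The matrix-sketching idea mentioned in this abstract can be illustrated with a generic Gaussian sketch. This is a minimal illustration under stated assumptions, not the Costa implementation: the function name `sketch_features`, the choice of a Gaussian projection, and the sizes are all hypothetical. It shows that a random projection with variance-1/k entries keeps the uncentered second-moment matrix of the features approximately intact.

```python
import numpy as np


def sketch_features(h, k, rng):
    """Compress the n rows of h (n x d) into k rows with a Gaussian sketch.

    Entries of the projection have variance 1/k, so the sketch s satisfies
    E[s.T @ s] = h.T @ h, i.e. second-order statistics are kept on average.
    (Hypothetical helper; a generic sketch, not the paper's exact operator.)
    """
    n = h.shape[0]
    proj = rng.normal(loc=0.0, scale=1.0 / np.sqrt(k), size=(k, n))
    return proj @ h


rng = np.random.default_rng(0)
h = rng.normal(size=(500, 8))  # stand-in for hidden node features
s = sketch_features(h, k=400, rng=rng)

# Relative Frobenius error between the sketched and original Gram matrices.
gram_h = h.T @ h
rel_err = np.linalg.norm(s.T @ s - gram_h) / np.linalg.norm(gram_h)
print(s.shape, float(rel_err))
```

Drawing a fresh projection at every training step would give a different augmented feature matrix each time while keeping the same expected second moments.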
Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and then maximize the agreement of representations in the two views. Despite the prosperous development of graph CL methods, the design of graph augmentation schemes, a crucial component in CL, remains rarely explored. We argue that data augmentation schemes should preserve the intrinsic structures and attributes of graphs, which forces the model to learn representations that are insensitive to perturbation of unimportant nodes and edges. However, most existing methods adopt uniform data augmentation schemes, such as uniformly dropping edges and uniformly shuffling features, leading to suboptimal performance. In this paper, we propose a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors for topological and semantic aspects of the graph. Specifically, on the topology level, we design augmentation schemes based on node centrality measures to highlight important connective structures. On the node attribute level, we corrupt node features by adding more noise to unimportant node features, forcing the model to recognize the underlying semantic information. We perform extensive node classification experiments on a variety of real-world datasets. Experimental results demonstrate that our proposed method consistently outperforms existing state-of-the-art baselines and even surpasses some supervised counterparts, which validates the effectiveness of the proposed contrastive framework with adaptive augmentation.
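A centrality-aware edge-dropping scheme of the kind described in this abstract might look like the sketch below. The abstract does not give the formula, so everything here is an assumption: degree is used as the centrality measure, the edge score is the log of the mean endpoint degree, and the names `edge_drop_probs`, `p_e`, and `p_cap` are hypothetical. The point is only that edges between high-centrality nodes receive a lower drop probability than peripheral edges.

```python
import numpy as np


def edge_drop_probs(edges, num_nodes, p_e=0.3, p_cap=0.7):
    """Per-edge drop probabilities that shrink as edge importance grows.

    Importance is the log of the mean degree of the two endpoints (an assumed
    centrality measure). p_e scales the overall drop rate and p_cap truncates
    the probabilities so that no edge is dropped almost surely.
    """
    edges = np.asarray(edges)
    deg = np.bincount(edges.ravel(), minlength=num_nodes)
    score = np.log((deg[edges[:, 0]] + deg[edges[:, 1]]) / 2.0)
    spread = score.max() - score.mean()
    if np.isclose(spread, 0.0):  # all edges equally important: uniform drop
        return np.full(len(edges), min(p_e, p_cap))
    probs = (score.max() - score) / spread * p_e
    return np.minimum(probs, p_cap)


# Node 0 is a hub; edge (3, 4) hangs off the periphery.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
probs = edge_drop_probs(edges, num_nodes=5)

# Sample one augmented view: keep an edge when its random draw clears its probability.
rng = np.random.default_rng(0)
kept = [e for e, p in zip(edges, probs) if rng.random() >= p]
print(probs, kept)
```

In this toy graph the edge (0, 3), which joins the two highest-degree nodes, is never dropped, while the peripheral edge (3, 4) has the highest drop probability.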
Although augmentations (e.g., perturbation of graph edges, image crops) boost the efficiency of Contrastive Learning (CL), feature-level augmentation is another plausible, complementary, yet not well researched strategy. Thus, we present a novel spectral feature augmentation for contrastive learning on graphs (and images). To this end, for each data view, we estimate a low-rank approximation per feature map and subtract that approximation from the map to obtain its complement. This is achieved by the incomplete power iteration proposed herein, a non-standard power iteration regime that enjoys two valuable byproducts (under merely one or two iterations): (i) it partially balances the spectrum of the feature map, and (ii) it injects noise into the rebalanced singular values of the feature map (spectral augmentation). For two views, we align these rebalanced feature maps, as such an improved alignment step can focus more on the less dominant singular values of the matrices of both views, whereas the spectral augmentation does not affect the spectral angle alignment (singular vectors are not perturbed). We derive the analytical form for: (i) the incomplete power iteration, to capture its spectrum-balancing effect, and (ii) the variance of the singular values augmented implicitly by the noise. We also show that the spectral augmentation improves the generalization bound. Experiments on graph/image datasets show that our spectral feature augmentation outperforms baselines, is complementary to other augmentation strategies, and is compatible with various contrastive losses.
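The "estimate a low-rank approximation and subtract it" step can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: the function name `spectral_feature_augmentation`, the rank-one form of the approximation, and the toy data are assumptions. A random vector is pushed through only a couple of power-iteration steps, and the feature map's component along the resulting direction is removed.

```python
import numpy as np


def spectral_feature_augmentation(h, num_iters, rng):
    """Subtract a rank-one estimate of h obtained by a truncated power iteration.

    r starts as a random unit vector and is multiplied by h.T @ h only
    num_iters times, so it is merely an approximation of the leading right
    singular vector; the random start is where the noise comes from.
    """
    r = rng.normal(size=h.shape[1])
    r /= np.linalg.norm(r)
    for _ in range(num_iters):
        r = h.T @ (h @ r)
        r /= np.linalg.norm(r)
    # Remove the component of every row of h along direction r.
    h_aug = h - np.outer(h @ r, r)
    return h_aug, r


rng = np.random.default_rng(0)
h = rng.normal(size=(100, 6))
h[:, 0] *= 10.0  # give the feature map one dominant direction
h_aug, r = spectral_feature_augmentation(h, num_iters=2, rng=rng)

sv_before = np.linalg.svd(h, compute_uv=False)
sv_after = np.linalg.svd(h_aug, compute_uv=False)
print(sv_before[0], sv_after[0])
```

Because the subtraction is a projection onto the complement of `r`, the largest singular value cannot grow, which is one way to see the spectrum-balancing effect on a map with a dominant direction.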
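Contrastive learning is an effective unsupervised method in graph representation learning, and its key component lies in the construction of positive and negative samples. Previous methods usually use the proximity of nodes in the graph as the guiding principle. Recently, contrastive learning based on data augmentation has shown great power in the visual domain, and some works have extended this approach from images to graphs. However, unlike data augmentation on images, data augmentation on graphs is far less intuitive, and it is much harder to obtain high-quality contrastive samples, which leaves much room for improvement. In this work, by introducing an adversarial graph view for data augmentation, we propose a simple but effective method, Adversarial Graph Contrastive Learning (ARIEL), to extract informative contrastive samples within reasonable constraints. We develop a new technique called information regularization for stable training and use subgraph sampling for scalability. We generalize our method from node-level to graph-level contrastive learning by treating each graph instance as a supernode. ARIEL consistently outperforms current graph contrastive learning methods on both node-level and graph-level classification tasks on real-world datasets. We further demonstrate that ARIEL is more robust in the face of adversarial attacks.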
Contrastive learning methods based on the InfoNCE loss are popular in node representation learning tasks on graph-structured data. However, their reliance on data augmentation and their quadratic computational complexity can lead to inconsistency and inefficiency problems. To mitigate these limitations, we introduce a simple yet effective contrastive model named Localized Graph Contrastive Learning (Local-GCL for short). Local-GCL consists of two key designs: 1) we construct the positive examples for each node directly from its first-order neighbors, which frees our method from the reliance on carefully designed graph augmentations; 2) to improve the efficiency of contrastive learning on graphs, we devise a kernelized contrastive loss, which can be approximately computed in linear time and space complexity with respect to the graph size. We provide theoretical analysis to justify the effectiveness and rationality of the proposed methods. Experiments on datasets with different scales and properties demonstrate that, in spite of its simplicity, Local-GCL achieves quite competitive performance in self-supervised node representation learning tasks.
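To see how a kernelized contrastive term can drop from quadratic to linear cost in the number of nodes, the sketch below approximates the negative-sample sums of exp(z_i · z_j) with positive random features. This is one standard kernel approximation chosen for illustration; whether Local-GCL uses this particular feature map is an assumption, and the names `positive_random_features`, `w`, and the sizes are hypothetical.

```python
import numpy as np


def positive_random_features(x, w):
    """Feature map phi with E[phi(x) @ phi(y)] = exp(x @ y) for rows w ~ N(0, I)."""
    half_sq_norm = 0.5 * np.sum(x * x, axis=1, keepdims=True)
    return np.exp(x @ w.T - half_sq_norm) / np.sqrt(w.shape[0])


rng = np.random.default_rng(0)
z = rng.normal(size=(64, 4))
z /= np.linalg.norm(z, axis=1, keepdims=True)  # unit-norm node embeddings
w = rng.normal(size=(4096, 4))                 # random projection directions

phi = positive_random_features(z, w)

# Linear in the number of nodes: sum the features once, then one dot product per node.
approx_sums = phi @ phi.sum(axis=0)

# Quadratic reference: every pairwise similarity is materialized.
exact_sums = np.exp(z @ z.T).sum(axis=1)

max_rel_err = np.max(np.abs(approx_sums - exact_sums) / exact_sums)
print(float(max_rel_err))
```

The approximate sums never form the N x N similarity matrix, so memory and time grow with N times the number of random features rather than with N squared.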
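Contrastive learning is an effective unsupervised method in graph representation learning. Recently, contrastive learning based on data augmentation has been extended from images to graphs. However, most prior works are adapted directly from models designed for images. Unlike data augmentation on images, data augmentation on graphs is far less intuitive, and it is much harder to obtain high-quality contrastive samples, which are key to the performance of contrastive learning models. This leaves much room for improvement over existing graph contrastive learning frameworks. In this work, by introducing an adversarial graph view and an information regularizer, we propose a simple but effective method, Adversarial Graph Contrastive Learning (ARIEL), to extract informative contrastive samples within reasonable constraints. It consistently outperforms current graph contrastive learning methods on node classification tasks over various real-world datasets and further improves the robustness of graph contrastive learning.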
Inspired by the impressive success of contrastive learning (CL), a variety of graph augmentation strategies have been employed to learn node representations in a self-supervised manner. Existing methods construct the contrastive samples by adding perturbations to the graph structure or node attributes. Although they achieve impressive results, they are rather blind to the wealth of prior information available: as the degree of perturbation applied to the original graph increases, 1) the similarity between the original graph and the generated augmented graph gradually decreases; 2) the discrimination between all nodes within each augmented view gradually increases. In this paper, we argue that both kinds of prior information can be incorporated (differently) into the contrastive learning paradigm following our general ranking framework. In particular, we first interpret CL as a special case of learning to rank (L2R), which inspires us to leverage the ranking order among positive augmented views. Meanwhile, we introduce a self-ranking paradigm to ensure that the discriminative information among different nodes is maintained and is less affected by perturbations of different degrees. Experimental results on various benchmark datasets verify the effectiveness of our algorithm compared with supervised and unsupervised models.
Existing graph contrastive learning methods rely on augmentation techniques based on random perturbations (e.g., randomly adding or dropping edges and nodes). Nevertheless, altering certain edges or nodes can unexpectedly change the graph characteristics, and choosing the optimal perturbation ratio for each dataset requires onerous manual tuning. In this paper, we introduce Implicit Graph Contrastive Learning (iGCL), which utilizes augmentations in the latent space learned by a Variational Graph Auto-Encoder that reconstructs the graph's topological structure. Importantly, instead of explicitly sampling augmentations from latent distributions, we further propose an upper bound for the expected contrastive loss to improve the efficiency of our learning algorithm. Thus, graph semantics can be preserved within the augmentations in an intelligent way without arbitrary manual design or prior human knowledge. Experimental results on both graph-level and node-level tasks show that the proposed method achieves state-of-the-art performance compared to other benchmarks, and ablation studies demonstrate the effectiveness of the modules in iGCL.
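Graph contrastive learning (GCL) has been an emerging solution for graph self-supervised learning. The core principle of GCL is to reduce the distance between samples in positive views while increasing the distance between samples in negative views. While achieving promising performance, current GCL methods still suffer from two limitations: (1) uncontrollable validity of augmentation, in that graph perturbation may produce invalid views that go against the semantics and the feature-topology correspondence of graph data; and (2) unreliable binary contrastive justification, in that the positiveness and negativeness of the constructed views are difficult to determine for non-Euclidean graph data. To tackle these limitations, we propose a new contrastive learning paradigm, graph soft-contrastive learning (GSCL), which conducts contrastive learning at a finer granularity by ranking neighborhoods, without any augmentation or binary contrastive justification. GSCL is built on the fundamental assumption of graph proximity, namely that connected neighbors are more similar than distant nodes. Specifically, we use pairwise and listwise gated ranking to preserve the relative ranking relationships within the neighborhood. Moreover, since the neighborhood size grows exponentially as more hops are considered, we propose neighborhood sampling strategies to improve learning efficiency. Extensive experimental results show that the proposed GSCL consistently achieves state-of-the-art performance on various public datasets with complexity comparable to that of GCL.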
Contrastive learning (CL), which can extract the information shared between different contrastive views, has become a popular paradigm for vision representation learning. Inspired by its success in computer vision, recent work introduces CL into graph modeling, dubbed graph contrastive learning (GCL). However, generating contrastive views in graphs is more challenging than in images, since we have little prior knowledge of how to significantly augment a graph without changing its labels. We argue that typical data augmentation techniques (e.g., edge dropping) in GCL cannot generate contrastive views diverse enough to filter out noise. Moreover, previous GCL methods employ two view encoders with exactly the same neural architecture and tied parameters, which further harms the diversity of augmented views. To address this limitation, we propose a novel paradigm named model augmented GCL (MA-GCL), which focuses on manipulating the architectures of view encoders instead of perturbing graph inputs. Specifically, we present three easy-to-implement model augmentation tricks for GCL, namely asymmetric, random, and shuffling, which respectively help alleviate high-frequency noise, enrich training instances, and bring safer augmentations. All three tricks are compatible with typical data augmentations. Experimental results show that MA-GCL can achieve state-of-the-art performance on node classification benchmarks by applying the three tricks to a simple base model. Extensive studies also validate our motivation and the effectiveness of each trick. (Code, data and appendix are available at https://github.com/GXM1141/MA-GCL.)
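Although graph representation learning (GRL) has made significant progress, extracting and embedding rich topological structure and feature information adequately remains a challenge. Most existing methods focus on local structure and fail to fully incorporate the global topological structure. To this end, we propose a novel Structure-Preserving Graph Representation Learning (SPGRL) method to fully capture the structural information of graphs. Specifically, to reduce the uncertainty and misinformation of the original graph, we construct a feature graph as a complementary view via the k-nearest neighbor method. The feature graph can be used for node-level contrast to capture local relations. In addition, we retain global topological structure information by maximizing the mutual information (MI) between the whole graph and the feature embeddings, which theoretically reduces to exchanging the feature embeddings of the feature graph and the original graph so that each reconstructs itself. Extensive experiments show that our method achieves superior performance on semi-supervised node classification and excellent robustness under noise perturbation of the graph structure or node features.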
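Building a feature graph with the k-nearest neighbor method, as this abstract describes, can be sketched in a few lines. The similarity measure is not specified in the abstract, so cosine similarity is an assumption here, as are the helper name `knn_feature_graph` and the toy features; rows are assumed to be non-zero so that normalization is well defined.

```python
import numpy as np


def knn_feature_graph(x, k):
    """Return, for each node, the indices of its k most similar nodes.

    Similarity is cosine similarity between feature rows (an assumed choice).
    Self-similarity is masked out so that no node selects itself.
    """
    x_unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x_unit @ x_unit.T
    np.fill_diagonal(sim, -np.inf)
    return np.argsort(-sim, axis=1)[:, :k]


# Two groups of nodes with nearly parallel features.
x = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
neighbors = knn_feature_graph(x, k=1)
print(neighbors.ravel())
```

Each row of `neighbors` lists the directed feature-graph edges leaving that node; symmetrizing them would give an undirected complementary view.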
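Although machine learning on hypergraphs has attracted considerable attention, most works have focused on (semi-)supervised learning, which may cause heavy labeling costs and poor generalization. Recently, contrastive learning has emerged as a successful unsupervised representation learning method. Despite the prosperous development of contrastive learning in other domains, contrastive learning on hypergraphs remains little explored. In this paper, we propose Tricon (Tri-directional Contrastive Learning), a general framework for contrastive learning on hypergraphs. Its main idea is tri-directional contrast: specifically, it aims to maximize, across two augmented views, the agreement (a) between the same node, (b) between the same group of nodes, and (c) between each group and its members. Together with simple but surprisingly effective data augmentation and negative sampling schemes, these three forms of contrast enable Tricon to capture both microscopic and mesoscopic structural information in node embeddings. Our extensive experiments using 13 baseline methods, 5 datasets, and two tasks demonstrate the effectiveness of Tricon; most noticeably, Tricon consistently outperforms not only unsupervised competitors but also (semi-)supervised competitors, mostly by significant margins, on node classification.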
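Self-supervised learning provides a promising path towards eliminating the need for costly label information in representation learning on graphs. However, to achieve state-of-the-art performance, methods often need large numbers of negative examples and rely on complex augmentations. This can be prohibitively expensive, especially for large graphs. To address these challenges, we introduce Bootstrapped Graph Latents (BGRL), a graph representation learning method that learns by predicting alternative augmentations of the input. BGRL uses only simple augmentations and alleviates the need for contrasting with negative examples, and is thus scalable by design. BGRL outperforms or matches prior methods on several established benchmarks while achieving a 2-10x reduction in memory costs. Furthermore, we show that BGRL can be scaled up to extremely large graphs with hundreds of millions of nodes in the semi-supervised regime, achieving state-of-the-art performance and improving over supervised baselines, where representations are shaped only through label information. In particular, our solution centered on BGRL was one of the winning entries in the Open Graph Benchmark Large-Scale Challenge at KDD Cup 2021, on a graph orders of magnitude larger than all previously available benchmarks, thus demonstrating the scalability and effectiveness of our approach.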
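Learning by prediction without negative examples is commonly implemented with an online network, a slowly updated target network, and a cosine-similarity objective. The sketch below shows those two ingredients in isolation. The abstract itself does not spell out these details, so the exponential-moving-average target, the loss form, and the names `bootstrap_loss`, `ema_update`, and `decay` are assumptions borrowed from bootstrapping-style methods in general.

```python
import numpy as np


def bootstrap_loss(pred, target):
    """Mean of 2 - 2 * cosine(pred_i, target_i); needs no negative pairs."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    target = target / np.linalg.norm(target, axis=1, keepdims=True)
    return float(np.mean(2.0 - 2.0 * np.sum(pred * target, axis=1)))


def ema_update(target_params, online_params, decay=0.99):
    """Move each target parameter a small step towards its online counterpart."""
    return {
        name: decay * target_params[name] + (1.0 - decay) * online_params[name]
        for name in target_params
    }


# Toy parameters: the target lags behind the online network.
online = {"w": np.ones((2, 2))}
target = {"w": np.zeros((2, 2))}
target = ema_update(target, online, decay=0.99)

# Toy embeddings: identical views give zero loss, opposite views the maximum.
z = np.array([[1.0, 2.0], [3.0, 4.0]])
same_view_loss = bootstrap_loss(z, z)
opposite_loss = bootstrap_loss(z, -z)
print(target["w"][0, 0], same_view_loss, opposite_loss)
```

Because the loss only compares each node with its own counterpart in the other view, its cost is linear in the number of nodes, which is consistent with the memory savings the abstract reports.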
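Graph representation learning (GRL) is critical for graph-structured data analysis. However, most existing graph neural networks (GNNs) rely heavily on label information, which is usually expensive to obtain in the real world. Existing unsupervised GRL methods suffer from certain limitations, such as heavy reliance on monotone contrastiveness and limited scalability. To overcome these problems, and in light of recent advances in graph contrastive learning, we introduce a novel self-supervised graph representation learning algorithm, G-Zoom, which learns node representations by leveraging the proposed adjusted zooming scheme. Specifically, this mechanism enables G-Zoom to explore and extract self-supervision signals from a graph at multiple scales: micro (i.e., node level), meso (i.e., neighborhood level), and macro (i.e., subgraph level). First, we generate two augmented views of the input graph via two different graph augmentations. Then, we progressively establish three different kinds of contrastiveness over the above three scales, from the node level through the neighborhood level to the subgraph level, maximizing the agreement between graph representations across scales. While valuable clues can be extracted from a given graph from the micro and macro perspectives, the neighborhood-level contrastiveness, based on our adjusted zooming scheme, offers a customizable option to manually choose an optimal viewpoint between the micro and macro perspectives to better understand the graph data. In addition, to make our model scalable to large graphs, we employ a parallel graph diffusion approach to decouple model training from the graph size. We conduct extensive experiments on real-world datasets, and the results show that our proposed model consistently outperforms state-of-the-art methods.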
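Self-supervised learning (especially contrastive learning) methods on heterogeneous graphs can effectively remove the dependence on supervisory data. Meanwhile, most existing representation learning methods embed heterogeneous graphs into a single geometric space, either Euclidean or hyperbolic. Such a single geometric view is usually not enough to observe the complete picture of heterogeneous graphs because of their rich semantics and complex structures. Based on these observations, this paper proposes a novel self-supervised learning method, termed Geometry Contrastive Learning (GCL), to better represent heterogeneous graphs when supervisory data is unavailable. GCL views a heterogeneous graph from the Euclidean and hyperbolic perspectives simultaneously, aiming to strongly merge the abilities to model rich semantics and complex structures, which is expected to bring more benefits to downstream tasks. GCL maximizes the mutual information between the two geometric views by contrasting representations at both the local-local and local-global semantic levels. Extensive experiments on four benchmark datasets show that the proposed approach outperforms strong baselines, including both unsupervised and supervised methods, on three tasks: node classification, node clustering, and similarity search.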
Generalizable, transferrable, and robust representation learning on graph-structured data remains a challenge for current graph neural networks (GNNs). Unlike what has been developed for convolutional neural networks (CNNs) for image data, self-supervised learning and pre-training are less explored for GNNs. In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data. We first design four types of graph augmentations to incorporate various priors. We then systematically study the impact of various combinations of graph augmentations on multiple datasets in four different settings: semi-supervised learning, unsupervised learning, transfer learning, and adversarial attacks. The results show that, even without tuning augmentation extents or using sophisticated GNN architectures, our GraphCL framework can produce graph representations of similar or better generalizability, transferrability, and robustness compared to state-of-the-art methods. We also investigate the impact of parameterized graph augmentation extents and patterns, and observe further performance gains in preliminary experiments. Our code is available at: https://github.com/Shen-Lab/GraphCL.
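Deep learning on graphs has attracted significant interest recently. However, most works have focused on (semi-)supervised learning, resulting in shortcomings including heavy label reliance, poor generalization, and weak robustness. To address these issues, self-supervised learning (SSL), which extracts informative knowledge through well-designed pretext tasks without relying on manual labels, has become a promising and trending learning paradigm for graph data. Unlike SSL in other domains such as computer vision and natural language processing, SSL on graphs has its own background, design ideas, and taxonomies. Under the umbrella of graph self-supervised learning, we present a timely and comprehensive review of existing approaches that employ SSL techniques for graph data. We construct a unified framework that mathematically formalizes the paradigm of graph SSL. According to the objectives of the pretext tasks, we divide these approaches into four categories: generation-based, auxiliary property-based, contrast-based, and hybrid approaches. We further describe the applications of graph SSL across various research fields and summarize the commonly used datasets, evaluation benchmarks, performance comparisons, and open-source code for graph SSL. Finally, we discuss the remaining challenges and potential future directions in this research field.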
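Inspired by the recent success of self-supervised methods applied to images, self-supervised learning on graph-structured data has seen rapid growth, especially around augmentation-based contrastive methods. However, we argue that without carefully designed augmentation techniques, augmentations on graphs may behave arbitrarily, in that the underlying semantics of graphs can change drastically. As a consequence, the performance of existing augmentation-based methods is highly dependent on the choice of augmentation scheme, i.e., the hyperparameters associated with augmentations. In this paper, we propose a novel augmentation-free self-supervised learning framework for graphs, named AFGRL. Specifically, we generate an alternative view of a graph by discovering nodes that share local structural information and global semantics with the graph. Extensive experiments on various node-level tasks, i.e., node classification, clustering, and similarity search, over various datasets demonstrate the superiority of AFGRL. The source code for AFGRL is available at https://github.com/namkyeong/afgrl.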
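With the rise of contrastive learning, unsupervised graph representation learning has been booming recently, even surpassing its supervised counterparts in some machine learning tasks. Most existing contrastive models for graph representation learning either focus on maximizing the mutual information between local and global embeddings or depend primarily on contrasting embeddings at the node level. However, they are still insufficient for comprehensively exploring the local and global views of network topology. Although the former considers the local-global relationship, its coarse global information leads to poor cooperation between the local and global views. The latter focuses on node-level alignment, so the role of the global view becomes inconspicuous. To avoid falling into these two extremes, we propose a novel unsupervised graph representation model based on contrasting cluster assignments, called GCCA. By combining clustering algorithms with contrastive learning, it is designed to make comprehensive use of local and global information. This not only strengthens the contrastive effect but also provides higher-quality graph information. Meanwhile, GCCA further mines cluster-level information, which gives it insight into elusive associations between nodes beyond the graph topology. Specifically, we first generate two augmented graphs with different graph augmentation strategies and then employ clustering algorithms to obtain their cluster assignments and prototypes, respectively. GCCA further compels the same node in the different augmented graphs to recognize its cluster assignment in the other view by minimizing a cross-entropy loss. To demonstrate its effectiveness, we compare it with state-of-the-art models on three different downstream tasks. The experimental results show that GCCA is strongly competitive on most tasks.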
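Unsupervised graph representation learning is a non-trivial topic for graph data. The success of contrastive learning and self-supervised learning in the unsupervised representation learning of structured data has inspired similar attempts on graphs. Current unsupervised graph representation learning and pre-training with a contrastive loss are mainly based on the contrast between handcrafted augmented graph data. However, graph data augmentation is still not well explored because of its unpredictable invariance. In this paper, we propose a novel collaborative graph neural network contrastive learning framework (CGCL), which uses multiple graph encoders to observe the graph. Features observed from different views act as the graph augmentation for contrastive learning between graph encoders, avoiding any perturbation and thus guaranteeing invariance. CGCL can handle both graph-level and node-level representation learning. Extensive experiments demonstrate the advantages of CGCL in unsupervised graph representation learning and show that handcrafted compositions of data augmentations are not necessary for graph representation learning.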