Graph neural networks (GNNs) have recently emerged as a promising paradigm for learning from graph-structured data and have demonstrated wide success across domains such as recommendation systems, social networks, and electronic design automation (EDA). Like other deep learning (DL) methods, GNNs are being deployed on sophisticated modern hardware systems as well as dedicated accelerators. However, despite the popularity of GNNs and recent efforts to bring GNNs to hardware, their fault tolerance and resilience have generally been overlooked. Inspired by the inherent algorithmic resilience of DL methods, this paper conducts, for the first time, a large-scale empirical study of GNN resilience, aiming to understand the relationship between hardware faults and GNN accuracy. By developing a customized fault injection tool on top of PyTorch, we perform extensive fault injection experiments on various GNN models and application datasets. We observe that the error resilience of GNN models varies by orders of magnitude across models and application datasets. Further, we explore a low-cost error mitigation mechanism for GNNs to enhance their resilience. This GNN resilience study aims to open up new directions and opportunities for future GNN accelerator design and architectural optimization.
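To make the notion of a hardware fault concrete, the following minimal sketch flips a single bit in the IEEE-754 float32 encoding of one weight, a commonly used single-bit-flip fault model. It is an illustration in plain Python only: the helper name `flip_bit` and the sample weight are assumptions made here, and the sketch is not the PyTorch-based tool the abstract refers to.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as float32, with one bit of its encoding flipped.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    (Hypothetical helper, for illustration only.)
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    raw ^= 1 << bit  # inject the fault
    # Reinterpret the corrupted pattern as a float32 again.
    (faulty,) = struct.unpack("<f", struct.pack("<I", raw))
    return faulty


weight = 0.5  # assumed example weight
for bit in (0, 23, 30, 31):
    print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)}")
```

A flip in a low mantissa bit barely changes the value, whereas a flip in a high exponent bit can change it by many orders of magnitude, which is one intuition for why measured resilience can vary so widely.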
As interest in Graph Neural Networks (GNNs) grows, benchmarking and performance characterization studies of GNNs are becoming increasingly important. So far, many studies have investigated and presented the performance and computational efficiency of GNNs. However, this work has been carried out using only a few high-level GNN frameworks. Although these frameworks provide ease of use, they carry too many dependencies on other existing libraries. The layers of implementation details and the dependencies complicate the performance analysis of GNN models built on top of these frameworks, especially when using architectural simulators. Furthermore, different approaches to GNN computation are generally overlooked in prior characterization studies, and only one of the common computational models is evaluated. Based on these shortcomings and needs, we developed a benchmark suite that is framework independent, supports versatile computational models, is easily configurable, and can be used with architectural simulators without additional effort. Our benchmark suite, which we call gSuite, uses only the hardware vendor's libraries and is therefore independent of any other frameworks. gSuite enables detailed performance characterization studies of GNN inference using both contemporary GPU profilers and architectural GPU simulators. To illustrate the benefits of our new benchmark suite, we perform a detailed characterization study of a set of well-known GNN models on various datasets, running gSuite both on a real GPU card and on a timing-detailed GPU simulator. We also demonstrate the effect of computational models on performance. We use several evaluation metrics to rigorously measure the performance of GNN computation.
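Graph neural network (GNN)-based fault diagnosis (FD) has received increasing attention in recent years, because data from several application domains can be advantageously represented as graphs. Indeed, compared with traditional FD approaches, this particular form of representation has led to superior performance. This review gives an accessible introduction to GNNs, their potential applications in the field of fault diagnosis, and future perspectives. First, neural network-based FD methods are reviewed with a focus on their data representations, namely time series, images, and graphs. Second, the basic principles and principal architectures of GNNs are introduced, with attention to graph convolutional networks, graph attention networks, graph sample and aggregate, graph auto-encoders, and spatial-temporal graph convolutional networks. Third, the most relevant GNN-based fault diagnosis methods are validated through detailed experiments, and the conclusion is that GNN-based methods can achieve good fault diagnosis performance. Finally, discussions and future challenges are provided.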
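Research on deep neural networks (DNNs) has focused on improving performance and accuracy for real-world deployments, leading to new models, such as spiking neural networks (SNNs), and to optimization techniques, such as quantization and pruning for compressed networks. However, deploying these innovative models and optimization techniques introduces possible reliability issues, and reliability is a pillar for the wide use of DNNs in safety-critical applications such as autonomous driving. Moreover, scaling technology nodes carry the associated risk of multiple faults occurring at the same time, which is not addressed in state-of-the-art resilience analyses. Towards better reliability analysis for DNNs, we present enpheeph, a fault injection framework for spiking and compressed DNNs. The enpheeph framework enables optimized execution on specialized hardware devices, such as GPUs, while providing complete customizability to investigate different fault models, emulating various reliability constraints and use cases. Hence, faults can be injected into SNNs as well as compressed networks with minimal modifications to the underlying code, a feat that other state-of-the-art tools cannot achieve. To evaluate our enpheeph framework, we analyze the resilience of different DNN and SNN models under different compression techniques. By injecting a random and increasing number of faults, we show that DNNs can exhibit a reduction in accuracy at a fault rate as low as 7 x 10^(-7) faults per parameter, with an accuracy drop higher than 40%. The run-time overhead of enpheeph is less than 20% of the baseline execution time when executing 100 000 faults concurrently, at least 10x lower than state-of-the-art frameworks, making enpheeph future-proof for complex fault injection scenarios. We release enpheeph at https://github.com/alexei95/enpheeph.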
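Graph neural networks (GNNs) rely on the graph structure to define an aggregation strategy in which each node updates its representation by combining information from its neighbours. A known limitation of GNNs is that, as the number of layers increases, information gets smoothed and squashed and node embeddings become indistinguishable, negatively affecting performance. Therefore, practical GNN models employ few layers and can only leverage the graph structure within a limited neighbourhood around each node. Inevitably, practical GNNs do not capture information that depends on the global structure of the graph. While several works have studied the limitations and expressivity of GNNs, the question of whether practical applications on graph-structured data require global structural knowledge remains unanswered. In this work, we empirically address this question by giving several GNN models access to global information and observing the impact on downstream performance. Our results show that global information can in fact provide significant benefits for common graph-related tasks. We further identify a novel regularization strategy that leads to an average accuracy improvement of more than 5% on all considered tasks.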
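The smoothing effect mentioned in this abstract can be illustrated with a toy example. The sketch below repeatedly applies a parameter-free mean aggregation (each node averages its own feature with those of its neighbours) on a four-node path graph and tracks how far apart the node features remain. The graph, the scalar features, and the function names are assumptions chosen for illustration; real GNN layers also apply learned weights and nonlinearities.

```python
def mean_aggregate(features, neighbors):
    """One parameter-free message-passing step: average self and neighbours."""
    updated = []
    for node, nbrs in enumerate(neighbors):
        group = [node] + nbrs
        updated.append(sum(features[j] for j in group) / len(group))
    return updated


def spread(features):
    """Gap between the largest and smallest node feature."""
    return max(features) - min(features)


# Path graph 0 - 1 - 2 - 3 with one scalar feature per node (assumed toy data).
neighbors = [[1], [0, 2], [1, 3], [2]]
features = [0.0, 1.0, 2.0, 3.0]

history = [spread(features)]
for _ in range(50):
    features = mean_aggregate(features, neighbors)
    history.append(spread(features))

print("spread after 0, 2, 10, 50 layers:",
      history[0], history[2], history[10], history[50])
```

The spread shrinks with every additional step, so after many layers the nodes carry almost identical features, which is why stacking layers is not a practical way to reach global information.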
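Graph neural networks (GNNs) have shown great promise in problems that involve graph-structured data. One distinctive feature of GNNs is their flexibility to adapt to multiple problems, which not only leads to wide applicability but also poses important challenges when searching for the best model or acceleration technique for a particular problem. One example of such challenges is that the accuracy or effectiveness of a GNN model or acceleration technique generally depends on the structure of the underlying graph. In this paper, to address the problem of graph-dependent acceleration, we propose ProGNNosis, a data-driven model that can predict the GNN training time of a given GNN model running over a graph of arbitrary characteristics by inspecting the input graph metrics. The prediction is based on a regression previously trained offline using a diverse synthetic graph dataset. In practice, our method allows informed decisions about which design to use for a specific problem. In the paper, the methodology to build ProGNNosis is defined and applied to a specific use case, where it helps decide which graph representation is better. Our results show that ProGNNosis helps achieve an average speedup of 1.22x over randomly selecting a graph representation across multiple widely used GNN models such as GCN, GIN, GAT, and GraphSAGE.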
Recent works have impressively demonstrated that randomly initialized convolutional neural networks (CNNs) contain a subnetwork that, at initialization and without any optimization of the network weights (i.e., as an untrained network), can match the performance of the fully trained dense network. However, the presence of such untrained subnetworks in graph neural networks (GNNs) remains mysterious. In this paper we carry out the first-of-its-kind exploration of discovering matching untrained GNNs. With sparsity as the core tool, we can find untrained sparse subnetworks at initialization that match the performance of fully trained dense GNNs. Beyond this already encouraging finding of comparable performance, we show that the untrained subnetworks we find can substantially mitigate the GNN over-smoothing problem, making them a powerful tool for enabling deeper GNNs without bells and whistles. We also observe that such sparse untrained subnetworks perform appealingly in out-of-distribution detection and in robustness to input perturbations. We evaluate our method across widely used GNN architectures on various popular datasets, including the Open Graph Benchmark (OGB).
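Today, neural networks are the basis of breakthroughs in virtually every technical domain. Their application to accelerators has recently led to better performance and efficiency in these systems. At the same time, the increase in hardware failures caused by the latest (shrunken) semiconductor technology needs to be addressed. Because accelerator systems often back time-critical applications such as self-driving cars or medical diagnosis, these hardware failures must be eliminated. Our research evaluates these failures from a system-level point of view. Based on our results, we identify key findings for enhancing system reliability, and we further propose an efficient method to avoid these failures with minimal hardware overhead.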
Graph neural networks (GNNs) have been widely used in semi-supervised settings. Prior studies have mainly focused on finding appropriate graph filters (e.g., aggregation schemes) that generalize well to both homophilic and heterophilic graphs. Even though these approaches are essential and effective, they still suffer from the sparsity of initial node features inherent in the bag-of-words representation. In semi-supervised learning, where the training samples often fail to cover all dimensions of the graph filters (hyperplanes), this sparsity can precipitate over-fitting of specific dimensions in the first projection matrix. To deal with this problem, we suggest a simple and novel strategy: create additional space by flipping the initial features and the hyperplane simultaneously. Training in both the original and the flipped space can provide precise updates of learnable parameters. To the best of our knowledge, this is the first attempt that effectively moderates the overfitting problem in GNNs. Extensive experiments on real-world datasets demonstrate that the proposed technique improves node classification accuracy by up to 40.2%.
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. The design of strong baselines is an indispensable element of such work. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as a de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose a proper collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions that shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
Graphs are ubiquitous in nature and can therefore serve as models for many practical as well as theoretical problems. For this purpose, they can be defined as many different types that suitably reflect the individual contexts of the represented problem. To address cutting-edge problems based on graph data, the research field of Graph Neural Networks (GNNs) has emerged. Despite the field's youth and the speed at which new models are developed, many recent surveys have been published to keep track of them. Nevertheless, no survey has yet collected which GNNs can process which graph types. In this survey, we give a detailed overview of existing GNNs and, unlike previous surveys, categorize them according to their ability to handle different graph types and properties. We consider GNNs operating on static and dynamic graphs of different structural constitutions, with or without node or edge attributes. Moreover, we distinguish between GNN models for discrete-time and continuous-time dynamic graphs and group the models according to their architecture. We find that there are still graph types that are not covered, or only rarely covered, by existing GNN models. We point out where models are missing and give potential reasons for their absence.
As Deep Neural Networks (DNNs) are increasingly deployed in safety-critical and privacy-sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerant nature of DNNs. Prior work primarily focuses on metrics such as the Failures In Time (FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a device fails. Instead, this paper focuses on quantifying DNN accuracy given that a transient error has occurred, which tells us how well a network behaves when a transient error occurs. We call this metric Resiliency Accuracy (RA). We show that the existing RA formulation is fundamentally inaccurate, because it incorrectly assumes that software variables (model weights/activations) have equal faulty probability under hardware transient faults. We present an algorithm that captures the faulty probabilities of DNN variables under transient faults and thus provides correct RA estimations validated by hardware. To accelerate RA estimation, we reformulate RA calculation as a Monte Carlo integration problem and solve it using importance sampling driven by DNN-specific heuristics. Using our lightweight RA estimation method, we show that transient faults lead to far greater accuracy degradation than what today's DNN resiliency tools estimate. We show how our RA estimation tool can help design more resilient DNNs by integrating it with a Network Architecture Search framework.
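The general idea of estimating a fault-weighted quantity by Monte Carlo integration with importance sampling can be sketched as follows. This is a generic textbook illustration, not the paper's algorithm: the per-variable fault probabilities `p`, the per-variable accuracy drops `drop`, and the proposal distribution `q` are all made-up values, and the estimated quantity (expected accuracy drop under a single fault) is chosen here only for simplicity.

```python
import random


def exact_expectation(p, values):
    """Exact expectation of `values` under distribution `p`."""
    return sum(pi * vi for pi, vi in zip(p, values))


def importance_sampling_estimate(p, values, q, n_samples, rng):
    """Estimate E_p[values] by sampling indices from q and reweighting by p/q."""
    indices = list(range(len(p)))
    draws = rng.choices(indices, weights=q, k=n_samples)
    return sum(values[i] * p[i] / q[i] for i in draws) / n_samples


# Hypothetical, non-uniform probability that a fault lands in each variable.
p = [0.40, 0.25, 0.15, 0.08, 0.05, 0.04, 0.02, 0.01]
# Hypothetical accuracy drop observed when that variable is corrupted.
drop = [0.01, 0.02, 0.37, 0.03, 0.72, 0.00, 0.82, 0.01]

# Assumed heuristic proposal: favour variables that are both likely to be hit
# and damaging, mixed with p so the weights p/q stay bounded (at most 2).
z = exact_expectation(p, drop)
q = [0.5 * pi + 0.5 * pi * di / z for pi, di in zip(p, drop)]

rng = random.Random(0)
estimate = importance_sampling_estimate(p, drop, q, 20000, rng)
print("exact:", exact_expectation(p, drop), "estimate:", estimate)
```

In this toy the proposal is built from the exact answer, which a real tool would not have; in practice a cheap heuristic would stand in for it, and the reweighting by `p[i] / q[i]` keeps the estimate unbiased regardless of how good that heuristic is.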
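Graph machine learning has been extensively studied in both academia and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To tackle this challenge, automated graph machine learning, which aims to discover the best hyper-parameter and neural architecture configuration for different graph tasks and data without manual design, is gaining increasing attention from the research community. In this paper, we extensively discuss automated graph machine learning approaches, covering hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We briefly overview existing libraries designed for either graph machine learning or automated machine learning, and then introduce in depth AutoGL, our dedicated library and the world's first open-source library for automated graph machine learning. Last but not least, we share our insights on future research directions for automated graph machine learning. This paper is the first systematic and comprehensive discussion of approaches, libraries, and directions for automated graph machine learning.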
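Graphs can model complicated interactions between entities, which arise naturally in many important applications. These applications can often be cast as standard graph learning tasks, in which a crucial step is to learn low-dimensional graph representations. Graph neural networks (GNNs) are currently the most popular model among graph embedding approaches. However, standard GNNs in the neighborhood aggregation paradigm have limited discriminative power in distinguishing high-order graph structures, as opposed to low-order structures. To capture high-order structures, researchers have resorted to motifs and developed motif-based GNNs. However, existing motif-based GNNs still often suffer from limited discriminative power on high-order structures. To overcome these limitations, we propose MGNN, a novel framework that better captures high-order structures, hinging on our proposed motif redundancy minimization operator and injective motif combination. First, MGNN produces a set of node representations with respect to each motif. The next phase is our proposed redundancy minimization among motifs, which compares the motifs with each other and distills the features unique to each motif. Finally, MGNN updates the node representations by combining the multiple representations from different motifs. In particular, to enhance the discriminative power, MGNN uses an injective function to combine the representations with respect to different motifs. We further show, with a theoretical analysis, that our proposed architecture increases the expressive power of GNNs. We demonstrate that MGNN outperforms state-of-the-art methods on seven public benchmarks on both node classification and graph classification tasks.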
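Learning-based navigation systems are widely used in autonomous applications such as robotics, unmanned vehicles, and drones. Specialized hardware accelerators have been proposed to deliver high performance and energy efficiency for such navigation tasks. However, transient and permanent faults are increasing in hardware systems and can catastrophically violate task safety. Meanwhile, traditional redundancy-based protection methods are challenging to deploy in resource-constrained edge applications. In this paper, we experimentally evaluate the resilience of navigation systems with respect to algorithms, fault models, and data types, across both RL training and inference. We further propose two efficient fault mitigation techniques that achieve a 2x improvement in success rate and a 39% improvement in quality of flight for learning-based navigation systems.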
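Graph neural networks (GNNs) are a promising approach for applications with non-Euclidean data. However, training GNNs on large-scale graphs with hundreds of millions of nodes is both resource-intensive and time-consuming. Unlike DNNs, GNNs usually have larger memory footprints, so GPU memory capacity and PCIe bandwidth are the main resource bottlenecks in GNN training. To address this problem, we present BiFeat, a graph feature quantization methodology that accelerates GNN training by significantly reducing the memory footprint and PCIe bandwidth requirement, so that GNNs can take full advantage of GPU computing capabilities. Our key insight is that, unlike DNNs, GNNs are less prone to the information loss in input features caused by quantization. We identify the main factors that affect accuracy in graph feature quantization and theoretically prove that BiFeat training converges to a network whose loss is within $\epsilon$ of the optimal loss of the uncompressed network. We perform an extensive evaluation of BiFeat using several popular GNN models and datasets, including GraphSAGE on MAG240M, the largest public graph dataset. The results show that BiFeat achieves a compression ratio of more than 30 and improves GNN training speed by 200%-320% with marginal accuracy loss. In particular, BiFeat sets a record by training GraphSAGE on MAG240M within one hour using only four GPUs.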
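Recently, graph neural networks (GNNs), as the backbone of graph-based machine learning, have demonstrated great success in various domains (e.g., e-commerce). However, the performance of GNNs is usually unsatisfactory because of the highly sparse and irregular graph-based operations. To this end, we propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs). The core idea is to reconcile the "sparse" GNN computation with the "dense" TCU. Specifically, we conduct an in-depth analysis of the sparse operations in mainstream GNN computing frameworks. We introduce a novel sparse graph translation technique that facilitates TCU processing of sparse GNN workloads. We also implement an effective collaborative design between CUDA cores and TCUs to fully utilize GPU resources. We fully integrate TC-GNN with the PyTorch framework for ease of programming. Rigorous experiments show an average speedup of 1.70x over the state-of-the-art Deep Graph Library framework across various GNN models and dataset settings.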
Deep learning has been shown to be successful in a number of domains, ranging from acoustics and images to natural language processing. However, applying deep learning to ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, substantial research efforts have been devoted to applying deep learning methods to graphs, resulting in beneficial advances in graph analysis techniques. In this survey, we comprehensively review the different types of deep learning methods on graphs. We divide the existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods. We then provide a comprehensive and systematic overview of these methods, mainly by following their development history. We also analyze the differences and compositions of different methods. Finally, we briefly outline the applications in which they have been used and discuss potential future research directions.
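Communication networks are an important infrastructure in contemporary society. Many challenges remain, and new solutions are continuously proposed in this active research area. In recent years, graph-based deep learning, used to model the network topology, has achieved state-of-the-art performance on a series of problems in communication networks. In this survey, we review the rapidly growing body of research that applies different graph-based deep learning models, e.g., graph convolutional and graph attention networks, to various problems in different types of communication networks, e.g., wireless networks, wired networks, and software-defined networks. We also present a well-organized list of the problem and solution for each study and identify future research directions. To the best of our knowledge, this paper is the first survey that focuses on the application of graph-based deep learning methods in communication networks involving both wired and wireless scenarios. To track follow-up research, a public GitHub repository has been created, where the relevant papers will be continuously updated.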