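In this work, we present a neural approach to reconstructing rooted tree graphs that describe hierarchical interactions, using a novel representation we term the Lowest Common Ancestor Generations (LCAG) matrix. This compact formulation is equivalent to the adjacency matrix, but it allows the tree structure to be learned from the leaves alone, without the prior assumptions that would be required if the adjacency matrix were used directly. Employing the LCAG therefore enables the first end-to-end trainable solution that learns the hierarchical structure of trees of varying size directly from the terminal leaves only. In high-energy particle physics, a particle decay forms a hierarchical tree structure of which only the final products can be observed experimentally, and the large combinatorial space of possible trees makes an analytic solution intractable. We demonstrate the use of the LCAG in the task of reconstructing simulated particle-physics decay structures using both a Transformer encoder and a Neural Relational Inference encoder graph neural network. With this approach, we are able to correctly predict the LCAG purely from leaf features, for a maximum tree depth of $8$, in $92.5\%$ of cases for trees with up to (and including) $6$ leaves and in $59.7\%$ of cases for trees with up to $10$ leaves in our simulated dataset.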
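To make the LCAG representation concrete, the following minimal sketch builds such a matrix for a small rooted tree. It is an illustration under stated assumptions, not the authors' implementation: the function name `lcag_matrix`, the child-to-parent dictionary encoding, and the generation convention (leaves at generation 0, each internal node one generation above its highest child) are all hypothetical choices made here, and the paper's exact convention may differ.

```python
from itertools import combinations


def lcag_matrix(parent, leaves):
    """Build a leaf-by-leaf matrix of lowest-common-ancestor generations.

    `parent` maps every node to its parent (the root maps to None).
    Assumed convention: leaves have generation 0, and an internal node
    sits one generation above its highest child.
    """
    # Invert the child -> parent map so that generations can be computed.
    children = {}
    for child, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(child)

    generation = {}

    def gen(node):
        # Memoised height of a node above its deepest descendant leaf.
        if node not in generation:
            kids = children.get(node, [])
            generation[node] = 0 if not kids else 1 + max(gen(k) for k in kids)
        return generation[node]

    def ancestors(node):
        # Path from the node itself up to the root, nearest first.
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path

    n = len(leaves)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        seen = set(ancestors(leaves[i]))
        # The first ancestor of leaf j that is also above leaf i is the LCA.
        lca = next(a for a in ancestors(leaves[j]) if a in seen)
        matrix[i][j] = matrix[j][i] = gen(lca)
    return matrix


# Hypothetical decay: R -> A + c, followed by A -> a + b.
example_parent = {"R": None, "A": "R", "c": "R", "a": "A", "b": "A"}
example_lcag = lcag_matrix(example_parent, ["a", "b", "c"])
```

For this toy tree, leaves `a` and `b` share the ancestor `A` (generation 1), while either of them meets `c` only at the root `R` (generation 2). The matrix is indexed by leaves only, which is what lets a model predict it from leaf features without knowing how many internal nodes exist.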
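Even though machine learning algorithms already play a significant role in data science, many current methods make unrealistic assumptions about their input data. Such methods are hard to apply because of incompatible data formats, or because datasets contain heterogeneous, hierarchical, or entirely missing data fragments. As a solution, we propose a versatile, unified framework called "HMill" for sample representation, model definition, and training. We review in depth the multi-instance paradigm for machine learning on which the framework builds and which it extends. To justify the design of HMill's key components theoretically, we show an extension of the universal approximation theorem to the set of all functions realized by models implemented in the framework. The text also contains a detailed discussion of technicalities and performance improvements in our implementation, which is published for download under the MIT License. The main asset of the framework is its flexibility, which makes it possible to model diverse real-world data sources with the same tool. In addition to the standard setting in which a set of attributes is observed for each object individually, we explain how message-passing inference in graphs that represent whole systems of objects can be implemented in the framework. To support our claims, we use the framework to solve three different problems from the cybersecurity domain. The first use case concerns IoT device identification from raw network observations. In the second problem, we study how malicious binary files can be classified using a snapshot of the operating system represented as a directed graph. The last example is the task of domain blacklist extension through modelling interactions between entities in the network. In all three problems, the solution based on the proposed framework achieves performance comparable to specialized approaches.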
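Recent work has demonstrated that geometric deep learning methods such as graph neural networks (GNNs) are well suited to a variety of reconstruction problems in high-energy particle physics. In particular, particle tracking data are naturally represented as a graph by identifying silicon tracker hits as nodes and particle trajectories as edges; given a set of hypothesized edges, edge-classifying GNNs identify those that correspond to real particle trajectories. In this work, we adapt the physics-motivated interaction network (IN) GNN to the problem of particle tracking in pileup conditions similar to those expected at the High-Luminosity Large Hadron Collider. Assuming idealized hit filtering at various particle momentum thresholds, we demonstrate the IN's excellent edge-classification accuracy and tracking efficiency through a suite of measurements at each stage of GNN-based tracking: graph construction, edge classification, and track building. The proposed IN architecture is substantially smaller than previously studied GNN tracking architectures; this is especially promising because a reduction in size is critical for enabling GNN-based tracking in constrained computing environments. Furthermore, the IN may be represented either as a set of explicit matrix operations or as a message-passing GNN. Efforts are underway to accelerate each representation on heterogeneous computing resources, targeting both high-level and low-latency triggering applications.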
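Graph kernels have attracted a lot of attention during the last decade and have evolved into a rapidly developing branch of learning on structured data. During the past 20 years, the considerable research activity in the field has led to the development of dozens of graph kernels, each focusing on specific structural properties of graphs. Graph kernels have proven successful in a wide range of domains, from social networks to bioinformatics. The goal of this survey is to provide a unifying view of the literature on graph kernels. In particular, we present an overview of a wide range of graph kernels. Furthermore, we perform an experimental evaluation of several of those kernels on publicly available datasets and provide a comparative study. Finally, we discuss key applications of graph kernels and outline some challenges that remain to be addressed.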
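We introduce the first algorithm for reconstructing multiple showers from data collected with electromagnetic (EM) sampling calorimeters. Such detectors are widely used in high-energy physics to measure the energy and kinematics of incoming particles. In this work, we consider the case in which many electrons pass through an Emulsion Cloud Chamber (ECC) brick and initiate electron-induced electromagnetic showers, which can happen with long exposure times or a large input particle flux. For example, the SHiP experiment plans to use emulsion detectors for dark matter searches and neutrino physics investigation. The expected full flux of the SHiP experiment is about $10^{20}$ particles. To reduce the cost of the experiment associated with replacing the ECC brick and with offline data taking (emulsion scanning), it was decided to increase the exposure time. We therefore expect to observe a large number of overlapping showers, which turns EM shower reconstruction into a challenging point-cloud segmentation problem. Our reconstruction pipeline consists of a graph neural network that predicts an adjacency matrix, followed by a clustering algorithm. We propose a new layer type (EmulsionConv) that takes into account the geometrical properties of shower development in an ECC brick. For the clustering of overlapping showers, we use a modified hierarchical density-based clustering algorithm. Our method does not use any prior information about the incoming particles and identifies up to 87% of electromagnetic showers in emulsion detectors. The main test bench for the EM shower reconstruction algorithm will be SND@LHC.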
Pre-publication draft of a book to be published by Morgan & Claypool Publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
Graph classification is an important area in both modern research and industry. Multiple applications, especially in chemistry and novel drug discovery, encourage rapid development of machine learning models in this area. To keep up with the pace of new research, proper experimental design, fair evaluation, and independent benchmarks are essential. The design of strong baselines is an indispensable element of such work. In this thesis, we explore multiple approaches to graph classification. We focus on Graph Neural Networks (GNNs), which have emerged as the de facto standard deep learning technique for graph representation learning. Classical approaches, such as graph descriptors and molecular fingerprints, are also addressed. We design a fair experimental evaluation protocol and choose a proper collection of datasets. This allows us to perform numerous experiments and rigorously analyze modern approaches. We arrive at many conclusions, which shed new light on the performance and quality of novel algorithms. We investigate the application of the Jumping Knowledge GNN architecture to graph classification, which proves to be an efficient tool for improving base graph neural network architectures. Multiple improvements to baseline models are also proposed and experimentally verified, which constitutes an important contribution to the field of fair model comparison.
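In recent years, algorithms and neural architectures based on the Weisfeiler-Leman algorithm, a well-known heuristic for the graph isomorphism problem, have emerged as a powerful tool for machine learning with graphs and relational data. Here, we give a comprehensive overview of the algorithm's use in a machine learning setting, focusing on the supervised regime. We discuss the theoretical background, show how to use it for supervised graph and node representation learning, discuss recent extensions, and outline the algorithm's connection to (permutation-)equivariant neural architectures. Moreover, we give an overview of current applications and future directions to stimulate further research.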
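As a rough illustration of the heuristic mentioned above, the sketch below implements 1-dimensional Weisfeiler-Leman colour refinement for two graphs at once. The function names (`refine`, `wl_distinguishes`), the adjacency-list encoding, and the fixed number of rounds are assumptions made for this example; they are not taken from the surveyed work.

```python
from collections import Counter


def refine(graphs, rounds):
    """Run colour refinement on several graphs with a shared colour palette.

    Each graph is a dict mapping a node to the list of its neighbours.
    Sharing the palette keeps colour ids comparable across graphs.
    """
    colors = [{v: 0 for v in g} for g in graphs]
    for _ in range(rounds):
        # A node's signature is its own colour plus the sorted multiset
        # of its neighbours' colours.
        signatures = [
            {v: (c[v], tuple(sorted(c[u] for u in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        distinct = sorted({s for sig in signatures for s in sig.values()})
        palette = {s: idx for idx, s in enumerate(distinct)}
        colors = [{v: palette[s] for v, s in sig.items()} for sig in signatures]
    return colors


def wl_distinguishes(g1, g2, rounds=3):
    """Return True if the colour histograms differ after refinement.

    True proves the graphs are non-isomorphic; False is inconclusive.
    """
    c1, c2 = refine([g1, g2], rounds)
    return Counter(c1.values()) != Counter(c2.values())


path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

The path and the triangle are separated after one round because their degree patterns differ. Two disjoint triangles and a hexagon are both 2-regular on six nodes, so every node keeps the same colour and the test cannot tell them apart, which is the standard example of why the algorithm is only a heuristic for isomorphism.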
Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics. The interplay of components can give rise to complex behavior, which can often be explained using a simple model of the system's constituent parts. In this work, we introduce the neural relational inference (NRI) model: an unsupervised model that learns to infer interactions while simultaneously learning the dynamics purely from observational data. Our model takes the form of a variational auto-encoder, in which the latent code represents the underlying interaction graph and the reconstruction is based on graph neural networks. In experiments on simulated physical systems, we show that our NRI model can accurately recover ground-truth interactions in an unsupervised manner. We further demonstrate that we can find an interpretable structure and predict complex dynamics in real motion capture and sports tracking data.
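Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence (AI) because of their unique ability to ingest relatively unstructured data types as input. Although some elements of the GNN architecture are conceptually similar in operation to traditional neural networks (and neural network variants), other elements represent a departure from traditional deep learning techniques. This tutorial exposes the power and novelty of GNNs to AI practitioners by collating and presenting details of the motivations, concepts, mathematics, and applications of the most common and performant GNN variants. Importantly, we present the material concisely and alongside practical examples, providing a practical and accessible tutorial on the topic of GNNs.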
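Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring the fact that in practice they often stem from related data distributions. In recent years, however, there has been a surge of interest in using machine learning, especially graph neural networks (GNNs), as a key building block for combinatorial tasks, either directly as solvers or by enhancing exact solvers. The inductive bias of GNNs effectively encodes combinatorial and relational input because of their invariance to permutations and their awareness of input sparsity. This paper presents a conceptual review of recent key advances in this emerging field, aimed at researchers in both optimization and machine learning.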
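When dealing with tabular data, models based on regression and decision trees are a popular choice because of the high accuracy they provide on such tasks and their ease of application compared with other model classes. When it comes to graph-structured data, however, current tree learning algorithms provide no tools for handling the structure of the data other than relying on feature engineering. In this work, we address this gap and introduce Graph Trees with Attention (GTA), a new tree-based learning algorithm designed to operate on graphs. GTA leverages both the graph structure and the features at the vertices, and it employs an attention mechanism that allows decisions to concentrate on sub-structures of the graph. We analyze GTA models and show that they are more expressive than plain decision trees. We also demonstrate the benefits of GTA on multiple graph and node prediction benchmarks. In these experiments, GTA consistently outperforms other tree-based models and often outperforms other types of graph-learning algorithms, such as graph neural networks (GNNs) and graph kernels. Finally, we provide an explainability mechanism for GTA and demonstrate that it can provide intuitive explanations.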
In recent years, graph neural networks (GNNs) have emerged as a promising tool for solving machine learning problems on graphs. Most GNNs are members of the family of message passing neural networks (MPNNs). There is a close connection between these models and the Weisfeiler-Leman (WL) test of isomorphism, an algorithm that can successfully test isomorphism for a broad class of graphs. Recently, much research has focused on measuring the expressive power of GNNs. For instance, it has been shown that standard MPNNs are at most as powerful as WL in terms of distinguishing non-isomorphic graphs. However, these studies have largely ignored the distances between the representations of nodes/graphs which are of paramount importance for learning tasks. In this paper, we define a distance function between nodes which is based on the hierarchy produced by the WL algorithm, and propose a model that learns representations which preserve those distances between nodes. Since the emerging hierarchy corresponds to a tree, to learn these representations, we capitalize on recent advances in the field of hyperbolic neural networks. We empirically evaluate the proposed model on standard node and graph classification datasets where it achieves competitive performance with state-of-the-art models.
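Between 2015 and 2019, members of the Horizon 2020-funded Innovative Training Network named "AMVA4NewPhysics" studied the customization and application of advanced multivariate analysis methods and statistical learning tools to high-energy physics problems, and also developed entirely new ones. Many of those methods were successfully used to improve the sensitivity of data analyses performed by the ATLAS and CMS experiments at the CERN Large Hadron Collider; several others, still in the testing phase, promise to further improve the precision of measurements of fundamental physics parameters and the reach of searches for new phenomena. In this paper, the most relevant new tools among those studied and developed are presented, along with an evaluation of their performance.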
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
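An important challenge in robotics is understanding the interactions between robots and deformable terrains that consist of granular material. Granular flows and their interactions with rigid bodies still pose several open questions. A promising direction for accurate yet efficient modeling is the use of continuum methods. In addition, a new direction for real-time physics modeling is the use of deep learning. This research advances machine learning methods for modeling rigid-body-driven granular flows, for application to terrestrial industrial machines as well as space robotics (where the effect of gravity is an important factor). In particular, this research considers the development of a subspace machine learning simulation approach. To generate training datasets, we use our high-fidelity continuum method, the material point method (MPM). Principal component analysis (PCA) is used to reduce the dimensionality of the data. We show that the first few principal components of our high-dimensional data retain almost the entire variance of the data. A graph network simulator (GNS) is trained to learn the underlying subspace dynamics. The learned GNS is then able to predict particle positions and interaction forces with good accuracy. More importantly, PCA significantly improves the time and memory efficiency of the GNS in both training and rollout. This enables the GNS to be trained on a single desktop GPU with moderate VRAM. It also makes the GNS real-time on large-scale 3D physics configurations (700x faster than our continuum method).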
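Understanding the halo-galaxy connection is fundamental to improving our knowledge of the nature and properties of dark matter. In this work, we build a model that infers the mass of a halo given the positions, velocities, stellar masses, and radii of the galaxies it hosts. To capture information from correlations among galaxy properties and their phase space, we use graph neural networks (GNNs), which are designed to work with irregular and sparse data. We train our models on galaxies from simulations of the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. Our models, which account for cosmological and astrophysical uncertainties, are able to constrain halo masses with an accuracy of $\sim 0.2$ dex. Furthermore, a GNN trained on one suite of simulations is able to preserve part of its accuracy when tested on simulations run with a different code. The PyTorch Geometric implementation of the GNN is publicly available on GitHub at https://github.com/pablovd/halographnet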
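We propose a novel graph neural network, which we call AgentNet, designed specifically for graph-level tasks. AgentNet is inspired by sublinear algorithms and has a computational complexity that is independent of the graph size. The architecture of AgentNet differs fundamentally from the architectures of known graph neural networks. In AgentNet, some trained neural agents intelligently walk the graph and then collectively decide on the output. We provide an extensive theoretical analysis of AgentNet: we show that the agents can learn to systematically explore their neighborhood, and that AgentNet can distinguish some structures that even 3-WL cannot tell apart. Moreover, AgentNet is able to separate any two graphs that are completely different in terms of subgraphs. We confirm these theoretical results with synthetic experiments on hard-to-distinguish graphs and with real-world graph classification tasks. In both cases, we compare favorably not only with standard GNNs but also with computationally more expensive GNN extensions.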
Here we present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains, involving fluids, rigid solids, and deformable materials interacting with one another. Our framework, which we term "Graph Network-based Simulators" (GNS), represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing. Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time. Our model was robust to hyperparameter choices across various evaluation metrics: the main determinants of long-term performance were the number of message-passing steps, and mitigating the accumulation of error by corrupting the training data with noise. Our GNS framework advances the state-of-the-art in learned physical simulation, and holds promise for solving a wide range of complex forward and inverse problems.
Recently, graph neural networks (GNNs) have revolutionized the field of graph representation learning through effectively learned node embeddings, and achieved state-of-the-art results in tasks such as node classification and link prediction. However, current GNN methods are inherently flat and do not learn hierarchical representations of graphs, a limitation that is especially problematic for the task of graph classification, where the goal is to predict the label associated with an entire graph. Here we propose DIFFPOOL, a differentiable graph pooling module that can generate hierarchical representations of graphs and can be combined with various graph neural network architectures in an end-to-end fashion. DIFFPOOL learns a differentiable soft cluster assignment for nodes at each layer of a deep GNN, mapping nodes to a set of clusters, which then form the coarsened input for the next GNN layer. Our experimental results show that combining existing GNN methods with DIFFPOOL yields an average improvement of 5-10% accuracy on graph classification benchmarks, compared to all existing pooling approaches, achieving a new state-of-the-art on four out of five benchmark data sets.
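The coarsening step described above can be sketched in a few lines. This is a toy illustration rather than the DIFFPOOL implementation: it assumes the commonly cited formulation in which a row-wise softmax assignment matrix $S$ pools node embeddings as $S^\top Z$ and the adjacency matrix as $S^\top A S$. In the real module the assignment logits and embeddings come from trained GNN layers; here they are fixed, hypothetical numbers, and all function names are invented for the example.

```python
import math


def softmax_rows(logits):
    """Row-wise softmax: each node's cluster memberships sum to 1."""
    result = []
    for row in logits:
        peak = max(row)  # subtract the maximum for numerical stability
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]


def coarsen(adjacency, embeddings, assignment_logits):
    """Pool a graph with a soft cluster assignment (assumed formulation).

    Returns the assignment S, pooled features S^T Z, and pooled
    adjacency S^T A S.
    """
    s = softmax_rows(assignment_logits)
    s_t = transpose(s)
    pooled_features = matmul(s_t, embeddings)
    pooled_adjacency = matmul(matmul(s_t, adjacency), s)
    return s, pooled_features, pooled_adjacency


# Hypothetical 4-node path graph pooled into 2 clusters.
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]
assignment_logits = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
assignment, pooled_x, pooled_a = coarsen(adjacency, embeddings, assignment_logits)
```

Because every row of the assignment sums to 1, the pooled features preserve the column sums of the original embeddings, and the pooled adjacency preserves the total edge weight. The pooled graph is dense and weighted even when the input is sparse, which is one practical cost of soft assignments.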