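We introduce a novel harmonic analysis for functions defined on the vertices of a directed graph, of which the random walk operator is the cornerstone. As a first step, we consider the set of eigenvectors of the random walk operator as a non-orthogonal Fourier-type basis for functions over directed graphs. We find a frequency interpretation by linking the variation of the eigenvectors of the random walk operator, obtained from their Dirichlet energy, to the real part of their associated eigenvalues. From this Fourier basis, we can proceed further and build multiscale analyses on directed graphs. By extending the framework of Coifman and Maggioni to directed graphs, we propose both a redundant wavelet transform and a decimated wavelet transform. The development of our harmonic analysis on directed graphs thus leads us to consider semi-supervised learning problems and signal modeling problems on graphs, applied to directed graphs, which highlight the efficiency of our framework.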
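The link between Dirichlet energy and the real part of an eigenvalue can be checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes a small, strongly connected, aperiodic directed graph (the weight matrix `W` is made up), the random walk operator $P = D^{-1}W$ with stationary distribution $\pi$, and the Dirichlet energy $\frac{1}{2}\sum_{i,j}\pi_i P_{ij}|f_i - f_j|^2$ normalized by $\sum_i \pi_i |f_i|^2$. Under these assumptions, the normalized energy of an eigenvector with eigenvalue $\lambda$ equals $1 - \mathrm{Re}(\lambda)$.

```python
import numpy as np

def random_walk_operator(W):
    """Row-normalize a nonnegative weight matrix: P = D^{-1} W."""
    return W / W.sum(axis=1, keepdims=True)

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, scaled to sum to 1."""
    eigvals, left_vecs = np.linalg.eig(P.T)
    v = np.real(left_vecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return v / v.sum()

def normalized_dirichlet_energy(P, pi, f):
    """(1/2) sum_ij pi_i P_ij |f_i - f_j|^2 divided by sum_i pi_i |f_i|^2."""
    sq_diff = np.abs(f[:, None] - f[None, :]) ** 2
    energy = 0.5 * np.sum(pi[:, None] * P * sq_diff)
    return energy / np.sum(pi * np.abs(f) ** 2)

# Hypothetical directed graph: edges 0->1, 1->2, 2->3, 3->0, 3->1.
W = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 1, 0, 0]], dtype=float)

P = random_walk_operator(W)
pi = stationary_distribution(P)

# Columns of `fourier_basis` are (generally non-orthogonal) eigenvectors of P.
eigvals, fourier_basis = np.linalg.eig(P)
energies = np.array([normalized_dirichlet_energy(P, pi, fourier_basis[:, k])
                     for k in range(len(eigvals))])

# Sort Fourier modes from low to high "frequency" 1 - Re(lambda).
order = np.argsort(1.0 - eigvals.real)
```

Ordering the modes by `1 - eigvals.real` is one way to read the eigenvalues as frequencies; the constant eigenvector (eigenvalue 1) has zero energy and comes first.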
translated by 谷歌翻译
In applications such as social, energy, transportation, sensor, and neuronal networks, high-dimensional data naturally reside on the vertices of weighted graphs. The emerging field of signal processing on graphs merges algebraic and spectral graph theoretic concepts with computational harmonic analysis to process such signals on graphs. In this tutorial overview, we outline the main challenges of the area, discuss different ways to define graph spectral domains, which are the analogues to the classical frequency domain, and highlight the importance of incorporating the irregular structures of graph data domains when processing signals on graphs. We then review methods to generalize fundamental operations such as filtering, translation, modulation, dilation, and downsampling to the graph setting, and survey the localized, multiscale transforms that have been proposed to efficiently extract information from high-dimensional data on graphs. We conclude with a brief discussion of open issues and possible extensions.
Research in Graph Signal Processing (GSP) aims to develop tools for processing data defined on irregular graph domains. In this paper we first provide an overview of core ideas in GSP and their connection to conventional digital signal processing, along with a brief historical perspective to highlight how concepts recently developed in GSP build on top of prior research in other areas. We then summarize recent advances in developing basic GSP tools, including methods for sampling, filtering or graph learning. Next, we review progress in several application areas using GSP, including processing and analysis of sensor network data, biological data, and applications to image processing and machine learning.
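Markov chains are a class of probabilistic models that have achieved widespread application in the quantitative sciences. This is in part due to their versatility, but is compounded by the ease with which they can be probed analytically. This tutorial provides an in-depth introduction to Markov chains and explores their connection to graphs and random walks. We use tools from linear algebra and graph theory to describe the transition matrices of different types of Markov chains, with a particular focus on exploring properties of the eigenvalues and eigenvectors corresponding to these matrices. The results presented are relevant to a number of methods in machine learning and data mining, which we describe at various stages. Rather than being a novel academic study in its own right, this text presents a collection of known results, together with some new concepts. Moreover, the tutorial focuses on offering intuition to readers rather than formal understanding, and only assumes basic exposure to concepts from linear algebra and probability theory. It is therefore accessible to students and researchers from a wide variety of disciplines.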
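Graph representation learning has many real-world applications, from super-resolution imaging and 3D computer vision to drug repurposing, protein classification, and social network analysis. An adequate representation of graph data is vital to the learning performance of a statistical or machine learning model for graph-structured data. In this paper, we propose a novel multiscale representation system for graph data, called decimated framelets, which form a localized tight frame on the graph. The decimated framelet system allows storage of the graph data representation on a coarse-grained chain and processes the graph data at multiple scales, where at each scale the data is stored on a subgraph. Based on this, we establish decimated G-framelet transforms for the decomposition and reconstruction of the graph data at multiple resolutions via a constructive data-driven filter bank. The graph framelets are built on a chain-based orthonormal basis that supports fast graph Fourier transforms. From this, we give a fast algorithm for the decimated G-framelet transforms, or FGT, that has linear computational complexity O(N) for a graph of size N. The theory of decimated framelets and FGT is verified with numerical examples for random graphs. The effectiveness is demonstrated by real-world applications, including multiresolution analysis for traffic networks and graph neural networks for graph classification tasks.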
We propose a novel method for constructing wavelet transforms of functions defined on the vertices of an arbitrary finite weighted graph. Our approach is based on defining scaling using the graph analogue of the Fourier domain, namely the spectral decomposition of the discrete graph Laplacian $L$. Given a wavelet generating kernel $g$ and a scale parameter $t$, we define the scaled wavelet operator $T_g^t = g(tL)$. The spectral graph wavelets are then formed by localizing this operator by applying it to an indicator function. Subject to an admissibility condition on $g$, this procedure defines an invertible transform. We explore the localization properties of the wavelets in the limit of fine scales. Additionally, we present a fast Chebyshev polynomial approximation algorithm for computing the transform that avoids the need for diagonalizing $L$. We highlight potential applications of the transform through examples of wavelets on graphs corresponding to a variety of different problem domains.
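The construction "apply g(tL) to an indicator function" can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the graph is a hypothetical unweighted path graph, the kernel g(x) = x exp(-x) is just one convenient band-pass choice with g(0) = 0, and the operator is evaluated by full eigendecomposition of L rather than by the Chebyshev approximation the abstract mentions.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path on n vertices."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def band_pass_kernel(x):
    """Assumed wavelet generating kernel with g(0) = 0 (illustrative choice)."""
    return x * np.exp(-x)

def spectral_graph_wavelet(L, t, center, g=band_pass_kernel):
    """Wavelet at scale t centered on vertex `center`: g(tL) applied to a delta."""
    eigvals, U = np.linalg.eigh(L)           # L = U diag(eigvals) U^T
    delta = np.zeros(L.shape[0])
    delta[center] = 1.0                      # indicator function of the vertex
    return U @ (g(t * eigvals) * (U.T @ delta))

L = path_graph_laplacian(16)
psi = spectral_graph_wavelet(L, t=2.0, center=8)
```

Because g(0) = 0 and the path graph is connected, the constant component is removed, so the entries of `psi` sum to zero up to rounding. For large graphs the eigendecomposition is the bottleneck, which is the motivation for the polynomial approximation described in the abstract.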
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
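As a concrete illustration of the idea, the following sketch splits a graph into two clusters using the unnormalized Laplacian. It is a simplification made for this example: the test graph (two cliques joined by one edge) is hypothetical, and the sign of the second eigenvector is used to assign clusters in place of the k-means step used by general spectral clustering algorithms.

```python
import numpy as np

def two_cliques_with_bridge(m):
    """Adjacency matrix of two m-cliques joined by a single bridging edge."""
    n = 2 * m
    A = np.zeros((n, n))
    A[:m, :m] = 1.0
    A[m:, m:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[m - 1, m] = A[m, m - 1] = 1.0   # the bridge
    return A

def fiedler_bipartition(A):
    """Two-way partition from the sign of the second-smallest eigenvector of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)

A = two_cliques_with_bridge(5)
labels = fiedler_bipartition(A)
```

Which clique receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.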
Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them. Geometric deep learning is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains such as graphs and manifolds. The purpose of this paper is to overview different examples of geometric deep learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.
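The scattering transform is a multilayered, wavelet-based transform initially introduced as a model of convolutional neural networks (CNNs), and it has played a foundational role in our understanding of these networks' stability and invariance properties. Subsequently, there has been widespread interest in extending the success of CNNs to data sets with non-Euclidean structure, such as graphs and manifolds, leading to the emerging field of geometric deep learning. In order to improve our understanding of the architectures used in this new field, several papers have proposed generalizations of the scattering transform for non-Euclidean data structures such as undirected graphs and compact Riemannian manifolds. In this paper, we introduce a general, unified model for geometric scattering on measure spaces. Our proposed framework includes previous work on geometric scattering as special cases but also applies to more general settings such as directed graphs, signed graphs, and manifolds with boundary. We propose a new criterion that identifies the groups to which a useful representation should be invariant, and we show that this criterion is sufficient to guarantee that the scattering transform has desirable stability and invariance properties. Additionally, we consider finite measure spaces that are obtained from randomly sampling an unknown manifold. We propose two methods for constructing a data-driven graph on which the associated graph scattering transform approximates the scattering transform on the underlying manifold. Moreover, we use a diffusion-maps based approach to prove quantitative estimates on the rate of convergence of one of these approximations as the number of sample points tends to infinity. Lastly, we showcase the utility of our method on spherical images, directed graphs, and high-dimensional single-cell data.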
Pre-publication draft of a book to be published by Morgan & Claypool publishers. Unedited version released with permission. All relevant copyrights held by the author and publisher extend to this pre-publication draft.
Graph clustering is a fundamental problem in unsupervised learning, with numerous applications in computer science and in analysing real-world data. In many real-world applications, we find that the clusters have a significant high-level structure. This is often overlooked in the design and analysis of graph clustering algorithms, which make strong simplifying assumptions about the structure of the graph. This thesis addresses the natural question of whether the structure of clusters can be learned efficiently and describes four new algorithmic results for learning such structure in graphs and hypergraphs. All of the presented theoretical results are extensively evaluated on both synthetic and real-world datasets of different domains, including image classification and segmentation, migration networks, co-authorship networks, and natural language processing. These experimental results demonstrate that the newly developed algorithms are practical, effective, and immediately applicable for learning the structure of clusters in real-world data.
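Graph Convolutional Networks (GCNs) have proven to be a powerful concept that has been successfully applied to a wide variety of tasks across many domains over the past years. In this work, we study the theory that paved the way to the definition of GCNs, including the relevant parts of classical graph theory. We also discuss and experimentally demonstrate key properties and limitations of GCNs, such as those caused by the statistical dependency of samples, introduced by the edges of the graph, which causes the estimates of the full gradient to be biased. Another limitation we discuss is the negative impact of minibatch sampling on model performance. As a consequence, during parameter updates, gradients are computed on the whole dataset, which undermines scalability to large graphs. To account for this, we investigate alternative methods that allow good parameters to be learned safely while sampling only a subset of the data per iteration. We reproduce the results reported in the work of Kipf et al. and propose an implementation inspired by SIGN, which is a sampling-free minibatch method. Finally, we compare the two implementations on a benchmark dataset, showing that they are comparable in terms of prediction accuracy for the task of semi-supervised node classification.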
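In this paper, we consider a ${\rm U}(1)$-connection graph, that is, a graph where each oriented edge is endowed with a unit modulus complex number that is simply conjugated under orientation flip. A natural replacement for the combinatorial Laplacian is then the so-called magnetic Laplacian, a Hermitian matrix that includes information about the graph's connection. Connection graphs and magnetic Laplacians appear, for example, in the problem of angular synchronization. In the context of large and dense graphs, we study here sparsifiers of the magnetic Laplacian, i.e., spectral approximations based on subgraphs with few edges. Our approach relies on sampling multi-type spanning forests (MTSFs) using a custom determinantal point process, a distribution over edges that favours diversity. In a word, an MTSF is a spanning subgraph whose connected components are either trees or cycle-rooted trees. The latter partially capture the angular inconsistencies of the connection graph, and thus provide a way to compress the information contained in the connection. Interestingly, when this connection graph has weakly inconsistent cycles, samples from this distribution can be obtained by using a random walk with cycle popping. We provide statistical guarantees for a choice of natural estimators of the connection Laplacian, and investigate the practical application of our sparsifiers in two applications.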
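To make the object concrete, here is a small sketch of a magnetic Laplacian on a hypothetical three-vertex cycle. The convention $L = D - W \odot e^{i\theta}$, with symmetric weights $W$ and antisymmetric angles $\theta$, is an assumption chosen for this illustration and may differ from the paper's normalization. The edge angles are made up.

```python
import numpy as np

def magnetic_laplacian(weights, angles):
    """L = D - W * exp(i * theta), entrywise product.

    weights: symmetric nonnegative matrix with zero diagonal.
    angles:  antisymmetric matrix, so exp(i * theta) is conjugated
             when the orientation of an edge is flipped.
    """
    connection = np.exp(1j * angles)
    degrees = weights.sum(axis=1)
    return np.diag(degrees) - weights * connection

# Hypothetical triangle with unit weights.
weights = np.ones((3, 3)) - np.eye(3)

# Made-up angles on the oriented edges 0->1, 1->2, 2->0.
angles = np.zeros((3, 3))
angles[0, 1], angles[1, 2], angles[2, 0] = 0.3, 0.5, 0.4
angles = angles - angles.T            # enforce theta_ji = -theta_ij

L = magnetic_laplacian(weights, angles)
eigvals = np.linalg.eigvalsh(L)       # real, since L is Hermitian
```

With this convention the quadratic form is $\sum_{i<j} w_{ij}\,|f_i - e^{i\theta_{ij}} f_j|^2$, so `L` is positive semidefinite. In this example the angles around the cycle do not sum to a multiple of $2\pi$, so the smallest eigenvalue is strictly positive, which reflects the angular inconsistency of the cycle.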
Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency -- given $n$ input points, most kernel-based algorithms need to materialize the full $n \times n$ kernel matrix before performing any subsequent computation, thus incurring $\Omega(n^2)$ runtime. Breaking this quadratic barrier for various problems has, therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain $\textit{subquadratic}$ time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving linear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recent Kernel Density Estimation framework, which (after preprocessing in time subquadratic in $n$) can return estimates of row/column sums of the kernel matrix. In particular, we develop efficient reductions from $\textit{weighted vertex}$ and $\textit{weighted edge sampling}$ on kernel graphs, $\textit{simulating random walks}$ on kernel graphs, and $\textit{importance sampling}$ on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in $\textit{sublinear}$ (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsification, where we observe a $\textbf{9x}$ decrease in the number of kernel evaluations over baselines for LRA and a $\textbf{41x}$ reduction in the graph size for spectral sparsification.
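The scattering transform is a multilayered, wavelet-based deep learning architecture that acts as a model of convolutional neural networks. Recently, several works have introduced generalizations of the scattering transform for non-Euclidean settings such as graphs. Our work builds upon these constructions by introducing windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets. We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts. As a result, the proposed construction unifies and extends known theoretical results for many of the existing graph scattering architectures. In doing so, this work helps bridge the gap between geometric scattering and other graph neural networks by introducing a large family of networks with provable stability and invariance guarantees. These results lay the groundwork for future deep learning architectures for graph-structured data that have learned filters and also provably have desirable theoretical properties.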
We consider the nonlinear inverse problem of learning a transition operator $\mathbf{A}$ from partial observations at different times, in particular from sparse observations of entries of its powers $\mathbf{A},\mathbf{A}^2,\cdots,\mathbf{A}^{T}$. This Spatio-Temporal Transition Operator Recovery problem is motivated by the recent interest in learning time-varying graph signals that are driven by graph operators depending on the underlying graph topology. We address the nonlinearity of the problem by embedding it into a higher-dimensional space of suitable block-Hankel matrices, where it becomes a low-rank matrix completion problem, even if $\mathbf{A}$ is of full rank. For both a uniform and an adaptive random space-time sampling model, we quantify the recoverability of the transition operator via suitable measures of incoherence of these block-Hankel embedding matrices. For graph transition operators these measures of incoherence depend on the interplay between the dynamics and the graph topology. We develop a suitable non-convex iterative reweighted least squares (IRLS) algorithm, establish its quadratic local convergence, and show that, in optimal scenarios, no more than $\mathcal{O}(rn \log(nT))$ space-time samples are sufficient to ensure accurate recovery of a rank-$r$ operator $\mathbf{A}$ of size $n \times n$. This establishes that spatial samples can be substituted by a comparable number of space-time samples. We provide an efficient implementation of the proposed IRLS algorithm with space complexity of order $O(r n T)$ and per-iteration time complexity linear in $n$. Numerical experiments for transition operators based on several graph models confirm that the theoretical findings accurately track empirical phase transitions, and illustrate the applicability and scalability of the proposed algorithm.
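A basic premise in graph signal processing (GSP) is that a graph encoding pairwise (anti-)correlations of the targeted signal as edge weights is exploited for graph filtering. However, existing fast graph sampling schemes are designed and tested only for positive graphs describing positive correlations. In this paper, we show that for datasets with strong inherent anti-correlations, a suitable graph contains both positive and negative edge weights. In response, we propose a linear-time signed graph sampling method centered on the concept of balanced signed graphs. Specifically, given an empirical covariance data matrix $\bar{\bf{C}}$, we first learn a sparse inverse matrix (graph Laplacian) $\mathcal{L}$ corresponding to a signed graph $\mathcal{G}$. We define the eigenvectors of the Laplacian $\mathcal{L}_B$ of a balanced signed graph $\mathcal{G}_B$, which approximates $\mathcal{G}$ via edge weight augmentation, as graph frequency components. Next, we choose samples to minimize the low-pass filter reconstruction error in two steps. We first align all Gershgorin disc left-ends of the Laplacian $\mathcal{L}_B$ at the smallest eigenvalue $\lambda_{\min}(\mathcal{L}_B)$ via the similarity transform $\mathcal{L}_p = \mathbf{S} \mathcal{L}_B \mathbf{S}^{-1}$, leveraging a recent linear algebra theorem called Gershgorin disc perfect alignment (GDPA). We then perform sampling on $\mathcal{L}_p$ using a previous fast Gershgorin disc alignment sampling (GDAS) scheme. Experimental results show that our signed graph sampling method noticeably outperforms existing fast sampling schemes on various datasets.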
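We study linear filters for processing signals supported on abstract topological spaces modeled as simplicial complexes, which may be interpreted as generalizations of graphs that account for nodes, edges, triangular faces, etc. To process such signals, we develop simplicial convolutional filters defined as matrix polynomials of the lower and upper Hodge Laplacians. First, we study the properties of these filters and show that they are linear and shift-invariant, as well as permutation and orientation equivariant. These filters can also be implemented in a distributed fashion with low computational complexity, as they involve only (multiple rounds of) simplicial shifting between upper and lower adjacent simplices. Second, focusing on edge flows, we study the frequency responses of these filters and examine how the Hodge decomposition can be used to delineate gradient, curl, and harmonic frequencies. We discuss how these frequencies correspond to the lower-adjacent and upper-adjacent couplings and the kernel of the Hodge Laplacian, respectively, and can be tuned independently by our filter designs. Third, we study different procedures for designing simplicial convolutional filters and discuss their relative advantages. Finally, we corroborate our simplicial filters in several applications: extracting different frequency components of a simplicial signal, denoising edge flows, and analyzing financial markets and traffic networks.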
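We study p-Laplacians and spectral clustering for a recently proposed hypergraph model that incorporates edge-dependent vertex weights (EDVW). These weights can reflect the different importance of vertices within a hyperedge, thus giving the hypergraph model higher expressivity and flexibility. By constructing submodular EDVW-based splitting functions, we convert hypergraphs with EDVW into submodular hypergraphs, for which the spectral theory is better developed. In this way, existing concepts and theorems, such as p-Laplacians and Cheeger inequalities proposed under the submodular hypergraph setting, can be directly extended to hypergraphs with EDVW. For submodular hypergraphs with EDVW-based splitting functions, we propose an efficient algorithm to compute the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian. We then use this eigenvector to cluster the vertices, achieving higher clustering accuracy than traditional spectral clustering based on the 2-Laplacian. More broadly, the proposed algorithm works for all submodular hypergraphs that are graph reducible. Numerical experiments using real-world data demonstrate the effectiveness of combining spectral clustering based on the 1-Laplacian with EDVW.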
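As a powerful tool for modeling complex relationships, hypergraphs are gaining popularity in the graph learning community. However, commonly used frameworks in deep hypergraph learning focus on hypergraphs with edge-independent vertex weights (EIVWs), without considering hypergraphs with edge-dependent vertex weights (EDVWs), which have more modeling power. To compensate for this, we present General Hypergraph Spectral Convolution (GHSC), a general learning framework that not only handles both EDVW and EIVW hypergraphs but, more importantly, makes it theoretically possible to explicitly use existing powerful graph convolutional neural networks (GCNNs), which greatly eases the design of hypergraph neural networks. In this framework, the graph Laplacian of the given undirected GCNN is replaced with a unified hypergraph Laplacian that incorporates vertex weight information from a random walk perspective, by equating our generalized hypergraphs with simple undirected graphs. Extensive experiments from various domains, including social network analysis, visual object classification, and protein learning, demonstrate the state-of-the-art performance of the proposed framework.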