Convolutional Neural Networks (CNNs) achieve impressive performance in a wide variety of fields. Their success benefited from a massive boost when very deep CNN models were able to be reliably trained. Despite their merits, CNNs fail to properly address problems with non-Euclidean data. To overcome this challenge, Graph Convolutional Networks (GCNs) build graphs to represent non-Euclidean data, borrow concepts from CNNs, and apply them in training. GCNs show promising results, but they are usually limited to very shallow models due to the vanishing gradient problem (see Figure 1). As a result, most state-of-the-art GCN models are no deeper than 3 or 4 layers. In this work, we present new ways to successfully train very deep GCNs. We do this by borrowing concepts from CNNs, specifically residual/dense connections and dilated convolutions, and adapting them to GCN architectures. Extensive experiments show the positive effect of these deep GCN frameworks. Finally, we use these new concepts to build a very deep 56-layer GCN, and show how it significantly boosts performance (+3.7% mIoU over state-of-the-art) in the task of point cloud semantic segmentation. We believe that the community can greatly benefit from this work, as it opens up many opportunities for advancing GCN-based research.
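As a rough illustration of two of the ideas named in this abstract, dilated neighborhoods and residual connections, the sketch below selects dilated k-nearest neighbors and applies a toy graph layer with a skip connection. The function names, the channel-wise max aggregation, and the scalar `weight` are assumptions made for this example, not the authors' implementation.

```python
import math


def dilated_knn(points, k, dilation):
    """Return, for each point, k neighbor indices chosen with a dilation rate.

    The k * dilation nearest neighbors are ranked by Euclidean distance and
    every `dilation`-th one is kept, which widens the receptive field
    without increasing the number of neighbors.
    """
    neighbors = []
    for i, p in enumerate(points):
        ranked = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(ranked[: k * dilation : dilation])
    return neighbors


def residual_layer(features, neighbors, weight):
    """Toy graph layer (hypothetical): channel-wise max over neighbors,
    scaled by `weight`, plus a skip connection."""
    out = []
    for f, nbrs in zip(features, neighbors):
        agg = [max(features[j][c] for j in nbrs) for c in range(len(f))]
        # Residual connection: the layer input is added back to its output.
        out.append([x + weight * a for x, a in zip(f, agg)])
    return out


# Five points on a line, one feature channel each.
points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
features = [[1.0], [2.0], [3.0], [4.0], [5.0]]
nbrs = dilated_knn(points, k=2, dilation=2)
deeper = residual_layer(features, nbrs, weight=0.5)
```

With `dilation=1` the selection reduces to ordinary k-NN; the skip connection is what lets such layers be stacked deeply without the signal from earlier layers being lost.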
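Convolution on 3D point clouds has been widely studied but remains far from perfect in geometric deep learning. The conventional wisdom of convolution treats feature correspondences among 3D points as indistinguishable, an intrinsic limitation that leads to poorly distinctive feature learning. In this paper, we propose Adaptive Graph Convolution (AGConv) for wide applications of point cloud analysis. AGConv generates adaptive kernels for points according to their dynamically learned features. Compared with solutions that use fixed/isotropic kernels, AGConv improves the flexibility of point cloud convolutions, effectively and precisely capturing the diverse relations between points from different semantic parts. Unlike popular attentional weight schemes, AGConv implements the adaptiveness inside the convolution operation instead of simply assigning different weights to the neighboring points. Extensive evaluations clearly show that our method outperforms state-of-the-art methods for point cloud classification and segmentation on various benchmark datasets. Meanwhile, AGConv can flexibly serve more point cloud analysis approaches to boost their performance. To validate its flexibility and effectiveness, we explore AGConv-based paradigms of completion, denoising, upsampling, registration and circle extraction, which are comparable or even superior to their competitors. Our code is available at https://github.com/hrzhou2/adaptconv-master.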
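Semantic segmentation of point clouds produces a comprehensive understanding of scenes by densely predicting the category of each point. Because of the uniformity of the receptive field, expressing multi-receptive-field features remains challenging for point cloud semantic segmentation, which leads to the misclassification of instances with similar spatial structures. In this paper, we propose DGFA-Net, a graph convolutional network rooted in dilated graph feature aggregation (DGFA) and guided by a multi-basis aggregation loss (MALoss) computed through a pyramid decoder. To configure multi-receptive-field features, DGFA, which takes the proposed dilated graph convolution (DGConv) as its basic building block, is designed to aggregate multi-scale feature representations by capturing dilated graphs with various receptive regions. By simultaneously penalizing the receptive field information with point sets of different resolutions as calculation bases, we introduce a pyramid decoder driven by MALoss to capture the diversity of receptive fields. Combining these two aspects, DGFA-Net significantly improves segmentation performance on instances with similar spatial structures. Experiments on S3DIS, ShapeNetPart and Toronto-3D show that DGFA-Net outperforms the baseline approach and achieves new state-of-the-art segmentation performance.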
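3D point clouds have attracted increasing attention in architecture, engineering and construction because of their high-quality object representation and efficient acquisition methods. Consequently, many point cloud feature detection methods have been proposed in the literature to automate some workflows, such as classification or part segmentation. Nevertheless, the performance of automated point cloud systems lags significantly behind that of their image counterparts. Part of this failure stems from the irregularity, unstructuredness and disorder of point clouds, which make point cloud feature detection more challenging than its image equivalent, but we argue that a lack of inspiration from the image domain may be the primary cause of this gap. Indeed, given the overwhelming success of Convolutional Neural Networks (CNNs) in image feature detection, it seems reasonable to design their point cloud counterparts, yet none of the proposed approaches closely resembles them. Specifically, even though many approaches generalize the convolution operation to point clouds, they fail to emulate the multiple-feature detection and pooling operations of CNNs. For this reason, we propose a graph convolution-based unit, dubbed the Shrinking unit, that can be stacked vertically and horizontally to design CNN-like 3D point cloud feature extractors. Given that self, local and global correlations between points in a point cloud convey crucial spatial geometric information, we also leverage them during feature extraction. We evaluate our proposal by designing a feature extractor model for the ModelNet-10 benchmark dataset and achieve 90.64% classification accuracy, demonstrating that our idea is effective. Our code is available at github.com/albertotamajo/shrinking-unit.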
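Classification of airborne laser scanning (ALS) point clouds is a critical task in remote sensing and photogrammetry. Although recent deep-learning-based methods have achieved satisfactory performance, they ignore the uniformity of the receptive field, which leaves ALS point cloud classification challenging when distinguishing areas with complex structures and extreme scale variations. In this paper, with the aim of configuring multi-receptive-field features, we propose a novel receptive field fusion-and-stratification network (RFFS-Net). With a novel dilated graph convolution (DGConv) and its extension, annular dilated convolution (ADConv), as basic building blocks, the receptive field fusion process is implemented by the dilated and annular graph fusion (DAGFusion) module, which obtains multi-receptive-field feature representations by capturing dilated and annular graphs with various receptive regions. The stratification of the receptive fields, with point sets of different resolutions as calculation bases, is performed by multi-level decoders nested in RFFS-Net and driven by the multi-level receptive field aggregation loss (MRFALoss), which pushes the network to learn in the direction of supervision labels with different resolutions. With receptive field fusion-and-stratification, RFFS-Net adapts better to the classification of regions with complex structures and extreme scale variations in large-scale ALS point clouds. Evaluated on the ISPRS Vaihingen 3D dataset, RFFS-Net significantly outperforms the baseline approach, by 5.3% on mF1 and with a further gain on mIoU, reaching an overall accuracy of 82.1%, an mF1 of 71.6%, and an mIoU of 58.2%. Furthermore, experiments on the LASDU dataset and the 2019 IEEE-GRSS Data Fusion Contest dataset show that RFFS-Net achieves new state-of-the-art classification performance.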
A number of problems can be formulated as prediction on graph-structured data. In this work, we generalize the convolution operator from regular grids to arbitrary graphs while avoiding the spectral domain, which allows us to handle graphs of varying size and connectivity. To move beyond a simple diffusion, filter weights are conditioned on the specific edge labels in the neighborhood of a vertex. Together with the proper choice of graph coarsening, we explore constructing deep neural networks for graph classification. In particular, we demonstrate the generality of our formulation in point cloud classification, where we set the new state of the art, and on a graph classification dataset, where we outperform other deep learning approaches. The source code is available at https://github.com/mys007/ecc.
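The central idea of this abstract, filter weights conditioned on edge labels, can be sketched as follows. This is a minimal illustration under stated assumptions: `edge_conditioned_conv` and `toy_filter_net` are hypothetical names, the filter-generating function here is a hand-written stand-in for a learned network, and averaging over incoming edges is one plausible normalization rather than a confirmed detail of the released code.

```python
def edge_conditioned_conv(node_feats, edges, filter_net, bias):
    """Aggregate neighbor features with weights generated per edge.

    edges maps a node index i to a list of (j, edge_label) pairs, one per
    incoming edge j -> i. filter_net(edge_label) returns a d_out x d_in
    weight matrix as a list of rows.
    """
    out = []
    for i in range(len(node_feats)):
        incoming = edges.get(i, [])
        acc = [0.0] * len(bias)
        for j, label in incoming:
            theta = filter_net(label)  # weights depend on this edge's label
            for o, row in enumerate(theta):
                acc[o] += sum(w * x for w, x in zip(row, node_feats[j]))
        n = max(len(incoming), 1)  # avoid division by zero for isolated nodes
        out.append([a / n + b for a, b in zip(acc, bias)])
    return out


def toy_filter_net(label):
    """Hypothetical stand-in for a learned network: scaled 2x2 identity."""
    s = label[0]
    return [[s, 0.0], [0.0, s]]


node_feats = [[1.0, 2.0], [3.0, 4.0]]
edges = {0: [(1, [2.0])], 1: [(0, [1.0]), (1, [3.0])]}
out = edge_conditioned_conv(node_feats, edges, toy_filter_net, bias=[0.0, 1.0])
```

Because the weight matrix is recomputed for every edge, two neighbors with identical features can contribute differently when their edge labels differ, which is what separates this from plain diffusion.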
Standard convolution is inherently limited for semantic segmentation of point clouds due to its isotropy about features. It neglects the structure of an object, resulting in poor object delineation and small spurious regions in the segmentation result. This paper proposes a novel graph attention convolution (GAC), whose kernels can be dynamically carved into specific shapes to adapt to the structure of an object. Specifically, by assigning proper attentional weights to different neighboring points, GAC is designed to selectively focus on the most relevant part of them according to their dynamically learned features. The shape of the convolution kernel is then determined by the learned distribution of the attentional weights. Though simple, GAC can capture the structured features of point clouds for fine-grained segmentation and avoid feature contamination between objects. Theoretically, we provide a thorough analysis of the expressive capabilities of GAC to show how it can learn about the features of point clouds. Empirically, we evaluate the proposed GAC on challenging indoor and outdoor datasets and achieve state-of-the-art results in both scenarios.
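The attentional weighting of neighbors described in this abstract can be illustrated with a small sketch. It is a simplified assumption, not the paper's formulation: a single score per neighbor is computed as a dot product between a hypothetical learned vector `score_vec` and the feature difference to the center point, and a softmax turns the scores into weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(center, neighbor_feats, score_vec):
    """Attention-weighted sum of neighbor features.

    The score of neighbor j is score_vec . (h_j - h_center); the weights
    are a softmax of the scores, so they are positive and sum to one.
    """
    scores = [
        sum(w * (hj - hc) for w, hj, hc in zip(score_vec, h, center))
        for h in neighbor_feats
    ]
    weights = softmax(scores)
    out = [
        sum(w * h[c] for w, h in zip(weights, neighbor_feats))
        for c in range(len(center))
    ]
    return out, weights


center = [0.0, 0.0]
neighbor_feats = [[1.0, 0.0], [3.0, 2.0]]
out, weights = attention_aggregate(center, neighbor_feats, score_vec=[1.0, 0.0])
```

With an all-zero `score_vec` the weights become uniform and the operation reduces to a plain mean, which is the isotropic behavior the abstract argues against.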
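We present CPT: Convolutional Point Transformer, a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data. CPT is an improvement over existing attention-based convolutional neural networks as well as previous 3D point cloud processing transformers. It achieves this by creating a novel and robust attention-based point set embedding through a convolutional projection layer crafted for processing dynamically local point set neighborhoods. The resulting point set embedding is robust to permutations of the input points. Our novel CPT block builds on local neighborhoods of points obtained via a dynamic graph computation at each layer of the network. It is fully differentiable and can be stacked just like convolutional layers to learn global properties of the points. We evaluate our model on standard benchmark datasets such as ModelNet40, ShapeNet Part Segmentation, and the S3DIS 3D indoor scene semantic segmentation dataset to show that it can serve as an effective backbone for various point cloud processing tasks when compared with existing state-of-the-art approaches.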
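Geometric feature learning for 3D meshes is central to computer graphics and highly important for numerous vision applications. However, deep learning currently lags in hierarchical modeling of heterogeneous 3D meshes due to the lack of required operations and/or their efficient implementations. In this paper, we propose a series of modular operations for effective geometric deep learning over heterogeneous 3D meshes. These operations include mesh convolutions, (un)pooling, and efficient mesh decimation. We provide an open-source implementation of these operations, collectively termed Picasso. The mesh decimation module of Picasso is GPU-accelerated and can process a batch of meshes on the fly for deep learning. Our (un)pooling operations compute features for newly created neurons across network layers of varying resolution. Our mesh convolutions include facet2vertex, vertex2facet, and facet2facet convolutions, which exploit vMF mixtures and barycentric interpolation to incorporate fuzzy modeling. Leveraging the modular operations of Picasso, we contribute a novel hierarchical neural network, PicassoNet-II, to learn highly discriminative features from 3D meshes. PicassoNet-II accepts primitive geometrics and fine textures of mesh facets as input features, while processing full scene meshes. Our network achieves competitive performance for shape analysis and scene parsing on a variety of benchmarks. We release Picasso and PicassoNet-II on GitHub at https://github.com/enyahermite/picasso.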
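Learning intra-region contexts and inter-region relations are two effective strategies for strengthening feature representations in point cloud analysis. However, unifying the two strategies for point cloud representation is not fully emphasized in existing methods. To this end, we propose a novel framework named Point Relation-Aware Network (PRA-Net), which is composed of an Intra-region Structure Learning (ISL) module and an Inter-region Relation Learning (IRL) module. The ISL module dynamically integrates local structural information into point features, while the IRL module captures inter-region relations adaptively and efficiently via a differentiable region partition scheme and a representative point-based strategy. Extensive experiments on several 3D benchmarks covering shape classification, keypoint estimation, and part segmentation have verified the effectiveness and generalization ability of PRA-Net. Code will be available at https://github.com/xiwuchen/pra-net.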
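Modeling local surface geometry is challenging in 3D point cloud understanding due to the lack of connectivity information. Most prior works model local geometry using various convolution operations. We observe that convolution can be equivalently decomposed as a weighted combination of a local and a global component. With this observation, we explicitly decouple these two components so that the local one can be enhanced to facilitate the learning of local surface geometry. Specifically, we propose the Laplacian Unit (LU), a simple yet effective architectural unit that enhances the learning of local geometry. Extensive experiments demonstrate that networks equipped with LUs achieve competitive or superior performance on typical point cloud understanding tasks. Moreover, by establishing connections with mean curvature flow, we further investigate LU based on curvatures to interpret its adaptive smoothing and sharpening effect. The code will be made available.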
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
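Standard spatial convolutions assume input data with a regular neighborhood structure. Existing methods typically generalize convolution to the irregular point cloud domain by fixing a regular "view", for example a fixed neighborhood size, so that the convolution kernel size remains the same for each point. However, since point clouds are not as structured as images, a fixed neighbor number gives an unfortunate inductive bias. We present a novel graph convolution named difference graph convolution (diffConv), which does not rely on a regular view. diffConv operates on spatially-varying and density-dilated neighborhoods, which are further adapted by a learned masked attention mechanism. We validate our model on the ModelNet40 point cloud classification benchmark, obtaining state-of-the-art performance and greater robustness to noise, along with faster inference speed.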
Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has been thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. one that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPConv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv.
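To make the phrase "weights located in Euclidean space by kernel points" concrete, here is a minimal sketch of a rigid kernel point convolution evaluated at one center. It assumes a linear influence function that decays to zero at distance `sigma` from each kernel point; the function names and the tiny example values are hypothetical, and the deformable variant is not shown.

```python
import math


def kernel_influence(offset, kernel_point, sigma):
    """Linear correlation: 1 at the kernel point, 0 at distance >= sigma."""
    return max(0.0, 1.0 - math.dist(offset, kernel_point) / sigma)


def kpconv_at(center, neighbors, feats, kernel_points, kernel_weights, sigma):
    """Convolve the neighbor features around one center point.

    kernel_weights[k] is a d_in x d_out matrix attached to kernel_points[k].
    Each neighbor's feature is multiplied by every weight matrix, scaled by
    how close the neighbor's offset lies to the matching kernel point.
    """
    d_out = len(kernel_weights[0][0])
    out = [0.0] * d_out
    for p, f in zip(neighbors, feats):
        offset = [a - b for a, b in zip(p, center)]
        for x_k, w_k in zip(kernel_points, kernel_weights):
            h = kernel_influence(offset, x_k, sigma)
            if h == 0.0:
                continue  # this kernel point does not reach the neighbor
            for i, f_i in enumerate(f):
                for o in range(d_out):
                    out[o] += h * f_i * w_k[i][o]
    return out


# One neighbor with a single input channel, two kernel points.
kernel_points = [(1.0, 0.0), (0.0, 0.0)]
kernel_weights = [[[3.0]], [[10.0]]]
result = kpconv_at((0.0, 0.0), [(1.0, 0.0)], [[2.0]],
                   kernel_points, kernel_weights, sigma=1.0)
```

Enlarging `sigma` lets more kernel points influence each neighbor, so the same input produces a response blended across several weight matrices.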
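In this paper, we propose a comprehensive point cloud semantic segmentation network that aggregates both local and global multi-scale information. First, we propose an angle correlation point convolution (ACPConv) module to effectively learn the local shapes of points. Second, based on ACPConv, we introduce a local multi-scale split (MSS) block that hierarchically connects features within a single block and gradually enlarges the receptive field, which is beneficial for exploiting local context. Third, inspired by HRNet, which performs excellently on 2D image vision tasks, we build an HRNet customized for point clouds to learn global multi-scale context. Lastly, we introduce a point-wise attention fusion approach that fuses multi-resolution predictions and further improves point cloud semantic segmentation performance. Our experimental results and ablations on several benchmark datasets show that the proposed method is effective and achieves state-of-the-art performance compared with existing methods.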
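With the proliferation of Lidar sensors and 3D vision cameras, 3D point cloud analysis has attracted significant attention in recent years. After the success of the pioneering work PointNet, deep-learning-based methods have been increasingly applied to various tasks, including 3D point cloud segmentation and 3D object classification. In this paper, we propose a novel 3D point cloud learning network, referred to as Dynamic Point Feature Aggregation Network (DPFA-Net), which selectively performs neighborhood feature aggregation with dynamic pooling and an attention mechanism. DPFA-Net has two variants for semantic segmentation and classification of 3D point clouds. As the core module of DPFA-Net, we propose a Feature Aggregation layer, in which features of the dynamic neighborhood of each point are aggregated via a self-attention mechanism. In contrast to other segmentation models, which aggregate features from fixed neighborhoods, our approach can aggregate features from different neighbors in different layers, providing a more selective and broader view to the query points and focusing more on the relevant features in a local neighborhood. In addition, to further improve the performance of the proposed semantic segmentation model, we present two novel approaches, namely Two-Stage BF-Net and BF-Regularization, to exploit background-foreground information. Experimental results show that the proposed DPFA-Net achieves the state-of-the-art overall accuracy score for semantic segmentation on the S3DIS dataset, and provides consistently satisfactory performance across semantic segmentation, part segmentation, and 3D object classification. It is also computationally more efficient than other methods.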
Standard convolutional neural networks assume a grid-structured input is available and exploit discrete convolutions as their fundamental building blocks. This limits their applicability to many real-world applications. In this paper we propose Parametric Continuous Convolution, a new learnable operator that operates over non-grid structured data. The key idea is to exploit parameterized kernel functions that span the full continuous vector space. This generalization allows us to learn over arbitrary data structures as long as their support relationship is computable. Our experiments show significant improvement over the state-of-the-art in point cloud segmentation of indoor and outdoor scenes, and lidar motion estimation of driving scenes.
Raw point cloud data inevitably contains outliers or noise through acquisition from 3D sensors or reconstruction algorithms. In this paper, we present a novel end-to-end network for robust point cloud processing, named PointASNL, which can deal with noisy point clouds effectively. The key component in our approach is the adaptive sampling (AS) module. It first re-weights the neighbors around the initial sampled points from farthest point sampling (FPS), and then adaptively adjusts the sampled points beyond the entire point cloud. Our AS module not only benefits the feature learning of point clouds, but also eases the biased effect of outliers. To further capture the neighbor and long-range dependencies of the sampled points, we propose a local-nonlocal (L-NL) module inspired by the nonlocal operation. Such an L-NL module makes the learning process insensitive to noise. Extensive experiments verify the robustness and superiority of our approach in point cloud processing tasks on synthetic data, indoor data, and outdoor data, with or without noise. Specifically, PointASNL achieves state-of-the-art robust performance for classification and segmentation tasks on all datasets, and significantly outperforms previous methods on the real-world outdoor SemanticKITTI dataset with considerable noise. Our code is released through https://github.com/yanx27/PointASNL.
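The farthest point sampling (FPS) step that this abstract uses to pick initial sample points can be sketched as below. This covers only the standard greedy FPS procedure; the adaptive re-weighting and adjustment performed by the AS module are not reproduced, and the function name and `start` parameter are choices made for this example.

```python
import math


def farthest_point_sampling(points, num_samples, start=0):
    """Greedily pick indices so each new point is farthest from those chosen.

    min_dist[i] holds the distance from point i to its nearest selected
    point; the next sample is the index that maximizes this value.
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")
    selected = [start]
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < num_samples:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        # Update each point's distance to the closest selected point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
sampled = farthest_point_sampling(points, num_samples=3)
```

Because FPS always jumps to the most distant remaining point, isolated outliers tend to be selected early, which is one way to read the abstract's motivation for adjusting the sampled points afterwards.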