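Learning-based 3D shape segmentation is usually formulated as a semantic labeling problem, assuming that all parts of the training shapes are annotated with a given set of labels. This assumption, however, is impractical for learning fine-grained segmentation. Although most off-the-shelf CAD models are, by construction, composed of fine-grained parts, they usually lack semantic labels, and labeling those fine-grained parts is extremely tedious. We approach the problem with deep clustering, where the key idea is to learn part priors from a shape dataset with fine-grained segmentation but no part labels. Given point-sampled 3D shapes, we model the clustering priors of points with a similarity matrix and achieve part segmentation by minimizing a novel low-rank loss. To handle very densely sampled point sets, we adopt a divide-and-conquer strategy: we partition the large point set into a number of blocks, and each block is segmented using a deep-clustering-based part prior network trained in a category-agnostic manner. We then train a graph convolutional network to merge the segments of all blocks into the final segmentation result. Our method is evaluated on a challenging benchmark of fine-grained segmentation, showing state-of-the-art performance.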
translated by Google Translate
We introduce Similarity Group Proposal Network (SGPN), a simple and intuitive deep learning framework for 3D object instance segmentation on point clouds. SGPN uses a single network to predict point grouping proposals and a corresponding semantic class for each proposal, from which we can directly extract instance segmentation results. Important to the effectiveness of SGPN is its novel representation of 3D instance segmentation results in the form of a similarity matrix that indicates the similarity between each pair of points in embedded feature space, thus producing an accurate grouping proposal for each point. Experimental results on various 3D scenes show the effectiveness of our method on 3D instance segmentation, and we also evaluate the capability of SGPN to improve 3D object detection and semantic segmentation results. We also demonstrate its flexibility by seamlessly incorporating 2D CNN features into the framework to boost performance.
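The similarity-matrix representation described above can be illustrated with a small sketch. It assumes every point already has an embedding vector; the toy embeddings, the threshold value, and the function names are hypothetical and are not taken from SGPN itself. Entry (i, j) holds the squared Euclidean distance between the embeddings of points i and j, and thresholding row i yields a grouping proposal for point i.

```python
def similarity_matrix(features):
    """Pairwise squared Euclidean distances between per-point embeddings."""
    n = len(features)
    return [
        [sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
         for j in range(n)]
        for i in range(n)
    ]


def group_proposals(sim, threshold):
    """Row i proposes the set of points whose distance to point i is below threshold."""
    return [
        frozenset(j for j, d in enumerate(row) if d < threshold)
        for row in sim
    ]


# Toy embeddings (made up for illustration): two tight clusters.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
sim = similarity_matrix(features)
proposals = group_proposals(sim, threshold=1.0)

# Identical proposals collapse into one candidate instance each.
unique_groups = sorted(set(proposals), key=min)
```

In this toy case the four per-point proposals collapse into two groups. A real pipeline would also need to score and merge overlapping proposals, which this sketch leaves out.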
Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominant technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy because of the unique challenges of processing point clouds with deep neural networks. Recently, deep learning on point clouds has been thriving, with numerous methods proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks: 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective. Empirically, it shows strong performance on par or even better than state of the art. Theoretically, we provide analysis towards understanding of what the network has learnt and why the network is robust with respect to input perturbation and corruption.
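The permutation invariance mentioned above can be sketched in a few lines. This is only an illustration of the general idea (one shared per-point layer followed by a symmetric max pooling), not the PointNet architecture; the layer sizes, random weights, and function names are assumptions made for the example.

```python
import random


def shared_mlp(point, weights, bias):
    """Apply the same linear layer + ReLU to a single point (weights: out x in)."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, point)) + b)
        for row, b in zip(weights, bias)
    ]


def global_feature(points, weights, bias):
    """Channel-wise max over per-point features: a symmetric function of the set."""
    per_point = [shared_mlp(p, weights, bias) for p in points]
    return [max(channel) for channel in zip(*per_point)]


rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]   # 8 points, xyz
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 4 output channels
bias = [0.0] * 4

# Reordering the input points must not change the pooled feature.
shuffled = points[:]
rng.shuffle(shuffled)
same = global_feature(points, weights, bias) == global_feature(shuffled, weights, bias)
```

Because every point passes through the same weights and the maximum ignores ordering, `same` holds for any permutation of the input.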
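Most existing point cloud instance and semantic segmentation methods rely heavily on strong supervision signals, which require point-level labels for every point in the scene. However, such strong supervision incurs large annotation costs, which motivates the study of efficient annotation. In this paper, we find that the locations of instances matter for both instance and semantic 3D scene segmentation. By taking full advantage of locations, we design a weakly supervised point cloud segmentation method that only requires clicking one point per instance to indicate its location for annotation. Using over-segmentation as pre-processing, we extend these location annotations into segments as seg-level labels. We further design a segment grouping network (SegGroup) to generate point-level pseudo labels under seg-level labels by grouping the unlabeled segments into the relevant nearby labeled segments, so that existing point-level supervised segmentation models can directly consume these pseudo labels for training. Experimental results show that our seg-level supervised method (SegGroup) achieves results comparable to fully annotated point-level supervised methods. Moreover, it outperforms recent weakly supervised methods given a fixed annotation budget.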
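Convolution on 3D point clouds has been widely researched but is far from perfect in geometric deep learning. The traditional wisdom of convolution characterizes feature correspondences indistinguishably among 3D points, which is an intrinsic limitation that leads to poorly distinctive feature learning. In this paper, we propose Adaptive Graph Convolution (AGConv) for wide applications of point cloud analysis. AGConv generates adaptive kernels for points according to their dynamically learned features. Compared with solutions that use fixed/isotropic kernels, AGConv improves the flexibility of point cloud convolutions, effectively and precisely capturing the diverse relations between points from different semantic parts. Unlike popular attentional weight schemes, AGConv implements the adaptiveness inside the convolution operation instead of simply assigning different weights to the neighboring points. Extensive evaluations clearly show that our method outperforms state-of-the-art methods for point cloud classification and segmentation on various benchmark datasets. Meanwhile, AGConv can flexibly serve more point cloud analysis approaches to boost their performance. To validate its flexibility and effectiveness, we explore AGConv-based paradigms for completion, denoising, upsampling, registration, and circle extraction, which are comparable or even superior to their competitors. Our code is available at https://github.com/hrzhou2/adaptconv-master.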
We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information. Our dataset consists of 573,585 part instances over 26,671 3D models covering 24 object categories. This dataset enables and serves as a catalyst for many tasks such as shape analysis, dynamic 3D scene modeling and simulation, affordance analysis, and others. Using our dataset, we establish three benchmarking tasks for evaluating 3D part recognition: fine-grained semantic segmentation, hierarchical semantic segmentation, and instance segmentation. We benchmark four state-of-the-art 3D deep learning algorithms for fine-grained semantic segmentation and three baseline methods for hierarchical semantic segmentation. We also propose a novel method for part instance segmentation and demonstrate its superior performance over existing methods.
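We present SHRED, a method for 3D shape region decomposition. SHRED takes a 3D point cloud as input and uses learned local operations to produce a segmentation that approximates fine-grained part instances. We endow SHRED with three decomposition operations: splitting regions, fixing the boundaries between regions, and merging regions together. The modules are trained independently and locally, allowing SHRED to generate high-quality segmentations for categories not seen during training. We train and evaluate SHRED with fine-grained segmentations from PartNet; using its merge-threshold hyperparameter, we show that at any desired decomposition granularity SHRED produces segmentations that respect ground-truth annotations better than baseline methods. Finally, we demonstrate that SHRED is useful for downstream applications, outperforming all baselines on zero-shot fine-grained part instance segmentation and, when combined with methods that learn to label shape regions, on few-shot fine-grained semantic segmentation.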
We introduce a novel deep learning-based framework to interpret 3D urban scenes represented as textured meshes. Based on the observation that object boundaries typically align with the boundaries of planar regions, our framework achieves semantic segmentation in two steps: planarity-sensible over-segmentation followed by semantic classification. The over-segmentation step generates an initial set of mesh segments that capture the planar and non-planar regions of urban scenes. In the subsequent classification step, we construct a graph that encodes the geometric and photometric features of the segments in its nodes and the multi-scale contextual features in its edges. The final semantic segmentation is obtained by classifying the segments using a graph convolutional network. Experiments and comparisons on two semantic urban mesh benchmarks demonstrate that our approach outperforms the state-of-the-art methods in terms of boundary quality, mean IoU (intersection over union), and generalization ability. We also introduce several new metrics for evaluating mesh over-segmentation methods dedicated to semantic segmentation, and our proposed over-segmentation approach outperforms state-of-the-art methods on all metrics. Our source code is available at https://github.com/WeixiaoGao/PSSNet.
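For readers unfamiliar with the mean IoU metric cited above, the following is a minimal sketch of how it is commonly computed from per-element labels. The convention of skipping classes that appear in neither the prediction nor the ground truth is an assumption (benchmarks differ on this), and the label lists are made up for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: classes absent from both are ignored
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels (hypothetical): six elements, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is approximately 0.5.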
We propose a novel deep learning-based framework to tackle the challenge of semantic segmentation of large-scale point clouds of millions of points. We argue that the organization of 3D point clouds can be efficiently captured by a structure called superpoint graph (SPG), derived from a partition of the scanned scene into geometrically homogeneous elements. SPGs offer a compact yet rich representation of contextual relationships between object parts, which is then exploited by a graph convolutional network. Our framework sets a new state of the art for segmenting outdoor LiDAR scans (+11.9 and +8.8 mIoU points for both Semantic3D test sets), as well as indoor scans (+12.4 mIoU points for the S3DIS dataset).
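We introduce PartGlot, a neural framework and associated architectures for learning semantic part segmentation of 3D shape geometry based solely on part-referential language. We exploit the fact that linguistic descriptions of a shape can provide priors on the shape's parts, since natural language has evolved to reflect human perception of the compositional structure of objects, which is essential to their recognition and use. For training, we use the paired geometry/language data collected in the ShapeGlot work for its reference game, in which a speaker creates an utterance to differentiate a target shape from two distractors and a listener has to find the target based on this utterance. Our network is designed to solve this target discrimination problem, carefully incorporating a Transformer-based attention module so that the output attention can precisely highlight the semantic part or parts described in the language. Furthermore, the network operates without any direct supervision on the 3D geometry itself. Surprisingly, we further demonstrate that the learned part information generalizes to shape classes unseen during training. Our approach opens the possibility of learning 3D shape parts from language alone, without the need for large-scale part geometry annotations, thus facilitating annotation acquisition.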
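Fine-grained geometry, captured by aggregating point features in local regions, is crucial for object recognition and scene understanding in point clouds. However, existing leading point cloud backbones usually use max/average pooling for local feature aggregation, which largely ignores the positional distribution of points and leads to inadequate assembling of fine-grained structures. To mitigate this bottleneck, we present an efficient alternative, PAPooling, which explicitly models the spatial relations among local points using a novel graph representation and aggregates features in a position-adaptive manner, enabling position-sensitive representation of the aggregated features. Specifically, PAPooling consists of two key steps, graph construction and feature aggregation. The first constructs a graph whose edges link the center point with every neighboring point in a local region, in order to map their relative positional information to channel-wise attentive weights. The second adaptively aggregates local point features based on the generated weights through a graph convolutional network (GCN). PAPooling is simple yet effective, and flexible enough to be used as a plug-and-play operator in different popular backbones such as PointNet++ and DGCNN. Extensive experiments on various tasks, ranging from 3D shape classification and part segmentation to scene segmentation, demonstrate that PAPooling can significantly improve predictive accuracy with minimal extra computational overhead. Code will be released.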
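We propose an approach to instance segmentation for 3D point clouds based on dynamic convolution. This enables it to adapt, at inference time, to varying feature and object scales. Doing so avoids some pitfalls of bottom-up approaches, including a dependence on hyperparameter tuning and heuristic post-processing pipelines to compensate for the inevitable variability in object sizes, even within a single scene. The representation capability of the network is greatly improved by gathering homogeneous points that have identical semantic categories and close votes for the geometric centroids. Instances are then decoded via several simple convolution layers, whose parameters are generated conditioned on the input. The proposed approach is proposal-free and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance. A lightweight transformer, built on the bottleneck layer, allows the model to capture long-range dependencies with limited computational overhead. The result is a simple, efficient, and robust approach that yields strong performance on various datasets: ScanNetV2, S3DIS, and PartNet. The consistent improvements on both voxel- and point-based architectures indicate the effectiveness of the proposed method. Code is available at: https://git.io/dyco3d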
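We propose the Neurally-Guided Shape Parser (NGSP), a method that learns how to assign fine-grained semantic labels to regions of a 3D shape. NGSP solves this problem via MAP inference, modeling the posterior probability of a label assignment conditioned on an input shape with a learned likelihood function. To make this search tractable, NGSP employs a neural guide network that learns to approximate the posterior. NGSP finds high-probability label assignments by first sampling proposals with the guide network and then evaluating each proposal under the full likelihood. We evaluate NGSP on the task of fine-grained semantic segmentation of manufactured 3D shapes from PartNet, where shapes have been decomposed into regions that correspond to part instance over-segmentations. We find that NGSP delivers significant performance improvements over comparison methods that (i) use regions to group per-point predictions, (ii) use regions as a self-supervisory signal, or (iii) assign labels to regions under alternative formulations. Further, we show that NGSP maintains strong performance even with limited labeled data or when shape regions undergo artificial corruption. Finally, we demonstrate that NGSP can be directly applied to CAD shapes found in online repositories and validate its effectiveness with a perceptual study.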
Generalizable 3D part segmentation is important but challenging in vision and robotics. Training deep models via conventional supervised methods requires large-scale 3D datasets with fine-grained part annotations, which are costly to collect. This paper explores an alternative way for low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP, which achieves superior performance on open-vocabulary 2D detection. We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud rendering and a novel 2D-to-3D label lifting algorithm. We also utilize multi-view 3D priors and few-shot prompt tuning to boost performance significantly. Extensive evaluation on PartNet and PartNet-Mobility datasets shows that our method enables excellent zero-shot 3D part segmentation. Our few-shot version not only outperforms existing few-shot approaches by a large margin but also achieves highly competitive results compared to the fully supervised counterpart. Furthermore, we demonstrate that our method can be directly applied to iPhone-scanned point clouds without significant domain gaps.
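How would you repair a physical object with large missing regions? You might first recover its global but coarse shape and then increase its local details step by step. We are motivated to imitate this physical repair procedure to address the point cloud completion task. We propose a novel stepwise point cloud completion network (SPCNet) for various 3D models with large missing regions. SPCNet has a hierarchical bottom-up network architecture. It performs shape completion iteratively: 1) it first infers the global feature of the coarse result; 2) it then infers the local feature with the aid of the global feature; and 3) it finally infers the detailed result with the help of the local feature and the coarse result. Beyond simulating the wisdom of physical repair, we design a new training strategy based on a cycle loss to enhance the generalization and robustness of SPCNet. Extensive experiments clearly show the superiority of SPCNet over state-of-the-art methods on 3D point clouds with large missing regions.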
Unlike images which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and density functions through kernel density estimation. The most important contribution of this work is a novel reformulation proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. Besides, PointConv can also be used as deconvolution operators to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art on challenging semantic segmentation benchmarks on 3D point clouds. Besides, our experiments converting CIFAR-10 into a point cloud showed that networks built on PointConv can match the performance of convolutional networks in 2D images of a similar structure.
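Registering urban point clouds is a challenging task because of the large scale, noise, and incompleteness of LiDAR scanning data. In this paper, we propose SARNet, a novel semantic augmented registration network aimed at efficient registration of urban point clouds at city scale. Unlike previous methods that construct correspondences only in the point-level space, our approach fully exploits semantic features to improve registration accuracy. Specifically, we extract per-point semantic labels with an advanced semantic segmentation network and build a prior semantic part-to-part correspondence. We then incorporate the semantic information into a learning-based registration pipeline consisting of three core modules: a semantic-based farthest point sampling module that efficiently filters out outliers and dynamic objects; a semantic-augmented feature extraction module for learning more discriminative point descriptors; and a semantic-refined transformation estimation module that uses the prior semantic matching as a mask to refine point correspondences, reducing false matches for better convergence. We evaluate the proposed SARNet extensively using real-world data from large regions of urban scenes and compare it with alternative methods. The code is available at https://github.com/wintercodeforeverything/sarnet.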
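As a rough illustration of what a semantic-based farthest point sampling step could look like, the sketch below first drops points whose label belongs to an ignored set (for example, dynamic objects) and then runs plain greedy farthest point sampling on the rest. This is an assumption about the general idea, not SARNet's actual module; the function names, label strings, and toy coordinates are all hypothetical.

```python
def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set."""
    chosen = [start]
    # Squared distance from every point to its nearest chosen point so far.
    dist = [sum((a - b) ** 2 for a, b in zip(p, points[start])) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, points[nxt]))
            if d < dist[i]:
                dist[i] = d
    return chosen


def semantic_fps(points, labels, k, ignored_labels):
    """Drop points with ignored labels, then sample the rest with FPS.

    Returns indices into the original point list.
    """
    keep = [i for i, lab in enumerate(labels) if lab not in ignored_labels]
    if not keep:
        return []
    local = farthest_point_sampling([points[i] for i in keep], k)
    return [keep[i] for i in local]


# Toy scene (made up): the "car" point stands in for a dynamic object.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0],
          [5.0, 0.0, 0.0], [5.1, 0.0, 0.0]]
labels = ["building", "building", "car", "road", "building"]
sampled = semantic_fps(points, labels, k=3, ignored_labels={"car"})
```

Although the car at x = 10 is the farthest point geometrically, it is never sampled because it is filtered out before the distance-based selection starts.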
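Geometry-based point cloud compression (G-PCC) can achieve remarkable compression efficiency for point clouds. However, it still leads to serious attribute compression artifacts, especially in low-bitrate scenarios. In this paper, we propose a Multi-Scale Graph Attention Network (MS-GAT) to remove the artifacts of point cloud attributes compressed by G-PCC. We first construct a graph based on the point cloud geometry coordinates and then use Chebyshev graph convolutions to extract features of the point cloud attributes. Considering that a point may be correlated with points both near to and far from it, we propose a multi-scale scheme to capture the short- and long-range correlations between the current point and its neighboring and distant points. To address the problem that different points may have different degrees of artifacts caused by adaptive quantization, we introduce the quantization step as an extra input to the proposed network. We also incorporate a graph attention layer into the network to pay special attention to the points with more attribute artifacts. To the best of our knowledge, this is the first attribute artifact removal method for G-PCC. We validate the effectiveness of our method on various point clouds. Experimental results show that our proposed method achieves an average BD-rate reduction of 9.28%. In addition, our approach achieves some performance improvements for the downstream point cloud semantic segmentation task.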
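For context, a Chebyshev graph convolution filters a graph signal with a polynomial of the scaled graph Laplacian, using the recurrence T_0 x = x, T_1 x = L~ x, T_k x = 2 L~ T_{k-1} x - T_{k-2} x. The sketch below is a generic single-channel version, not MS-GAT's implementation. It assumes an undirected graph with at least one filter coefficient, and it uses the common approximation lambda_max ≈ 2 so that L~ = -D^{-1/2} A D^{-1/2}; the function names and the toy graph are hypothetical.

```python
import math


def scaled_laplacian(adj):
    """L~ = L_norm - I = -D^{-1/2} A D^{-1/2}, assuming lambda_max is about 2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [-adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def chebyshev_filter(adj, signal, thetas):
    """Return sum_k thetas[k] * T_k(L~) @ signal via the Chebyshev recurrence."""
    lap = scaled_laplacian(adj)
    t_prev = list(signal)                      # T_0 x = x
    out = [thetas[0] * v for v in t_prev]
    if len(thetas) == 1:
        return out
    t_curr = matvec(lap, signal)               # T_1 x = L~ x
    out = [o + thetas[1] * v for o, v in zip(out, t_curr)]
    for theta in thetas[2:]:
        # T_k x = 2 L~ T_{k-1} x - T_{k-2} x
        t_next = [2 * a - b for a, b in zip(matvec(lap, t_curr), t_prev)]
        out = [o + theta * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Toy path graph 0 - 1 - 2 with an impulse on node 0 (made-up example).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
signal = [1.0, 0.0, 0.0]
order2 = chebyshev_filter(adj, signal, thetas=[0.0, 0.0, 1.0])
```

With only the order-2 coefficient active, the impulse on node 0 reaches node 2, two hops away, which shows how the polynomial order controls the receptive field of the filter.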