Segmentation is an important step in many perception tasks, such as object detection and recognition. We present an approach to organized point cloud segmentation and its application to plane segmentation and Euclidean clustering for tabletop object detection. The proposed approach is efficient and enables real-time plane segmentation for VGA-resolution RGB-D data. Timing results are provided for indoor datasets, and applications to tabletop object segmentation and mapping with planar landmarks are discussed.
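As a rough illustration of the Euclidean clustering step used for tabletop object detection, the sketch below flood-fills a fixed-radius neighborhood graph over the points that remain after plane removal; the function name, radius, and minimum cluster size are illustrative assumptions, not values from the paper.

```python
# Minimal Euclidean clustering sketch (illustrative, not the paper's implementation).
# Points are grouped by flood-filling a fixed-radius neighborhood graph.
import numpy as np
from scipy.spatial import cKDTree

def euclidean_clusters(points, radius=0.02, min_size=50):
    """Group points whose pairwise gaps are below `radius` into clusters."""
    tree = cKDTree(points)
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = [seed], [seed]
        while queue:
            idx = queue.pop()
            for nb in tree.query_ball_point(points[idx], radius):
                if nb in unvisited:
                    unvisited.remove(nb)
                    queue.append(nb)
                    cluster.append(nb)
        if len(cluster) >= min_size:
            clusters.append(np.array(cluster))
    return clusters

if __name__ == "__main__":
    pts = np.random.rand(1000, 3)  # stand-in for points above the table plane
    print([len(c) for c in euclidean_clusters(pts, radius=0.05)])
```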
This paper presents CAPE, a method to extract plane and cylinder segments from organized point clouds. By operating on a grid of planar cells, it processes 640x480 depth images at an average of 300 Hz on a single CPU core. Compared to state-of-the-art plane extraction, the latency of CAPE is more consistent and 4-10 times faster, depending on the scene. We demonstrate empirically that applying CAPE to visual odometry can improve trajectory estimation in scenes made of cylindrical surfaces (e.g., tunnels), whereas using a plane extraction method that is not curve-aware degrades performance on these scenes. To use these geometric primitives in visual odometry, we propose extending a probabilistic RGB-D odometry framework based on points, lines, and planes to cylinder primitives. Following this framework, CAPE operates on fused depth maps, and the cylinder parameters are modeled probabilistically to account for uncertainty and to weight the pose optimization residuals accordingly.
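A minimal sketch of the planar-cell idea behind this kind of grid-based extraction: fit a plane to each cell of the organized cloud by PCA and keep only the flat cells. The cell size and flatness threshold are assumptions for illustration and do not reproduce CAPE's actual cell tests or its region-growing stage.

```python
# Illustrative per-cell plane fitting on an organized depth image,
# in the spirit of a planar-cell grid (cell size and thresholds are assumptions).
import numpy as np

def fit_cell_plane(cell_points):
    """PCA plane fit: returns (centroid, unit normal, variance along the normal)."""
    centroid = cell_points.mean(axis=0)
    cov = np.cov((cell_points - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]          # eigenvector of the smallest eigenvalue
    return centroid, normal, eigvals[0]

def planar_cells(organized_xyz, cell=20, var_max=1e-4):
    """Split an HxWx3 organized cloud into cells and keep the planar ones."""
    H, W, _ = organized_xyz.shape
    cells = []
    for r in range(0, H - cell + 1, cell):
        for c in range(0, W - cell + 1, cell):
            pts = organized_xyz[r:r+cell, c:c+cell].reshape(-1, 3)
            pts = pts[np.isfinite(pts).all(axis=1)]
            if len(pts) < cell * cell // 2:
                continue                       # too many invalid depth readings
            centroid, normal, var = fit_cell_plane(pts)
            if var < var_max:
                cells.append((r, c, centroid, normal))
    return cells
```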
The sheer volume of data generated by depth cameras provides a challenge to process in real time, in particular when used for indoor mobile robot localization and navigation. We introduce the Fast Sampling Plane Filtering (FSPF) algorithm to reduce the volume of the 3D point cloud by sampling points from the depth image and classifying local grouped sets of points as belonging to planes in 3D (the "plane filtered" points) or as points that do not correspond to planes within a specified error margin (the "outlier" points). We then introduce a localization algorithm based on an observation model that down-projects the plane filtered points onto 2D and assigns correspondences for each point to lines in the 2D map. The full sampled point cloud (consisting of both plane filtered and outlier points) is processed for obstacle avoidance for autonomous navigation. All our algorithms process only the depth information and do not require additional RGB data. The FSPF, localization and obstacle avoidance algorithms run in real time at full camera frame rates with low CPU requirements, at more than 1030 frames per second on average. We provide experimental results demonstrating the effectiveness of our approach for indoor mobile robot localization and navigation. We further compare the accuracy and robustness of localization using depth cameras with FSPF against alternative approaches that simulate laser rangefinder scans from the 3D data.
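The following sketch illustrates a sampling-based plane filter in the spirit of FSPF: a point and two nearby points are sampled from the depth image, a local plane is fit, and additional local samples are classified as plane-filtered or outlier points depending on an error margin. The window size, sample counts, and thresholds are illustrative, not the paper's parameters.

```python
# Hedged sketch of a sampling-based plane filter (FSPF-like, parameters assumed).
import numpy as np

def sample_plane_filter(depth_xyz, n_iters=200, window=30, eps=0.02, n_local=40):
    """depth_xyz: HxWx3 array of 3D points from a depth image."""
    H, W, _ = depth_xyz.shape
    rng = np.random.default_rng(0)
    plane_points, outliers = [], []
    for _ in range(n_iters):
        r0, c0 = rng.integers(window, H - window), rng.integers(window, W - window)
        # two more samples in a local window around (r0, c0)
        r1, c1 = r0 + rng.integers(-window, window), c0 + rng.integers(-window, window)
        r2, c2 = r0 + rng.integers(-window, window), c0 + rng.integers(-window, window)
        p0, p1, p2 = depth_xyz[r0, c0], depth_xyz[r1, c1], depth_xyz[r2, c2]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:
            continue                                  # degenerate sample
        n /= np.linalg.norm(n)
        # test additional local samples against the plane through p0 with normal n
        rs = r0 + rng.integers(-window, window, n_local)
        cs = c0 + rng.integers(-window, window, n_local)
        local = depth_xyz[rs, cs]
        dist = np.abs((local - p0) @ n)
        inliers = local[dist < eps]
        if len(inliers) > 0.8 * n_local:
            plane_points.append(inliers)              # "plane filtered" points
        else:
            outliers.append(local)                    # "outlier" points
    return plane_points, outliers
```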
In this work we present an automatic algorithm to detect basic shapes in unorganized point clouds. The algorithm decomposes the point cloud into a concise, hybrid structure of inherent shapes and a set of remaining points. Each detected shape serves as a proxy for a set of corresponding points. Our method is based on random sampling and detects planes, spheres, cylinders, cones and tori. For models with surfaces composed of these basic shapes only, e.g. CAD models, we automatically obtain a representation consisting solely of shape proxies. We demonstrate that the algorithm is robust even in the presence of many outliers and a high degree of noise. The proposed method scales well with respect to the size of the input point cloud and the number and size of the shapes within the data. Even point sets with several million samples are robustly decomposed within less than a minute. Moreover, the algorithm is conceptually simple and easy to implement. Application areas include measurement of physical parameters, scan registration, surface compression, hybrid rendering, shape classification, meshing, simplification, approximation and reverse engineering.
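A minimal RANSAC plane-detection sketch in the spirit of this decomposition, limited to a single shape class for brevity (the full method also samples and scores spheres, cylinders, cones and tori):

```python
# Minimal RANSAC plane detection (one shape class only; thresholds are assumptions).
import numpy as np

def ransac_plane(points, n_iters=500, eps=0.01, seed=1):
    """Return (normal, d, inlier_mask) of the best plane n.x + d = 0 found."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(n_iters):
        i, j, k = rng.choice(len(points), size=3, replace=False)
        n = np.cross(points[j] - points[i], points[k] - points[i])
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue                                  # collinear, degenerate sample
        n = n / norm
        d = -float(n @ points[i])
        inliers = np.abs(points @ n + d) < eps
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (n, d)
    if best_model is None:
        raise ValueError("no non-degenerate minimal sample found")
    return best_model[0], best_model[1], best_inliers
```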
This paper presents a very simple but effective algorithm for detecting 3D line segments in large-scale unorganized point clouds. Unlike traditional methods, which usually first extract 3D edge points and then link them to fit 3D line segments, we propose a very simple 3D line segment detection algorithm based on point cloud segmentation and 2D line detection. Given an input unorganized point cloud, three steps are performed to detect 3D line segments. First, the point cloud is segmented into 3D planes via region growing and region merging. Second, for each 3D plane, all points belonging to it are projected onto the plane itself to form a 2D image, followed by 2D contour extraction and least-squares fitting to obtain 2D line segments. Those 2D line segments are then re-projected onto the 3D plane to obtain the corresponding 3D line segments. Finally, a post-processing step is proposed to eliminate outliers and merge adjacent 3D line segments. Experiments on several public datasets demonstrate the efficiency and robustness of our method. More results and the C++ source code of the proposed algorithm are publicly available at https://github.com/xiaohulugo/3DLineDetection.
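The projection-and-fit step can be illustrated as follows: points of one detected plane are expressed in a 2D basis on that plane, a dominant 2D line is fit by total least squares, and its endpoints are lifted back to 3D. The contour-extraction and splitting stages of the actual pipeline are omitted, and the helper names are illustrative.

```python
# Sketch of projecting plane inliers to 2D, fitting a line, and lifting it back to 3D.
import numpy as np

def plane_basis(normal):
    """Two orthonormal in-plane axes for a given unit normal."""
    a = np.array([1.0, 0.0, 0.0]) if abs(normal[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(normal, a); u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    return u, v

def line_segment_on_plane(points3d, plane_point, normal):
    u, v = plane_basis(normal)
    uv = np.stack([(points3d - plane_point) @ u, (points3d - plane_point) @ v], axis=1)
    # total least squares: dominant direction of the 2D point set
    mean = uv.mean(axis=0)
    _, _, vt = np.linalg.svd(uv - mean)
    direction = vt[0]
    t = (uv - mean) @ direction
    start2d = mean + t.min() * direction
    end2d = mean + t.max() * direction
    lift = lambda p: plane_point + p[0] * u + p[1] * v   # back to 3D coordinates
    return lift(start2d), lift(end2d)
```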
Creating 3D maps on robots and other mobile devices has become a reality in recent years. Online 3D reconstruction enables many exciting applications in robotics and AR/VR gaming. However, the reconstructions are noisy and generally incomplete. Moreover, during online reconstruction, the surface changes with every newly integrated depth image, which poses a significant challenge for physics engines and path planning algorithms. This paper presents a novel, fast and robust method for obtaining and using information about planar surfaces, such as walls, floors, and ceilings, as a stage in 3D reconstruction based on Signed Distance Fields (SDFs). Our algorithm recovers clean and accurate surfaces, reduces the movement of individual mesh vertices caused by noise during online reconstruction, and fills in the occluded and unobserved regions. We implemented and evaluated two different strategies to generate plane candidates and two strategies for merging them. Our implementation is optimized to run in real time on mobile devices such as the Tango tablet. In an extensive set of experiments, we validated that our approach works well in a large number of natural environments despite the frequent presence of significant occlusion, clutter and noise. We further show that plane fitting enables in many cases a meaningful semantic segmentation of real-world scenes.
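One ingredient this kind of pipeline needs is a test for merging plane candidates; a hedged sketch of such a test, comparing normals and plane offsets against assumed thresholds (the paper itself evaluates several candidate-generation and merging strategies), is shown below.

```python
# Hedged sketch of a plane-candidate merging test (thresholds are assumptions).
import numpy as np

def should_merge(n1, d1, n2, d2, angle_deg=10.0, dist=0.05):
    """Planes given as unit normal n and offset d in n.x + d = 0."""
    cos_angle = abs(float(n1 @ n2))
    same_orientation = cos_angle > np.cos(np.radians(angle_deg))
    # compare offsets with the normals flipped to a common orientation
    d2_aligned = d2 if n1 @ n2 > 0 else -d2
    close_offset = abs(d1 - d2_aligned) < dist
    return same_orientation and close_offset
```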
We present a novel and effective method for detecting 3D primitives without prior segmentation or type specification. We consider quadric surfaces for encapsulating, in a unified fashion, the basic building blocks of our environments: planes, spheres, ellipsoids, cones, or cylinders. Moreover, quadrics allow us to model higher-degree-of-freedom shapes such as hyperboloids or paraboloids, which can be used in non-rigid settings. We first propose two novel quadric fits targeting 3D point sets endowed with tangent-space information. Based on the idea of aligning the quadric with the surface normals, our first formulation is exact and requires as few as four oriented points. The second fit approximates the first and reduces the computational effort. We analyze these fits theoretically and give algebraic and geometric arguments. Next, by re-parameterizing the solution, we devise a new local Hough voting scheme on the null-space coefficients, combined with RANSAC, reducing the complexity from $O(N^4)$ to $O(N^3)$ (three points). To the best of our knowledge, this is the first method capable of performing generic cross-type, multi-object primitive detection in difficult scenes without segmentation. Our extensive qualitative and quantitative results show that our method is efficient, flexible, and accurate.
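For orientation, the block below shows the plain algebraic quadric fit: the ten coefficients of a general quadric are recovered as the null-space direction of a design matrix built from the points. This is only the unconstrained baseline, not the normal-aligned minimal fits or the Hough voting scheme proposed in the paper.

```python
# Baseline algebraic quadric fit: coefficients q of
# q1 x^2 + q2 y^2 + q3 z^2 + q4 xy + q5 xz + q6 yz + q7 x + q8 y + q9 z + q10 = 0.
import numpy as np

def _design_matrix(points):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.column_stack([x*x, y*y, z*z, x*y, x*z, y*z, x, y, z, np.ones_like(x)])

def fit_quadric(points):
    """Least-squares quadric: right singular vector of the smallest singular value."""
    _, _, vt = np.linalg.svd(_design_matrix(points), full_matrices=False)
    return vt[-1]

def quadric_residual(q, points):
    """Algebraic residual of each point under the fitted quadric."""
    return _design_matrix(points) @ q
```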
Most LiDAR odometry algorithms estimate the transformation between two consecutive frames by estimating rotation and translation in an interleaved fashion. In this paper, we propose Decoupled LiDAR Odometry (DeLiO), which for the first time decouples rotation estimation completely from translation estimation. Specifically, the rotation is estimated by extracting surface normals from the input point clouds and tracking their characteristic pattern on the unit sphere. Using this rotation, the point clouds are unrotated so that the underlying transformation is a pure translation, which can then be estimated easily with a line cloud approach. An evaluation is performed on the KITTI dataset, and the results are compared against the state of the art.
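The decoupling idea can be illustrated with a simplified rotation-only estimate from surface normals: given corresponding unit normals from two frames, the rotation follows from a Kabsch/SVD alignment, after which the second cloud can be unrotated so that only a translation remains. DeLiO itself tracks the normal pattern on the unit sphere rather than assuming known correspondences, so this is only a sketch of the principle.

```python
# Simplified rotation-only estimation from corresponding surface normals (sketch).
import numpy as np

def rotation_from_normals(normals_a, normals_b):
    """Rotation R minimizing sum || R @ a_i - b_i ||^2 over corresponding unit normals."""
    H = normals_b.T @ normals_a                      # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # proper rotation
    return U @ S @ Vt

def unrotate(points_b, R):
    """Remove the estimated rotation so only a translation is left to estimate."""
    return points_b @ R                              # row-vector form of R.T @ p
```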
We present a new, massively parallel method for high-quality multiview matching. Our work builds on the PatchMatch idea: starting from randomly generated 3D planes in scene space, the best-fitting planes are iteratively propagated and refined to obtain a 3D depth and normal field per view, such that a robust photo-consistency measure over all images is maximized. Our main novelty is, on the one hand, to formulate PatchMatch in scene space, which makes it possible to aggregate image similarity across multiple views and obtain more accurate depth maps; and, on the other hand, a modified, diffusion-like propagation scheme that can be massively parallelized and delivers dense multiview correspondence over ten 1.9-megapixel images in 3 seconds on a consumer-grade GPU. Our method uses a slanted support window and thus has no fronto-parallel bias; it is completely local and parallel, so computation time scales linearly with image size and inversely with the number of parallel threads. Furthermore, it has a low memory footprint (four values per pixel, independent of the depth range). It therefore scales exceptionally well and can handle multiple large images at high depth resolution. Experiments on the DTU and Middlebury multiview datasets as well as oblique aerial images show that our method achieves very competitive results with high accuracy and completeness across a range of different scenarios.
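A hedged sketch of the core propagation test: the plane hypothesis stored at a neighbouring pixel induces a depth at the current pixel, and it is adopted if it scores better under the photo-consistency cost. The cost function is left as a placeholder and the helper names are assumptions, not the paper's implementation.

```python
# Sketch of PatchMatch-style plane propagation (cost function is a placeholder).
import numpy as np

def induced_depth(plane_point, normal, ray_dir):
    """Depth along the unit viewing ray `ray_dir` (camera at the origin) at which
    the ray meets the plane through `plane_point` with unit `normal`."""
    denom = normal @ ray_dir
    if abs(denom) < 1e-9:
        return np.inf                                 # ray parallel to the plane
    return (normal @ plane_point) / denom

def propagate(current_cost, current_plane, neighbour_plane, ray_dir, cost_fn):
    """Keep the neighbour's plane hypothesis if it explains this pixel better."""
    depth = induced_depth(*neighbour_plane, ray_dir)
    cand_cost = cost_fn(neighbour_plane, depth)
    if cand_cost < current_cost:
        return cand_cost, neighbour_plane
    return current_cost, current_plane
```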
The area of surface reconstruction has seen substantial progress in the past two decades. The traditional problem addressed by surface reconstruction is to recover the digital representation of a physical shape that has been scanned, where the scanned data contains a wide variety of defects. While much of the earlier work has been focused on reconstructing a piece-wise smooth representation of the original shape, recent work has taken on more specialized priors to address significantly challenging data imperfections, where the reconstruction can take on different representations, not necessarily the explicit geometry. We survey the field of surface reconstruction, and provide a categorization with respect to priors, data imperfections, and reconstruction output. By considering a holistic view of surface reconstruction, we show a detailed characterization of the field, highlight similarities between diverse reconstruction techniques, and provide directions for future work in surface reconstruction.
We present an automatic approach for the reconstruction of parametric 3D building models from indoor point clouds. While recently developed methods in this domain focus on mere local surface reconstructions, which enable e.g. efficient visualization, our approach aims for a volumetric, parametric building model that additionally incorporates contextual information such as global wall connectivity. In contrast to pure surface reconstructions, our representation thereby allows more comprehensive use: first, it enables efficient high-level editing operations such as wall removal or room reshaping, which always result in a topologically consistent representation. Second, it enables easy taking of measurements such as wall thickness or room areas. These properties render our reconstruction method especially beneficial to architects or engineers for planning renovation or retrofitting. Following the idea of previous approaches, the reconstruction task is cast as a labeling problem which is solved by an energy minimization. This global optimization approach allows for the reconstruction of wall elements shared between rooms while simultaneously maintaining plausible connectivity between all wall elements. An automatic prior segmentation of the point clouds into rooms and outside area filters large-scale outliers and yields priors for the definition of labeling costs for the energy minimization. The reconstructed model is further enriched by detected doors and windows. We demonstrate the applicability and reconstruction power of our new approach on a variety of complex real-world datasets requiring little or no parameter adjustment.
Recent developments in 3D scanning technology have made the generation of highly accurate 3D point clouds relatively easy, but the segmentation of these point clouds remains a challenging area. Many techniques for planar or primitive-based segmentation have been established in the literature. In this work, we propose a novel and efficient primitive-based point cloud segmentation algorithm. The main focus, i.e., the main technical contribution of our method, is a hierarchical tree that iteratively partitions the point cloud into segments. The tree classifies segments using a dedicated energy function and a 3D convolutional neural network, HollowNets. We test the efficacy of the proposed method on both real and synthetic data, obtaining accuracies greater than 90% for domes and spires.
Unsupervised over-segmentation of an image into regions of perceptually similar pixels, known as superpixels, is a widely used preprocessing step in segmentation algorithms. Superpixel methods reduce the number of regions that must be considered later by more computationally expensive algorithms, with a minimal loss of information. Nevertheless, as some information is inevitably lost, it is vital that superpixels not cross object boundaries, as such errors will propagate through later steps. Existing methods make use of projected color or depth information, but do not consider three-dimensional geometric relationships between observed data points, which can be used to prevent superpixels from crossing regions of empty space. We propose a novel over-segmentation algorithm which uses voxel relationships to produce over-segmentations which are fully consistent with the spatial geometry of the scene in three-dimensional, rather than projective, space. Enforcing the constraint that segmented regions must have spatial connectivity prevents label flow across semantic object boundaries which might otherwise be violated. Additionally, as the algorithm works directly in 3D space, observations from several calibrated RGB+D cameras can be segmented jointly. Experiments on a large data set of human-annotated RGB+D images demonstrate a significant reduction in occurrence of clusters crossing object boundaries, while maintaining speeds comparable to state-of-the-art 2D methods.
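The spatial-connectivity constraint can be sketched as follows: points are binned into voxels, and two voxels are treated as adjacent only when both are occupied 26-neighbours, so a segment cannot grow across empty space. Seeding and the feature-space clustering of the full supervoxel method are omitted, and the voxel size is an assumption.

```python
# Sketch of voxel occupancy and 26-neighbour adjacency (voxel size is assumed).
import numpy as np
from itertools import product

def voxel_adjacency(points, voxel=0.05):
    """Return a dict mapping each occupied voxel key to its occupied 26-neighbours."""
    keys = {tuple(k) for k in np.floor(points / voxel).astype(int)}
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    adjacency = {k: [] for k in keys}
    for k in keys:
        for o in offsets:
            nb = (k[0] + o[0], k[1] + o[1], k[2] + o[2])
            if nb in keys:
                adjacency[k].append(nb)       # only occupied neighbours are linked
    return adjacency
```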
Measuring semantic traits for phenotyping is an essential but labor-intensive activity in horticulture. Researchers often rely on manual measurements which may not be accurate for tasks such as measuring tree volume. To improve the accuracy of such measurements and to automate the process, we consider the problem of building coherent three-dimensional (3D) reconstructions of orchard rows. Even though 3D reconstructions of side views can be obtained using standard mapping techniques, merging the two side views is difficult due to the lack of overlap between the two partial reconstructions. Our first main contribution in this paper is a novel method that utilizes global features and semantic information to obtain an initial solution aligning the two sides. Our mapping approach then refines the 3D model of the entire tree row by integrating semantic information common to both sides, extracted using our novel robust detection and fitting algorithms. Next, we present a vision system to measure semantic traits from the optimized 3D model, which is built from RGB or RGB-D data captured by a camera alone. Specifically, we show how canopy volume, trunk diameter, tree height and fruit count can be automatically obtained in real orchard environments. The experimental results from multiple datasets quantitatively demonstrate the high accuracy and robustness of our method.
In this paper, we present an approach to segment 3D point cloud data using ideas from persistent homology theory. The proposed algorithms first generate a simplicial complex representation of the point cloud dataset. Next, we compute the zeroth homology group of the complex, which corresponds to the number of connected components. Finally, we extract the clusters of each connected component in the dataset. We show that this technique has several advantages over state-of-the-art methods, such as the ability to provide a stable segmentation of point cloud data under noisy or poor sampling conditions, and its independence from a fixed distance metric.
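At a single fixed scale, the zeroth homology group reduces to the connected components of the epsilon-neighbourhood graph, which the sketch below computes with union-find; the persistence analysis across scales described in the paper is not reproduced here.

```python
# Connected components of the epsilon-neighbourhood graph via union-find (sketch).
import numpy as np
from scipy.spatial import cKDTree

def connected_components(points, eps=0.05):
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i
    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    tree = cKDTree(points)
    for i, j in tree.query_pairs(eps):       # edges of the neighbourhood graph
        union(i, j)
    labels = np.array([find(i) for i in range(len(points))])
    return labels                            # equal labels = same component
```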
The area of surface reconstruction has seen substantial progress in the past two decades. The traditional problem addressed by surface reconstruction is to recover the digital representation of a physical shape that has been scanned, where the scanned data contains a wide variety of defects. While much of the earlier work has been focused on reconstructing a piece-wise smooth representation of the original shape, recent work has taken on more specialized priors to address significantly challenging data imperfections, where the reconstruction can take on different representations, not necessarily the explicit geometry. This state-of-the-art report surveys the field of surface reconstruction, providing a categorization with respect to priors, data imperfections, and reconstruction output. By considering a holistic view of surface reconstruction, this report provides a detailed characterization of the field, highlights similarities between diverse reconstruction techniques, and provides directions for future work in surface reconstruction.
Place recognition in 3D data is a challenging task that has been commonly approached by adapting image-based solutions. Methods based on local features suffer from ambiguity and lack robustness to environment changes, while methods based on global features are viewpoint dependent. We propose SegMatch, a reliable place recognition algorithm based on the matching of 3D segments. Segments provide a good compromise between local and global descriptions, incorporating their strengths while reducing their individual drawbacks. SegMatch does not rely on assumptions of 'perfect segmentation', or on the existence of 'objects' in the environment, which allows for reliable execution in large-scale, unstructured environments. We quantitatively demonstrate that SegMatch can achieve accurate localization at a frequency of 1 Hz on the largest sequence of the KITTI odometry dataset. We furthermore show how this algorithm can reliably detect and close loops in real time, during online operation. In addition, the source code for the SegMatch algorithm is made publicly available.
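As a simplified stand-in for the segment descriptors and matching used by SegMatch, the sketch below computes eigenvalue-based shape features per segment and performs greedy nearest-neighbour matching; the feature choice and the distance threshold are illustrative assumptions.

```python
# Illustrative eigenvalue-based segment descriptor and nearest-neighbour matching.
import numpy as np

def segment_descriptor(segment_points):
    """Linearity, planarity and scattering derived from covariance eigenvalues."""
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(segment_points.T)))[::-1]
    e1, e2, e3 = np.sqrt(np.maximum(eigvals, 0.0))
    e1 = max(e1, 1e-12)                       # guard against degenerate segments
    return np.array([(e1 - e2) / e1, (e2 - e3) / e1, e3 / e1])

def match_segments(descs_query, descs_map, max_dist=0.1):
    """Greedy nearest-neighbour matches between two descriptor sets."""
    matches = []
    for i, d in enumerate(descs_query):
        dists = np.linalg.norm(descs_map - d, axis=1)
        j = int(np.argmin(dists))
        if dists[j] < max_dist:
            matches.append((i, j))
    return matches
```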
This paper addresses the problem of recognizing free-form 3D objects in point clouds. Compared to traditional approaches based on point descriptors, which depend on local information around points, we propose a novel method that creates a global model description based on oriented point pair features and matches that model locally using a fast voting scheme. The global model description consists of all model point pair features and represents a mapping from the point pair feature space to the model, where similar features on the model are grouped together. Such a representation allows using much sparser object and scene point clouds, resulting in very fast performance. Recognition is done locally using an efficient voting scheme on a reduced two-dimensional search space. We demonstrate the efficiency of our approach and show its high recognition performance in the case of noise, clutter and partial occlusions. Compared to state-of-the-art approaches we achieve better recognition rates, and demonstrate that with a slight or even no sacrifice of recognition performance our method is much faster than the current state-of-the-art approaches.
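The point pair feature at the heart of this voting scheme is F = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1; a small sketch of its computation and a discretization step (the quantization steps are assumed values) follows:

```python
# Point pair feature for two oriented points (p1, n1), (p2, n2) and its quantization.
import numpy as np

def angle(a, b):
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return np.arccos(np.clip(a @ b, -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    d = p2 - p1
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

def quantize(feature, dist_step=0.01, angle_step=np.radians(12)):
    """Discretized feature used as a key into the model's feature hash table."""
    steps = np.array([dist_step, angle_step, angle_step, angle_step])
    return tuple((feature / steps).astype(int))
```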