Smart Paper Notes

Efficient variants of the ICP algorithm


The ICP (Iterative Closest Point) algorithm is widely used for geometric alignment of three-dimensional models when an initial estimate of the relative pose is known. Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minimization strategy. We enumerate and classify many of these variants, and evaluate their effect on the speed with which the correct alignment is reached. In order to improve convergence for nearly-flat meshes with small features, such as inscribed surfaces, we introduce a new variant based on uniform sampling of the space of normals. We conclude by proposing a combination of ICP variants optimized for high speed. We demonstrate an implementation that is able to align two range images in a few tens of milliseconds, assuming a good initial guess. This capability has potential application to real-time 3D model acquisition and model-based tracking.
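The phases named in the abstract (matching points, then minimizing an error) can be illustrated with a baseline point-to-point ICP loop. This is a minimal sketch, not the optimized variant the paper proposes: it assumes brute-force nearest-neighbour matching, uses every point (no sampling or rejection), and solves each step with the standard SVD-based rigid fit. The function names `best_rigid_transform` and `icp` are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (SVD-based fit)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(source, target, iters=20):
    """Baseline point-to-point ICP; returns (R, t) such that source @ R.T + t ~ target."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = source.copy()
    for _ in range(iters):
        # Matching phase: brute-force closest target point for every source point.
        d2 = ((current[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        matches = target[d2.argmin(axis=1)]
        # Minimization phase: rigid transform that best aligns the matched pairs.
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
        # Compose the incremental transform with the running estimate.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

As the abstract notes, this only works from a good initial guess: if the starting pose is far off, the closest-point matches are mostly wrong and the loop can settle in a poor local minimum.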

Translated by Google Translate

Generalized-ICP


In this paper we combine the Iterative Closest Point ('ICP') and 'point-to-plane ICP' algorithms into a single probabilistic framework. We then use this framework to model locally planar surface structure from both scans instead of just the "model" scan as is typically done with the point-to-plane method. This can be thought of as 'plane-to-plane'. The new approach is tested with both simulated and real-world data and is shown to outperform both standard ICP and point-to-plane. Furthermore, the new approach is shown to be more robust to incorrect correspondences, and thus makes it easier to tune the maximum match distance parameter present in most variants of ICP. In addition to the demonstrated performance improvement, the proposed model allows for more expressive probabilistic models to be incorporated into the ICP framework. While maintaining the speed and simplicity of ICP, the Generalized-ICP could also allow for the addition of outlier terms, measurement noise, and other probabilistic techniques to increase robustness.
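The plane-to-plane idea can be written as a single objective. The block below is a sketch of the commonly cited Generalized-ICP cost, stated from memory of the paper rather than from this abstract; the symbols are assumptions introduced here: $a_i$ and $b_i$ are corresponding points in the two scans, $C_i^{A}$ and $C_i^{B}$ are their local surface covariances, and the rigid transform $T$ has rotation $R$ and translation $t$.

```latex
d_i(T) = b_i - (R\,a_i + t), \qquad
T^{*} = \arg\min_{T} \sum_{i} d_i(T)^{\top}
        \left( C_i^{B} + R\, C_i^{A} R^{\top} \right)^{-1} d_i(T)
```

Setting $C_i^{A} = 0$ and $C_i^{B} = I$ reduces the sum to $\sum_i \lVert d_i(T) \rVert^2$, which is the standard point-to-point ICP objective; modelling both covariances as locally planar gives the plane-to-plane behaviour the abstract describes.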


Registration Techniques for Deformable Objects

Alireza Ahmadi

Categories: Computer Vision

2021-11-07

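In general, the problem of non-rigid registration is to match two different scans of a dynamic object taken at two different points in time. These scans can undergo both rigid motions and non-rigid deformations. Since new parts of the model may come into view while other parts become occluded between the two scans, the overlapping region is a subset of both scans. In the most general setting, no prior template shape is given, and no markers or explicit feature-point correspondences are available. This case is therefore a partial matching problem, under the assumption that consecutive scans share a large overlapping region [28]. The problem this thesis addresses is mapping deforming objects and localizing the camera in the environment at the same time.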


3D Labeling Tool

John Rachwan, Charbel Zalaket

Categories: Computer Vision | Artificial Intelligence

2022-07-23

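Training and testing supervised object detection models requires a large collection of images with ground-truth labels. Labels define the object classes in an image, as well as their locations, shapes, and possibly other information such as pose. The labeling process is extremely time-consuming, even when manpower is available. We introduce a new labeling tool for 2D images as well as 3D triangular meshes: the 3D Labeling Tool (3DLT). It is a standalone, feature-rich, cross-platform application that requires no installation and runs on Windows, macOS, and Linux-based distributions. Instead of labeling the same object separately on every image, as current tools do, we use depth information to reconstruct a triangular mesh from those images and label the object only once on that mesh. We use registration to simplify 3D labeling, outlier detection to improve the computation of 2D bounding boxes, and surface reconstruction to extend labeling to large point clouds. Our tool is tested against state-of-the-art methods and greatly surpasses them while preserving accuracy and ease of use.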


Fast and Robust Non-Rigid Registration Using Accelerated Majorization-Minimization

Yuxin Yao, Bailin Deng, Weiwei Xu, Juyong Zhang

Categories: Computer Vision

2022-06-07

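Non-rigid registration, which deforms a source shape in a non-rigid way to align it with a target shape, is a classical problem in computer vision. Such problems can be challenging because of data issues (noise, outliers, and partial overlap) and the high number of degrees of freedom. Existing methods typically adopt an $\ell_p$-type robust norm to measure the alignment error and to regularize the smoothness of the deformation, and use a proximal algorithm to solve the resulting non-smooth optimization problem. However, the slow convergence of such algorithms limits their wide application. In this paper, we propose a formulation for robust non-rigid registration based on a globally smooth robust norm for alignment and regularization, which can effectively handle outliers and partial overlaps. The problem is solved with a majorization-minimization algorithm, which reduces each iteration to a convex quadratic problem with a closed-form solution. We further apply Anderson acceleration to speed up the convergence of the solver, enabling it to run efficiently on devices with limited compute capability. Extensive experiments demonstrate the effectiveness of our method for non-rigid alignment between two shapes with outliers and partial overlaps, and quantitative evaluation shows that it outperforms state-of-the-art methods in both registration accuracy and computational speed. The source code is available at https://github.com/yaoyx689/amm_nrr.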
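The majorization-minimization step can be illustrated on a much simpler problem than non-rigid registration. The sketch below assumes Welsch's function, $1 - \exp(-r^2 / (2\nu^2))$, as the smooth robust norm (the abstract does not name the norm), and estimates only a translation between matched point sets. Each iteration replaces the robust cost with a quadratic upper bound whose closed-form minimizer is a weighted mean. The function name `robust_translation` and the parameter `nu` are hypothetical, and Anderson acceleration is not included.

```python
import numpy as np

def robust_translation(src, dst, nu=2.0, iters=20):
    """Estimate t minimizing sum_i welsch(||src_i + t - dst_i||) by majorization-minimization."""
    offsets = dst - src                  # per-pair translation "votes"
    t = offsets.mean(axis=0)             # plain least-squares start (sensitive to outliers)
    for _ in range(iters):
        r2 = np.sum((offsets - t) ** 2, axis=1)
        # Majorization: at the current t, Welsch's function is bounded above by a
        # quadratic with per-pair weight exp(-r^2 / (2 nu^2)).
        w = np.exp(-r2 / (2.0 * nu ** 2))
        # Minimization: the quadratic surrogate has a closed-form minimizer,
        # the weighted mean of the offsets.
        t = (w[:, None] * offsets).sum(axis=0) / w.sum()
    return t
```

Pairs with large residuals receive weights close to zero, so gross outliers stop influencing the estimate after the first few iterations.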


Lidar-level localization with radar? The CFEAR approach to accurate, fast and robust large-scale radar odometry in diverse environments

Daniel Adolfsson, Martin Magnusson, Anas Alhashimi, Achim J. Lilienthal, Henrik Andreasson

Categories: Robotics

2022-11-04

This paper presents an accurate, highly efficient, and learning-free method for large-scale odometry estimation using spinning radar, empirically found to generalize well across very diverse environments (outdoors, from urban to woodland, and indoors in warehouses and mines) without changing parameters. Our method integrates motion compensation within a sweep with one-to-many scan registration that minimizes distances between nearby oriented surface points and mitigates outliers with a robust loss function. Extending our previous approach CFEAR, we present an in-depth investigation on a wider range of data sets, quantifying the importance of filtering, resolution, registration cost and loss functions, keyframe history, and motion compensation. We present a new solving strategy and configuration that overcomes previous issues with sparsity and bias, and improves our state-of-the-art by 38%, thus, surprisingly, outperforming radar SLAM and approaching lidar SLAM. The most accurate configuration achieves 1.09% error at 5 Hz on the Oxford benchmark, and the fastest achieves 1.79% error at 160 Hz.


KinectFusion: Real-time dense surface mapping and tracking


Figure 1: Example output from our system, generated in real-time with a handheld Kinect depth camera and no other sensing infrastructure. Normal maps (colour) and Phong-shaded renderings (greyscale) from our dense reconstruction system are shown. On the left for comparison is an example of the live, incomplete, and noisy data from the Kinect sensor (used as input to our system).


Elastic shape analysis of surfaces with second-order Sobolev metrics: a comprehensive numerical framework

Emmanuel Hartman, Yashil Sukurdeep, Eric Klassen, Nicolas Charon, Martin Bauer

Categories: Computer Vision

2022-04-08

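This paper introduces a set of numerical methods for Riemannian shape analysis of 3D surfaces in the setting of invariant (elastic) second-order Sobolev metrics. More specifically, we address the computation of geodesics and geodesic distances between parametrized or unparametrized immersed surfaces represented as 3D meshes. Building on this, we develop tools for the statistical shape analysis of sets of surfaces, including methods for estimating Karcher means and performing tangent PCA on shape populations, and for computing parallel transport along paths of surfaces. Our proposed approach fundamentally relies on a relaxed variational formulation of the geodesic matching problem through the use of varifold fidelity terms. This lets us enforce reparametrization independence when computing geodesics between unparametrized surfaces, and also yields versatile algorithms that can compare surfaces with varying sampling or mesh structures. Importantly, we demonstrate how the relaxed variational framework can be extended to handle partially observed data. The different benefits of our numerical pipeline are illustrated on a variety of synthetic and real examples.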


What's Behind the Couch? Directed Ray Distance Functions (DRDF) for 3D Scene Reconstruction

Nilesh Kulkarni, Justin Johnson, David F. Fouhey

Categories: Computer Vision

2021-12-08

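We present an approach for scene-level 3D reconstruction, including occluded regions, from an unseen RGB image. Our approach is trained on real 3D scans and images. This problem has proved difficult for multiple reasons: real scans are not watertight, which precludes many methods; distances in scenes require reasoning across objects, which makes the task even harder; and, as we show, uncertainty about surface locations motivates networks to produce outputs that lack basic distance-function properties. We propose a new distance-like function that can be computed on unstructured scans and that behaves well under uncertainty about surface location. Computing this function over rays reduces the complexity further. We train a deep network to predict this function and show that it outperforms other methods on Matterport3D, 3D Front, and ScanNet.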


A database and evaluation methodology for optical flow


The quantitative evaluation of optical flow algorithms by Barron et al. (1994) led to significant advances in performance. The challenges for optical flow algorithms today go beyond the datasets and evaluation methods proposed in that paper. Instead, they center on problems associated with complex natural scenes, including nonrigid motion, real sensor noise, and motion discontinuities. We propose a new set of benchmarks and evaluation methods for the next generation of optical flow algorithms. To that end, we contribute four types of data to test different aspects of optical flow algorithms: (1) sequences with nonrigid motion where the ground-truth flow is determined by …

A preliminary version of this paper appeared in the IEEE International Conference on Computer Vision (Baker et al. 2007).


A taxonomy and evaluation of dense two-frame stereo correspondence algorithms


Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today's best-performing stereo algorithms.


Embodied Hands: Modeling and Capturing Hands and Bodies Together

Javier Romero, Dimitrios Tzionas, Michael J. Black

Categories: Computer Vision

2022-01-07

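Humans move their hands and bodies together to communicate and solve tasks. Capturing and replicating such coordinated activity is critical for virtual characters that behave realistically. Surprisingly, most methods treat the 3D modeling and tracking of bodies and hands separately. Here we formulate a model of hands and bodies together and fit it to full-body 4D sequences. When scanning or capturing the full body in 3D, hands are small and often partially occluded, making their shape and pose hard to recover. To cope with low resolution, occlusion, and noise, we develop a new model called MANO (hand Model with Articulated and Non-rigid defOrmations). MANO is learned from around 1000 high-resolution 3D scans of the hands of 31 subjects in a wide variety of hand poses. The model is realistic, low-dimensional, captures non-rigid shape changes with pose, is compatible with standard graphics packages, and can fit any human hand. MANO provides a compact mapping from hand poses to pose blend shape corrections and a linear manifold of pose synergies. We attach MANO to a standard parameterized 3D body shape model (SMPL), resulting in a fully articulated body and hand model (SMPL+H). We illustrate SMPL+H by fitting complex, natural activities of subjects captured with a 4D scanner. The fitting is fully automatic and results in full-body models that move naturally, with detailed hand motions and a realism not seen before in full-body performance capture. The models and data are freely available for research purposes on our website (http://mano.is.tue.mpg.de).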


Parallel tracking and mapping for small AR workspaces


This paper presents a method of estimating camera pose in an unknown scene. While this has previously been attempted by adapting SLAM algorithms developed for robotic exploration, we propose a system specifically designed to track a hand-held camera in a small AR workspace. We propose to split tracking and mapping into two separate tasks, processed in parallel threads on a dual-core computer: one thread deals with the task of robustly tracking erratic hand-held motion, while the other produces a 3D map of point features from previously observed video frames. This allows the use of computationally expensive batch optimisation techniques not usually associated with real-time operation: The result is a system that produces detailed maps with thousands of landmarks which can be tracked at frame-rate, with an accuracy and robustness rivalling that of state-of-the-art model-based systems.


Multi-Hypothesis Scan Matching through Clustering

Giorgio Iavicoli, Claudio Zito

Categories: Robotics

2022-01-11

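Graph-SLAM is a well-established algorithm for constructing a topological map of the environment while simultaneously attempting to localize the robot. It relies on scan matching algorithms to align noisy observations along the robot's movements in order to compute an estimate of the robot's current location. We propose a fundamentally different approach to the scan matching task to improve the estimation of roto-translation displacements and, in turn, the performance of the full SLAM algorithm. A Monte Carlo approach is used to generate weighted hypotheses of the geometric displacement between two scans, and we then cluster these hypotheses to compute the displacement that results in the best alignment. To cope with clustering over roto-translations, we propose a novel clustering approach that robustly extends Gaussian mean shift to orientations by factorizing the kernel density over the roto-translation components. We demonstrate the effectiveness of our method in an extensive set of experiments using both synthetic data and the Intel Research Lab benchmark datasets. The results confirm that our approach outperforms state-of-the-art iterative point-based scan matching algorithms in both matching accuracy and runtime.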


FaSS-MVS -- Fast Multi-View Stereo with Surface-Aware Semi-Global Matching from UAV-borne Monocular Imagery

Boitumelo Ruf, Martin Weinmann, Stefan Hinz

Categories: Computer Vision

2021-12-01

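With FaSS-MVS, we present an approach for fast multi-view stereo with surface-aware Semi-Global Matching that allows rapid depth and normal map estimation from monocular aerial video data captured by UAVs. The data estimated by FaSS-MVS, in turn, facilitates online 3D mapping, meaning that a 3D map of the scene is generated immediately and incrementally while the image data is being acquired or received. FaSS-MVS consists of a hierarchical processing scheme in which depth and normal data, as well as corresponding confidence scores, are estimated in a coarse-to-fine manner, which makes it possible to efficiently process the large scene depths inherent in oblique imagery captured by low-flying UAVs. The actual depth estimation employs a plane-sweep algorithm for dense multi-image matching to produce depth hypotheses, from which the actual depth map is extracted by means of a surface-aware semi-global optimization that reduces the fronto-parallel bias of SGM. Given the estimated depth map, pixel-wise surface normal information is then computed by reprojecting the depth map into a point cloud and calculating the normal vectors within a confined local neighborhood. In a thorough quantitative and ablation study, we show that the accuracy of the 3D information computed by FaSS-MVS is close to that of state-of-the-art approaches for offline multi-view stereo, with the error not even one order of magnitude higher than that of COLMAP. At the same time, however, the average run time of FaSS-MVS to estimate a single depth and normal map is less than 14% of that of COLMAP, allowing online and incremental processing of Full HD imagery at 1-2 Hz.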


Efficient Registration of Forest Point Clouds by Global Matching of Relative Stem Positions

Xufei Wang, Zexin Yang, Xiaojun Cheng, Jantien Stoter, Wenbin Xu, Zhenlun Wu, Liangliang Nan

Categories: Computer Vision

2021-12-21

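Registering point clouds of forest environments is an essential prerequisite for LiDAR applications in precision forestry. State-of-the-art methods for forest point cloud registration require the extraction of individual tree attributes, and they have an efficiency bottleneck when dealing with point clouds of real-world forests with dense trees. We propose an automatic, robust, and efficient method for the registration of forest point clouds. Our approach first locates tree stems in the raw point clouds and then matches the stems based on their relative spatial relationships to determine the registration transformation. In contrast to existing methods, our algorithm requires no extra individual tree attributes and has linear complexity in the number of trees in the environment, allowing it to align point clouds of large forest environments. Extensive experiments show that our method is superior to state-of-the-art methods in registration accuracy and robustness, and that it significantly outperforms existing techniques in efficiency. In addition, we introduce a new benchmark dataset that complements the very few existing open datasets for the development and evaluation of registration methods for forest point clouds.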


Enhanced Laser-Scan Matching with Online Error Estimation for Highway and Tunnel Driving

Matthew McDermott, Jason Rife

Categories: Robotics | Computer Vision

2022-07-29

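Lidar data can be used to generate point clouds for the navigation of autonomous vehicles or mobile robotic platforms. Scan matching, the process of estimating the rigid transformation that best aligns two point clouds, is the basis of lidar odometry, a form of dead reckoning. Lidar odometry is particularly useful when sensors such as GPS are unavailable. Here we propose the Iterative Closest Ellipsoidal Transform (ICET), a scan matching algorithm that provides two novel improvements over the current state-of-the-art Normal Distributions Transform (NDT). Like NDT, ICET decomposes lidar data into voxels and fits a Gaussian distribution to the points within each voxel. The first innovation of ICET reduces geometric ambiguity along large flat surfaces by suppressing the solution along those directions. The second innovation of ICET is to infer the output error covariance associated with the position and orientation transformation between successive point clouds; the error covariance is particularly useful when ICET is incorporated into a state-estimation routine such as an extended Kalman filter. We constructed a simulation to compare the performance of ICET and NDT in 2D space, both with and without geometric ambiguity, and found that ICET produces superior estimates while accurately predicting the accuracy of its solution.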
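The preprocessing step shared by NDT and ICET (splitting the cloud into voxels and fitting a Gaussian to each) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fit_voxel_gaussians`, the axis-aligned cubic grid, and the `min_points` threshold are assumptions made here.

```python
import numpy as np
from collections import defaultdict

def fit_voxel_gaussians(points, voxel_size, min_points=3):
    """Group points into axis-aligned voxels and fit a mean and covariance per voxel."""
    points = np.asarray(points, dtype=float)
    # Integer voxel index of each point along every axis.
    keys = np.floor(points / voxel_size).astype(int).tolist()
    buckets = defaultdict(list)
    for key, p in zip(map(tuple, keys), points):
        buckets[key].append(p)
    gaussians = {}
    for key, pts in buckets.items():
        if len(pts) < min_points:
            continue  # too few points for a meaningful covariance (assumed threshold)
        pts = np.asarray(pts)
        # Sample mean and unbiased sample covariance of the points in this voxel.
        gaussians[key] = (pts.mean(axis=0), np.cov(pts, rowvar=False))
    return gaussians
```

The eigenvectors of each covariance with large eigenvalues indicate directions in which the voxel's surface is extended, which is the kind of information a method would need in order to suppress the solution along ambiguous directions.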


PatchMatch: A randomized correspondence algorithm for structural image editing


Figure 1: Structural image editing. Panels: (a) original, (b) hole + constraints, (c) hole filled, (d) constraints, (e) constrained retarget, (f) reshuffle. Left to right: (a) the original image; (b) a hole is marked (magenta) and we use line constraints (red/green/blue) to improve the continuity of the roofline; (c) the hole is filled in; (d) user-supplied line constraints for retargeting; (e) retargeting using constraints eliminates two columns automatically; and (f) user translates the roof upward using reshuffling.


Point Cloud Registration of non-rigid objects in sparse 3D Scans with applications in Mixed Reality

Manorama Jha

Categories: Computer Vision

2022-12-07

Point cloud registration is the problem of aligning the corresponding points of two 3D point clouds that refer to the same object. The challenges include dealing with noise and partial matches in real-world 3D scans. For non-rigid objects, there is the additional challenge of accounting for deformations in the object's shape that occur between the two 3D scans. In this project, we study the problem of non-rigid point cloud registration for use cases in the Augmented/Mixed Reality domain. We focus on a special class of non-rigid deformations that occur in rigid objects with parts that move relative to one another about joints, for example, robots with hands and machines with hinges. We propose an efficient and robust point cloud registration workflow for such objects and evaluate it on real-world data collected using Microsoft HoloLens 2, a leading Mixed Reality platform.


SL Sensor: An Open-Source, ROS-Based, Real-Time Structured Light Sensor for High Accuracy Construction Robotic Applications

Teng Foong Lam, Hermann Blum, Roland Siegwart, Abel Gawel

Categories: Robotics

2022-01-22

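Many construction robotics tasks, such as automated cement polishing or robotic plaster spraying, require high-accuracy 3D surface information. However, the consumer-grade depth cameras currently on the market are not accurate enough for these tasks, which require millimeter (mm)-level accuracy. This paper presents SL Sensor, a structured light sensing solution capable of producing high-fidelity point clouds at 5 Hz by leveraging phase shifting profilometry (PSP) codification techniques. The SL Sensor was compared with two commercial depth cameras: the Azure Kinect and the RealSense L515. Experiments showed that the SL Sensor surpasses both devices in precision and accuracy for indoor surface reconstruction applications. Furthermore, to demonstrate the SL Sensor's ability to serve as a structured light sensing research platform for robotic applications, a motion compensation strategy was developed that allows the SL Sensor to operate during linear motion, whereas traditional PSP methods work only when the sensor is static. Field experiments show that the SL Sensor is able to produce highly detailed reconstructions of spray-plastered surfaces. The Robot Operating System (ROS)-based software and a sample hardware build of the SL Sensor are open source, with the aim of making structured light sensing more accessible to the construction robotics community. All documentation and code are available at https://github.com/ethz-asl/sl_sensor/.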
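The core computation in phase shifting profilometry can be sketched with the textbook N-step formula. This is a generic illustration and an assumption about the setup, not the SL Sensor's actual pipeline: it assumes N images of sinusoidal fringes, image k modelled as A + B cos(phi + 2 pi k / N), and recovers the wrapped phase phi per pixel. The function name `wrapped_phase` is hypothetical, and phase unwrapping and triangulation are not shown.

```python
import numpy as np

def wrapped_phase(images):
    """Wrapped phase from N equally phase-shifted fringe images (N >= 3).

    Assumes image k is A + B * cos(phi + 2*pi*k/N), pixel by pixel.
    """
    images = np.asarray(images, dtype=float)
    n = images.shape[0]
    shifts = 2.0 * np.pi * np.arange(n) / n
    # Weighted sums over the image stack; the constant offset A cancels out.
    s = np.tensordot(np.sin(shifts), images, axes=(0, 0))   # = -(N/2) * B * sin(phi)
    c = np.tensordot(np.cos(shifts), images, axes=(0, 0))   # =  (N/2) * B * cos(phi)
    return np.arctan2(-s, c)                                 # phase in (-pi, pi]
```

Because every pixel needs all N images of the same scene point, any sensor motion between exposures corrupts the phase, which is why the abstract treats motion compensation as a separate contribution.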
