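This paper presents a novel approach for autonomous cooperating UAVs in search and rescue operations in subterranean domains with complex topology. As part of the CTU-CRAS-NORLAB team, the proposed system ranked second in the Virtual Track of the DARPA SubT Finals. In contrast to the winning solution, which was developed specifically for the Virtual Track, the proposed solution also proved to be a robust system for deployment onboard physical UAVs flying in the extremely harsh and confined environment of the real-world competition. The proposed approach enables fully autonomous and decentralized deployment of a UAV team with seamless simulation-to-world transfer, and demonstrates its advantage over less mobile UGV teams in the flyable space of diverse environments. The main contributions of the paper lie in the mapping and navigation pipelines. The mapping approach employs novel map representations: one for efficient risk-aware long-distance planning, one oriented toward coverage, and the compressed topological LTVMap, which allows multi-robot cooperation under low-bandwidth communication. These representations are used in navigation together with novel methods for visibility-constrained informed search in a general 3D environment, with no assumptions about the environment structure, while balancing deep exploration against sensor-coverage exploitation. The proposed solution also includes a visual-perception pipeline for onboard detection and localization of objects of interest in four RGB streams at 5 Hz without a dedicated GPU. Apart from participation in DARPA SubT, the performance of the UAV system is supported by extensive experimental verification in diverse environments, with both qualitative and quantitative evaluation.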
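Visual reconstruction of tomato plants by a robot is extremely challenging due to the high levels of variation and occlusion in greenhouse environments. The active-vision paradigm helps overcome these challenges by reasoning about previously acquired information and systematically planning camera viewpoints to gather new information about the plant. However, existing active-vision algorithms do not perform well on targeted perception objectives, such as the 3D reconstruction of leaf nodes, because they do not distinguish between the plant parts that need to be reconstructed and the rest of the plant. In this paper, we propose an attention-driven active-vision algorithm that considers only the plant parts relevant to the task at hand. The proposed approach was evaluated in a simulated environment on the task of 3D reconstruction of tomato plants at varying levels of attention, namely the whole plant, the main stem, and the leaf nodes. Compared with the predefined and random approaches, our approach improves the accuracy of 3D reconstruction by 9.7% and 5.3% for the whole plant, 14.2% and 7.9% for the main stem, and 25.9% and 17.3% for the leaf nodes, respectively, within the first 3 viewpoints. Likewise, compared with the predefined and random approaches, our approach reconstructs 80% of the whole plant and of the main stem in 1 fewer viewpoint, and 80% of the leaf nodes in 3 fewer viewpoints. We also demonstrate that the attention-driven NBV planner works effectively despite changes to the plant models, the amount of occlusion, the number of candidate viewpoints, and the reconstruction resolution. By adding an attention mechanism to active vision, it is possible to efficiently reconstruct both the whole plant and targeted plant parts. We conclude that an attention mechanism for active vision is necessary to significantly improve the quality of perception in complex agro-food environments.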
Figure 1: Example output from our system, generated in real-time with a handheld Kinect depth camera and no other sensing infrastructure. Normal maps (colour) and Phong-shaded renderings (greyscale) from our dense reconstruction system are shown. On the left for comparison is an example of the live, incomplete, and noisy data from the Kinect sensor (used as input to our system).
This paper presents trajectory planning for three-dimensional autonomous multi-UAV volume coverage and visual inspection based on the Heat Equation Driven Area Coverage (HEDAC) algorithm. The method designs a potential field to achieve the target density and generate trajectories using potential gradients to direct UAVs to regions of a higher potential. Collisions are prevented by implementing a distance field and correcting the agent's directional vector if the distance threshold is reached. The method is successfully tested for volume coverage and visual inspection of complex structures such as wind turbines and a bridge. For visual inspection, the algorithm is supplemented with camera direction control. A field containing the nearest distance from any point in the domain to the structure is designed and this field's gradient provides the camera orientation throughout the trajectory. The bridge inspection test case is compared with a state-of-the-art method where the HEDAC algorithm allowed more surface area to be inspected under the same conditions. The limitations of the HEDAC method are analyzed, focusing on computational efficiency and adequacy of spatial coverage to approximate the surface coverage. The proposed methodology offers flexibility in various setup parameters and is applicable to real-world inspection tasks.
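The steering rule described in this abstract (follow the gradient of a potential field, and correct the direction vector once a distance threshold is reached) can be illustrated with a minimal 2D sketch. It does not solve the heat equation that HEDAC uses to build its potential: the quadratic `potential`, the circular obstacle behind `distance`, and the parameters `speed` and `d_min` are all assumptions chosen for illustration only.

```python
import math

def gradient(field, x, y, h=1e-4):
    """Central-difference gradient of a scalar field(x, y)."""
    gx = (field(x + h, y) - field(x - h, y)) / (2 * h)
    gy = (field(x, y + h) - field(x, y - h)) / (2 * h)
    return gx, gy

def normalize(vx, vy):
    n = math.hypot(vx, vy)
    return (vx / n, vy / n) if n > 1e-12 else (0.0, 0.0)

# Hypothetical scene: one region of high potential and one circular obstacle.
GOAL = (1.0, 0.0)
OBSTACLE, RADIUS = (0.5, 0.0), 0.1

def potential(x, y):
    """Assumed stand-in potential, highest at GOAL (not a heat-equation solution)."""
    return -((x - GOAL[0]) ** 2 + (y - GOAL[1]) ** 2)

def distance(x, y):
    """Distance field: signed distance to the obstacle surface."""
    return math.hypot(x - OBSTACLE[0], y - OBSTACLE[1]) - RADIUS

def step(pos, speed=0.05, d_min=0.2):
    # Head toward higher potential.
    dx, dy = normalize(*gradient(potential, *pos))
    if distance(*pos) < d_min:
        # The distance-field gradient points away from the obstacle.
        nx, ny = normalize(*gradient(distance, *pos))
        inward = dx * nx + dy * ny
        if inward < 0.0:
            # Remove the component that points toward the obstacle.
            dx, dy = normalize(dx - inward * nx, dy - inward * ny)
    return pos[0] + speed * dx, pos[1] + speed * dy

def simulate(start=(0.0, 0.05), n_steps=200):
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1]))
    return path
```

Calling `simulate()` returns the list of visited positions; in this toy scene the agent slides around the obstacle instead of passing through it. Projecting out the inward component is only one possible form of direction correction, and the abstract does not specify which one the authors use.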
Physically based rendering of complex scenes can be prohibitively costly with a potentially unbounded and uneven distribution of complexity across the rendered image. The goal of an ideal level of detail (LoD) method is to make rendering costs independent of the 3D scene complexity, while preserving the appearance of the scene. However, current prefiltering LoD methods are limited in the appearances they can support due to their reliance on approximate models and other heuristics. We propose the first comprehensive multi-scale LoD framework for prefiltering 3D environments with complex geometry and materials (e.g., the Disney BRDF), while maintaining the appearance with respect to the ray-traced reference. Using a multi-scale hierarchy of the scene, we perform a data-driven prefiltering step to obtain an appearance phase function and directional coverage mask at each scale. At the heart of our approach is a novel neural representation that encodes this information into a compact latent form that is easy to decode inside a physically based renderer. Once a scene is baked out, our method requires no original geometry, materials, or textures at render time. We demonstrate that our approach compares favorably to state-of-the-art prefiltering methods and achieves considerable savings in memory for complex scenes.
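This paper presents the CERBERUS robotic system-of-systems, which won the DARPA Subterranean Challenge Final Event. [...] present for robotic autonomy. Due to their geometric complexity, degraded perceptual conditions combined with the lack of GPS support, austere navigation conditions, and denied communications, subterranean settings make autonomous operation particularly demanding. In response to this challenge, we developed the CERBERUS system, which exploits the synergy of legged and flying robots, coupled with robust control (especially for overcoming perilous terrain), multi-modal and multi-robot perception for localization and mapping under sensor degradation, and resilient autonomy through unified exploration path planning and local motion planning that reflects robot-specific limitations. Based on its ability to explore diverse underground environments and on its high-level command and control, CERBERUS demonstrated efficient exploration, reliable detection of objects of interest, and accurate mapping. In this paper, we report results from both the preliminary runs and the final Prize Round of the DARPA Subterranean Challenge, and discuss the highlights and challenges faced, along with lessons learned for the benefit of the community.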
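Next Best View (NBV) computation is a long-standing problem in robotics. It consists of identifying the next most informative sensor position(s) for reconstructing a 3D object or scene efficiently and accurately. Like most current methods, we consider NBV prediction from a depth sensor. Learning-based methods that rely on a volumetric representation of the scene are suitable for path planning, but they scale poorly with the size of the scene and are less accurate than methods that use a surface-based representation. The latter, however, constrain the camera to a small number of poses. To obtain the advantages of both representations, we show that surface metrics can be maximized by Monte Carlo integration over a volumetric representation. Our method scales to large scenes and handles free camera motion: it takes as input an arbitrarily large point cloud gathered by a depth sensor such as a LiDAR system, together with camera poses, to predict the NBV. We demonstrate our approach on a novel dataset made of large and complex 3D scenes.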
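To make the idea of Monte Carlo integration over a volume concrete, here is a minimal sketch that scores candidate cameras by sampling points uniformly in the unit cube. It is not the learned model or the surface metric of the paper: the `occupancy` blob, the cone-shaped `visible` test, and the `(position, unit_direction)` camera tuple are hypothetical stand-ins introduced only for this example.

```python
import math
import random

def mc_integrate(f, n_samples, rng):
    """Estimate the integral of f over the unit cube [0, 1]^3 (volume 1)."""
    total = 0.0
    for _ in range(n_samples):
        p = (rng.random(), rng.random(), rng.random())
        total += f(p)
    return total / n_samples

def occupancy(p):
    """Hypothetical occupancy probability: a soft blob at the cube centre."""
    r2 = sum((c - 0.5) ** 2 for c in p)
    return math.exp(-r2 / 0.02)

def visible(camera, p, cos_half_fov=0.5):
    """Hypothetical visibility test: p lies inside the camera's viewing cone."""
    position, direction = camera  # direction is assumed to be a unit vector
    v = [a - b for a, b in zip(p, position)]
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return False
    cos_angle = sum(a * b for a, b in zip(v, direction)) / norm
    return cos_angle > cos_half_fov

def score(camera, n_samples=5000, seed=0):
    """Monte Carlo estimate of the occupied volume this camera would see."""
    rng = random.Random(seed)  # same samples for every camera, for a fair comparison
    return mc_integrate(
        lambda p: occupancy(p) if visible(camera, p) else 0.0, n_samples, rng
    )

def next_best_view(cameras):
    return max(cameras, key=score)
```

For example, with `toward = ((0.5, 0.5, 0.0), (0.0, 0.0, 1.0))` and `away = ((0.5, 0.5, 0.0), (0.0, 0.0, -1.0))`, `next_best_view([away, toward])` picks `toward`, since only that camera's cone contains the occupied region. This sketch ignores occlusion entirely.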
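Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take representations of geometry and material properties as input. Collectively, these inputs define the actual scene and what is rendered, and are referred to as the scene representation (where a scene consists of one or more objects). Example scene representations are triangle meshes with accompanying textures (e.g., created by an artist), point clouds (e.g., from a depth sensor), volumetric grids (e.g., from a CT scan), or implicit surface functions (e.g., truncated signed distance fields). The reconstruction of such a scene representation from observations using differentiable rendering losses is known as inverse graphics or inverse rendering. Neural rendering is closely related, and combines ideas from classical computer graphics and machine learning to create algorithms for synthesizing images from real-world observations. Neural rendering is a leap forward toward the goal of synthesizing photo-realistic image and video content. In recent years, this field has seen immense progress through hundreds of publications that show different ways to inject learnable components into the rendering pipeline. This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, now often referred to as neural scene representations. A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel viewpoint synthesis of a captured scene. In addition to methods that handle static scenes, we also cover neural scene representations for modeling non-rigidly deforming objects...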
Intelligent mesh generation (IMG) refers to techniques that generate meshes by machine learning, a relatively new and promising research field. Within its short life span, IMG has greatly expanded the generalizability and practicality of mesh generation techniques and brought many breakthroughs and new possibilities for mesh generation. However, there is a lack of surveys focusing on IMG methods that cover recent works. In this paper, we present a systematic and comprehensive survey describing the contemporary IMG landscape. Focusing on 110 preliminary IMG methods, we conducted an in-depth analysis and evaluation from multiple perspectives, including the core technique and application scope of the algorithm, agent learning goals, data types, targeting challenges, advantages and limitations. With the aim of literature collection and classification based on content extraction, we propose three different taxonomies from three views: key technique, output mesh unit element, and applicable input data types. Finally, we highlight some promising future research directions and challenges in IMG. For the convenience of readers, a project page for IMG is provided at https://github.com/xzb030/IMG_Survey.
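We propose a novel approach to robot-operated active understanding of unknown indoor scenes, based on online RGBD reconstruction with semantic segmentation. In our method, the exploratory robot scanning is both driven by and targeted at the recognition and segmentation of semantic objects in the scene. Our algorithm is built on top of a volumetric depth fusion framework (e.g., KinectFusion) and performs real-time voxel-based semantic labeling over the online reconstructed volume. The robot is guided by an online-estimated discrete viewing score field (VSF) parameterized over the 3D space of 2D location and azimuth rotation. For each grid cell, the VSF stores the score of the corresponding view, which measures how much that view reduces the uncertainty (entropy) of both the geometric reconstruction and the semantic labeling. Based on the VSF, we select the next best view (NBV) as the target at each time step. We then jointly optimize the traverse path and the camera trajectory between two adjacent NBVs by maximizing the integral viewing score (information gain) along the path and trajectory. Through extensive evaluation, we show that our method achieves efficient and accurate online scene parsing during exploratory scanning.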
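The viewing score field described above can be pictured as a lookup table over `(x, y, azimuth)` cells from which the highest-scoring view is chosen. The sketch below is a simplified illustration, not the authors' implementation: it uses the summed Shannon entropy of the voxels a view would observe as a proxy for the uncertainty that view could remove, and the `vsf_cells` mapping from a cell to the voxel ids it sees is a hypothetical data structure.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def view_score(voxel_ids, label_probs):
    """Proxy score: total label entropy of the voxels seen from a view."""
    return sum(entropy(label_probs[v]) for v in voxel_ids)

def next_best_view(vsf_cells, label_probs):
    """Return the (x, y, azimuth) cell with the highest score.

    vsf_cells maps each cell to the voxel ids visible from it (assumed to be
    precomputed, e.g. by ray casting into the reconstructed volume).
    """
    return max(vsf_cells, key=lambda cell: view_score(vsf_cells[cell], label_probs))

# Toy data: three voxels with two-class label distributions.
label_probs = {
    0: [0.5, 0.5],    # highly uncertain
    1: [0.99, 0.01],  # nearly certain
    2: [1.0, 0.0],    # certain
}
vsf_cells = {
    (0, 0, 0): [1, 2],   # this view sees only well-labeled voxels
    (1, 0, 90): [0, 1],  # this view sees the uncertain voxel
}
best = next_best_view(vsf_cells, label_probs)
```

Here `best` is `(1, 0, 90)`, the only view that covers the uncertain voxel. A full system would also account for geometric uncertainty and for travel cost between views, which this sketch leaves out.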
Tendon-driven robots, where one or more tendons under tension bend and manipulate a flexible backbone, can improve minimally invasive surgeries involving difficult-to-reach regions in the human body. Planning motions safely within constrained anatomical environments requires accuracy and efficiency in shape estimation and collision checking. Tendon robots that employ arbitrarily-routed tendons can achieve complex and interesting shapes, enabling them to travel to difficult-to-reach anatomical regions. Arbitrarily-routed tendon-driven robots have unintuitive nonlinear kinematics. Therefore, we envision clinicians leveraging an assistive interactive-rate motion planner to automatically generate collision-free trajectories to clinician-specified destinations during minimally-invasive surgical procedures. Standard motion-planning techniques cannot achieve interactive-rate motion planning with the current expensive tendon robot kinematic models. In this work, we present a 3-phase motion-planning system for arbitrarily-routed tendon-driven robots with a Precompute phase, a Load phase, and a Supervisory Control phase. Our system achieves an interactive rate by developing a fast kinematic model (over 1,000 times faster than current models), a fast voxel collision method (27.6 times faster than standard methods), and leveraging a precomputed roadmap of the entire robot workspace with pre-voxelized vertices and edges. In simulated experiments, we show that our motion-planning method achieves high tip-position accuracy and generates plans at 14.8 Hz on average in a segmented collapsed lung pleural space anatomical environment. Our results show that our method is 17,700 times faster than popular off-the-shelf motion planning algorithms with standard FK and collision detection approaches. Our open-source code is available online.
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast-track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack with higher-level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervisor to oversee the mission loosely and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and reconfigurable communications.