Autonomous underwater vehicles (AUVs) are becoming standard tools for underwater exploration and seabed mapping in both scientific and industrial applications \cite{graham2022rapid, stenius2022system}. Their capacity to dive untethered allows them to reach areas inaccessible to surface vessels and to collect data closer to the seafloor, regardless of the water depth. However, their navigation autonomy remains bounded by the accuracy of their dead reckoning (DR) estimate of their global position, which is severely limited in the absence of a priori maps of the area and of GPS signal. Underwater equivalents of global localization systems do exist, such as long baseline (LBL) and ultra-short baseline (USBL) acoustic positioning. However, they rely on expensive external infrastructure, and their reliability decreases with the distance to the AUV, making them unsuitable for deep-sea surveys.
Simultaneous localization and mapping (SLAM) frameworks for autonomous navigation rely on robust data association to identify loop closures for back-end trajectory optimization. For autonomous underwater vehicles (AUVs) equipped with multibeam echosounders (MBES), data association is particularly challenging due to the scarcity of identifiable landmarks on the seabed and the low resolution of MBES data. Deep learning solutions for loop closure detection have shown excellent performance on data from more structured environments. However, their transfer to the seabed domain is not straightforward, and efforts to port them are hindered by the lack of bathymetric datasets. Thus, in this paper we propose a neural network architecture aimed at demonstrating the potential of adapting such techniques to correspondence matching in bathymetric data. We train our framework on data from an AUV mission and evaluate its performance on the tasks of loop closure detection and coarse point cloud alignment. Finally, we show its potential against a more traditional method, and we release both its implementation and the datasets used.
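The coarse point cloud alignment step evaluated above can be illustrated with a standard building block: once correspondences between two bathymetric point clouds are available (finding them is the network's job; here they are simply given), a rigid transform can be recovered in closed form with the Kabsch / orthogonal Procrustes method. This is a generic sketch, not the paper's implementation:

```python
import numpy as np

def kabsch(src, dst):
    """Rigid transform (R, t) minimizing sum ||R @ src_i + t - dst_i||^2
    over matched point pairs (Kabsch / orthogonal Procrustes)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Sanity check: recover a known rotation and translation from noiseless matches.
rng = np.random.default_rng(0)
P = rng.normal(size=(40, 3))
th = 0.4
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([1.0, -2.0, 0.5])
Q = P @ R_true.T + t_true
R_est, t_est = kabsch(P, Q)
```

With noisy or partially wrong correspondences, such a solver is typically wrapped in RANSAC; the quality of the learned correspondences determines how coarse the resulting alignment is.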
Sidescan sonar is a small and cost-effective sensing solution that can easily be mounted on most vessels. Historically, it has been used to produce high-definition images that experts may use to identify targets on the seafloor or in the water column. While solutions have been proposed to produce bathymetry from sidescan alone or in combination with multibeam, they have had limited impact. This is partly because results have mostly been limited to single sidescan lines. In this paper, we propose a modern and scalable solution to create high-quality, survey-scale bathymetry from many sidescan lines. By incorporating multiple observations of the same location, the results are improved, as the estimates reinforce each other. Our method is based on sinusoidal representation networks, a recent advance in neural representation learning. We demonstrate the scalability of the approach by producing bathymetry from a large sidescan survey. The resulting quality is demonstrated by comparison against data collected with a high-precision multibeam sensor.
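A sinusoidal representation network (SIREN) is simply an MLP with sine activations and a particular initialization, used here as a continuous function h(x, y) → depth. The following is a minimal numpy sketch of such a network's forward pass (architecture sizes and names are illustrative, not the paper's configuration):

```python
import numpy as np

def siren_layer_init(rng, fan_in, fan_out, w0, is_first):
    # SIREN init (Sitzmann et al.): U(+-sqrt(6 / fan_in) / w0),
    # except the first layer, which uses U(+-1 / fan_in).
    bound = 1.0 / fan_in if is_first else np.sqrt(6.0 / fan_in) / w0
    return rng.uniform(-bound, bound, size=(fan_in, fan_out))

class BathymetrySiren:
    """h(x, y) -> depth: a tiny sine-activated MLP as a continuous height map."""
    def __init__(self, hidden=64, depth=3, w0=30.0, seed=0):
        rng = np.random.default_rng(seed)
        dims = [2] + [hidden] * depth + [1]
        self.w0 = w0
        self.W = [siren_layer_init(rng, dims[i], dims[i + 1], w0, i == 0)
                  for i in range(len(dims) - 1)]
        self.b = [np.zeros(d) for d in dims[1:]]

    def __call__(self, xy):
        h = xy
        for W, b in zip(self.W[:-1], self.b[:-1]):
            h = np.sin(self.w0 * (h @ W + b))   # sine activation
        return h @ self.W[-1] + self.b[-1]      # one depth value per query

net = BathymetrySiren()
# Query the continuous map on a 10x10 grid of normalized coordinates.
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 10),
                            np.linspace(-1, 1, 10)), axis=-1).reshape(-1, 2)
z = net(grid)
```

Because the map is a single continuous function rather than a per-line grid, observations from many overlapping sidescan lines can all contribute gradients to the same set of weights, which is what lets the estimates reinforce each other.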
Recent advances in differentiable rendering, which allow the gradients of 2D pixel values to be computed with respect to 3D object models, make it possible to estimate model parameters through gradient-based optimization with only 2D supervision. It is easy to incorporate deep neural networks into such an optimization pipeline, allowing deep learning techniques to be leveraged. This also greatly reduces the requirement to collect and annotate 3D data, which is very difficult for applications where geometry is to be reconstructed from 2D sensors. In this work, we propose a differentiable renderer for sidescan sonar images. We further demonstrate its ability to solve the inverse problem of directly reconstructing a 3D seabed mesh from only 2D sidescan sonar data.
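The inverse problem has a simple shape: render intensities from a seafloor model, compare against observed sidescan returns, and descend the photometric loss. The toy sketch below uses a 1D height profile, a Lambertian shading model, and finite differences standing in for the analytic gradients that a true differentiable renderer would provide; none of it is the paper's renderer:

```python
import numpy as np

def render_lambertian(h, dx, s):
    """Forward model: seafloor height profile -> sidescan-like intensities.
    Lambertian shading: I = max(0, n . s), with n the unit surface normal
    and s the unit direction toward the sensor (both in the x-z plane)."""
    slope = np.gradient(h, dx)
    n = np.stack([-slope, np.ones_like(h)])
    n /= np.linalg.norm(n, axis=0)
    return np.clip(n[0] * s[0] + n[1] * s[1], 0.0, None)

def loss(h, I_obs, dx, s):
    return np.sum((render_lambertian(h, dx, s) - I_obs) ** 2)

def invert(I_obs, h0, dx, s, iters=150, lr=0.02, eps=1e-5):
    """Gradient descent on the photometric loss; finite differences play
    the role of the renderer's backward pass."""
    h = h0.copy()
    for _ in range(iters):
        base = loss(h, I_obs, dx, s)
        g = np.empty_like(h)
        for i in range(h.size):
            hp = h.copy()
            hp[i] += eps
            g[i] = (loss(hp, I_obs, dx, s) - base) / eps
        h -= lr * g
    return h

x = np.linspace(0.0, 2 * np.pi, 16)
dx = x[1] - x[0]
h_true = 0.2 * np.sin(x)
s = np.array([np.sin(np.pi / 4), np.cos(np.pi / 4)])  # sensor 45 deg off-nadir
I_obs = render_lambertian(h_true, dx, s)
h_fit = invert(I_obs, np.zeros_like(h_true), dx, s)
```

A real differentiable sonar renderer replaces the finite-difference loop with analytic gradients through the full image-formation model (mesh geometry, beam pattern, range binning), which is what makes optimizing an entire 3D seabed mesh tractable.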
Sidescan sonar intensities encode information about changes in the surface normal of the seabed. However, other factors, such as the geometry of the seabed and its material composition, also affect the return intensity. One can model these intensity changes in the forward direction, from the surface normals derived from a bathymetric map and physical properties to the measured intensities, or alternatively use an inverse model, which starts from the intensities and models the surface normals. Here, we use an inverse model that leverages deep learning's ability to learn from data: a convolutional neural network is used to estimate the surface normals from sidescan. The internal properties of the seabed are thus only learned implicitly. Once this information is estimated, a bathymetric map can be reconstructed through an optimization framework that also includes altimeter readings, which provide sparse depth profiles as constraints. Implicit neural representation learning was recently proposed to represent the bathymetric map in such an optimization framework. In this paper, we use a neural network to represent the map and optimize it under the constraints of the altimeter points and the surface normals estimated from sidescan. By fusing multiple observations of the same area from different angles across several sidescan lines, the estimated results are improved through the optimization. We demonstrate the efficiency and scalability of the approach by reconstructing high-quality bathymetry using sidescan data from a large survey. We compare the proposed data-driven inverse-model approach against modeling the sidescan formation with a forward Lambertian model. We assess the quality of each reconstruction by comparing it against data collected with a multibeam sensor. We are thereby able to discuss the strengths and weaknesses of each approach.
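The core of the optimization described above is fusing two complementary signals: dense but relative slope information (from the CNN-estimated surface normals) and sparse but absolute depths (from the altimeter). With the map discretized as a 1D grid of free heights, standing in for the paper's neural map representation, that fusion is a plain linear least-squares problem; the weight `lam` and grid setup are illustrative assumptions:

```python
import numpy as np

def fuse_slopes_and_depths(slopes, dx, anchors, lam=10.0):
    """Least-squares bathymetry profile from dense slope estimates plus
    sparse depth anchors.

    slopes[i] approximates (h[i+1] - h[i]) / dx, e.g. derived from
    CNN-estimated surface normals; anchors maps grid index -> depth,
    e.g. altimeter readings; lam weights depth vs. slope residuals."""
    n = slopes.size + 1
    rows, rhs = [], []
    for i, s in enumerate(slopes):              # dense slope constraints
        r = np.zeros(n)
        r[i], r[i + 1] = -1.0 / dx, 1.0 / dx
        rows.append(r)
        rhs.append(s)
    for j, d in anchors.items():                # sparse absolute depths
        r = np.zeros(n)
        r[j] = lam
        rows.append(r)
        rhs.append(lam * d)
    h, *_ = np.linalg.lstsq(np.vstack(rows), np.asarray(rhs), rcond=None)
    return h

# Consistent synthetic data: exact slopes plus two altimeter anchors.
dx = 0.5
x = np.arange(20) * dx
h_true = 5.0 + 0.3 * np.sin(x)
h_est = fuse_slopes_and_depths(np.diff(h_true) / dx, dx,
                               {0: h_true[0], 10: h_true[10]})
```

Slopes alone only determine the profile up to an additive constant; the sparse anchors pin down that constant, which is exactly the role the altimeter constraints play in the paper's framework.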
We propose a novel data-driven approach for high-resolution bathymetry reconstruction from sidescan. Sidescan sonar (SSS) intensities as a function of range do contain some information about the slope of the seabed. However, that information must be inferred. In addition, the navigation system provides the estimated trajectory, and the altitude along that trajectory is usually also available. From these we obtain a very coarse seabed bathymetry as input. It is then combined with the indirect but high-resolution seabed information from the sidescan to estimate the full bathymetry. The sparse depths could be acquired by single-beam echo sounders, Doppler velocity logs (DVL), other bottom-tracking sensors, or bottom-tracking algorithms applied to the sidescan itself. In our work, a fully convolutional network is used to estimate the depth contours and their uncertainties from sidescan images and sparse depths in an end-to-end fashion. The estimated depth is then used together with the range to calculate the 3D positions of points on the seafloor. A high-quality bathymetric map can be reconstructed after fusing the depth predictions with the corresponding confidence measures from the neural networks. We show the improvement of the bathymetric map gained by using the sparse depths together with sidescan, compared to using sidescan alone. We also show the benefit of confidence weighting when fusing multiple bathymetric estimates into a single map.
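The final fusion step, combining overlapping per-line depth estimates using the network's confidence outputs, can be sketched as inverse-variance weighting per grid cell. This is a generic formulation, with layout and NaN-as-no-coverage convention assumed for illustration:

```python
import numpy as np

def fuse_confidence_weighted(depths, variances):
    """Inverse-variance (confidence-weighted) fusion of overlapping depth
    estimates. depths, variances: (n_maps, H, W); NaN depth = no coverage."""
    covered = ~np.isnan(depths)
    w = np.where(covered, 1.0 / variances, 0.0)   # weight = 1 / variance
    d = np.where(covered, depths, 0.0)
    wsum = w.sum(axis=0)
    safe = np.where(wsum > 0, wsum, 1.0)          # avoid divide-by-zero
    fused = np.where(wsum > 0, (w * d).sum(axis=0) / safe, np.nan)
    fused_var = np.where(wsum > 0, 1.0 / safe, np.nan)
    return fused, fused_var

# Two overlapping 1x2 maps; the second map alone covers the right cell.
depths = np.array([[[10.0, np.nan]],
                   [[12.0, 11.0]]])
variances = np.array([[[1.0, 1.0]],
                      [[1.0, 4.0]]])
fused, fused_var = fuse_confidence_weighted(depths, variances)
```

Cells where every contributing estimate is confident end up with low fused variance, while cells covered only by uncertain estimates keep a high one, which is why confidence weighting outperforms a plain average when lines disagree.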
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
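The paper's central equivalence can be paraphrased in symbols (our phrasing, not the authors' exact statement): the posterior collapses,

```latex
p_\theta(z \mid x) = p(z) \quad \text{for (almost) all } x,
```

exactly when the latent variable is non-identifiable in the generative model, i.e. the likelihood does not depend on it,

```latex
p_\theta(x \mid z) = p_\theta(x \mid z') \quad \text{for all } z, z'.
```

One direction is immediate from Bayes' rule: if $p_\theta(x \mid z)$ is constant in $z$, then $p_\theta(z \mid x) = p_\theta(x \mid z)\, p(z) / p_\theta(x) = p(z)$; this holds for exact posteriors, which is why the phenomenon is not specific to variational approximation.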
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
In this paper we derive a PAC-Bayesian-Like error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics, in particular, they represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-Like error bound for such systems, 3) discuss various consequences of this error bound.
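For concreteness, stochastic LTI state-space models of this kind are commonly written in innovation form (standard notation, assumed rather than taken from the paper):

```latex
x_{t+1} = A x_t + B u_t + K e_t, \qquad y_t = C x_t + D u_t + e_t,
```

where $u_t$ is the input, $e_t$ a zero-mean i.i.d. innovation noise, and $(A, B, C, D, K)$ the system matrices. Unrolling the state recursion makes the connection to recurrent neural networks explicit: the state update is a linear recurrent cell, so stochastic LTI systems form a linear special case of RNNs.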