Autonomous vehicle (AV) stacks are typically built in a modular fashion, with explicit components performing detection, tracking, prediction, planning, control, etc. While modularity improves reusability, interpretability, and generalizability, it also suffers from compounding errors, information bottlenecks, and integration challenges. To overcome these challenges, a prominent approach is to convert the AV stack into an end-to-end neural network and train it with data. While such approaches have achieved impressive results, they typically lack interpretability and reusability, and they eschew principled analytical components, such as planning and control, in favor of deep neural networks. To enable the joint optimization of AV stacks while retaining modularity, we present DiffStack, a differentiable and modular stack for prediction, planning, and control. Crucially, our model-based planning and control algorithms leverage recent advancements in differentiable optimization to produce gradients, enabling optimization of upstream components, such as prediction, via backpropagation through planning and control. Our results on the nuScenes dataset indicate that end-to-end training with DiffStack yields substantial improvements in open-loop and closed-loop planning metrics by, e.g., learning to make fewer prediction errors that would affect planning. Beyond these immediate benefits, DiffStack opens up new opportunities for fully data-driven yet modular and interpretable AV architectures. Project website: https://sites.google.com/view/diffstack
translated by 谷歌翻译
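Predicting the future states of surrounding traffic participants and planning a safe, smooth, and socially compliant trajectory accordingly is crucial for autonomous vehicles. Current autonomous driving systems have two major issues: the prediction module is often decoupled from the planning module, and the cost function for planning is hard to specify and tune. To tackle these issues, we propose an end-to-end differentiable framework that integrates the prediction and planning modules and is able to learn the cost function from data. Specifically, we employ a differentiable nonlinear optimizer as the motion planner, which takes the predicted trajectories of surrounding agents given by a neural network as input and optimizes the trajectory of the autonomous vehicle, so that all operations in the framework are differentiable, including the cost function weights. The proposed framework is trained on a large-scale real-world driving dataset to imitate human driving trajectories in the entire driving scene, and is validated in both open-loop and closed-loop settings. The open-loop testing results show that the proposed method outperforms the baseline methods across a variety of metrics and delivers planning-centric prediction results, allowing the planning module to output close-to-human trajectories. In closed-loop testing, the proposed method shows the ability to handle complex urban driving scenarios and robustness against the distributional shift that imitation learning methods suffer from. Importantly, we find that joint training of the planning and prediction modules achieves better performance than planning with a separately trained prediction module in both open-loop and closed-loop tests. Moreover, the ablation study indicates that the learnable components in the framework are essential to ensure planning stability and performance.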
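To make the idea of learning cost weights through a differentiable planner concrete, the following is a minimal sketch and not the authors' implementation. It swaps the nonlinear optimizer for a hypothetical one-dimensional quadratic planner with a closed-form solution, so the gradients can be written by hand. The names `plan`, `imitation_grads`, `train_weight`, and all numeric values are assumptions made for illustration.

```python
def plan(goal, pred_lead, gap, w):
    """Closed-form minimiser of (x - goal)^2 + w * (x - (pred_lead - gap))^2.

    x is the ego position at the planning horizon (hypothetical 1-D setup),
    pred_lead is the predicted position of a lead vehicle, and w > 0 is a
    learnable cost weight on keeping the desired gap.
    """
    target = pred_lead - gap
    return (goal + w * target) / (1.0 + w)


def imitation_grads(goal, pred_lead, gap, w, x_human):
    """Gradients of the imitation loss (plan - x_human)^2.

    Returns (dL/dw, dL/dpred_lead): the same chain rule reaches both the
    cost weight and the upstream prediction.
    """
    x = plan(goal, pred_lead, gap, w)
    dloss_dx = 2.0 * (x - x_human)
    dx_dw = ((pred_lead - gap) - goal) / (1.0 + w) ** 2
    dx_dpred = w / (1.0 + w)
    return dloss_dx * dx_dw, dloss_dx * dx_dpred


def train_weight(goal, pred_lead, gap, x_human, w=1.0, lr=0.05, steps=1000):
    """Plain gradient descent on the cost weight, kept strictly positive."""
    for _ in range(steps):
        grad_w, _ = imitation_grads(goal, pred_lead, gap, w, x_human)
        w = max(1e-3, w - lr * grad_w)
    return w


if __name__ == "__main__":
    # Illustrative numbers only: the human stopped at x = 22.
    w = train_weight(goal=30.0, pred_lead=25.0, gap=5.0, x_human=22.0)
    print(w, plan(30.0, 25.0, 5.0, w))
```

In this toy setting the learned weight moves the plan toward the demonstrated position, and the second gradient shows how an imitation loss could also shape the prediction that feeds the planner. A real system would differentiate through an iterative nonlinear solver rather than a closed form.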
Making safe and human-like decisions is an essential capability of autonomous driving systems and learning-based behavior planning is a promising pathway toward this objective. Distinguished from existing learning-based methods that directly output decisions, this work introduces a predictive behavior planning framework that learns to predict and evaluate from human driving data. Concretely, a behavior generation module first produces a diverse set of candidate behaviors in the form of trajectory proposals. Then the proposed conditional motion prediction network is employed to forecast other agents' future trajectories conditioned on each trajectory proposal. Given the candidate plans and associated prediction results, we learn a scoring module to evaluate the plans using maximum entropy inverse reinforcement learning (IRL). We conduct comprehensive experiments to validate the proposed framework on a large-scale real-world urban driving dataset. The results reveal that the conditional prediction model is able to forecast multiple possible future trajectories given a candidate behavior and the prediction results are reactive to different plans. Moreover, the IRL-based scoring module can properly evaluate the trajectory proposals and select close-to-human ones. The proposed framework outperforms other baseline methods in terms of similarity to human driving trajectories. Moreover, we find that the conditional prediction model can improve both prediction and planning performance compared to the non-conditional model, and learning the scoring module is critical to correctly evaluating the candidate plans to align with human drivers.
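The scoring step can be illustrated with a small sketch of maximum entropy IRL over a finite set of candidate plans. This is an assumption-laden toy, not the paper's scoring module: it assumes a linear cost over hand-picked plan features, and the function names `plan_probabilities` and `irl_step` as well as the feature values are hypothetical.

```python
import math


def plan_probabilities(features, weights):
    """Max-entropy distribution over candidate plans: p_i proportional to exp(-w . f_i)."""
    costs = [sum(w * f for w, f in zip(weights, feat)) for feat in features]
    lowest = min(costs)  # shift costs for numerical stability
    exps = [math.exp(-(c - lowest)) for c in costs]
    total = sum(exps)
    return [e / total for e in exps]


def irl_step(features, human_idx, weights, lr=0.1):
    """One gradient step on the negative log-likelihood of the human-like plan.

    The gradient is f_human - E_p[f]: features of the demonstrated plan
    minus the expected features under the current plan distribution.
    """
    probs = plan_probabilities(features, weights)
    n_feat = len(weights)
    expected = [
        sum(p * feat[k] for p, feat in zip(probs, features)) for k in range(n_feat)
    ]
    grad = [features[human_idx][k] - expected[k] for k in range(n_feat)]
    return [w - lr * g for w, g in zip(weights, grad)]


if __name__ == "__main__":
    # Hypothetical features per candidate plan: [discomfort, collision risk].
    candidates = [[1.0, 0.2], [0.3, 0.1], [0.2, 0.9]]
    weights = [0.0, 0.0]
    for _ in range(200):
        weights = irl_step(candidates, human_idx=1, weights=weights)
    print(plan_probabilities(candidates, weights))
```

Starting from zero weights, every plan is equally likely. Repeated steps raise the probability of the plan closest to the human demonstration, which is the behavior a learned scoring module is meant to show.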
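Simulation is the key to scaling up validation and verification for robotic systems such as autonomous vehicles. Despite advances in high-fidelity physics and sensor simulation, a critical gap remains in simulating realistic behaviors of road users. This is because, unlike simulating physics and graphics, devising first-principles models of human-like behavior is generally infeasible. In this work, we take a data-driven approach and propose a method that can learn to generate traffic behaviors from real-world driving logs. The method achieves high sample efficiency and behavior diversity by exploiting the bi-level hierarchy of driving behaviors, decoupling the traffic simulation problem into high-level intent inference and low-level driving behavior imitation. The method also incorporates a planning module to obtain stable long-horizon behaviors. We empirically validate our method, named Bi-level Imitation for Traffic Simulation (BITS), with scenarios from two large-scale driving datasets and show that BITS achieves balanced traffic simulation performance in realism, diversity, and long-horizon stability. We also explore ways to evaluate behavior realism and introduce a suite of evaluation metrics for traffic simulation. Finally, as part of our core contributions, we develop and open-source a software tool that unifies data formats across different driving datasets and converts scenes from existing datasets into interactive simulation environments. For additional information and videos, see https://sites.google.com/view/nvr-bits2022/home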
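Evaluating and improving planning for autonomous vehicles requires scalable generation of long-tail traffic scenarios. To be useful, these scenarios must be realistic and challenging, but not impossible to drive through safely. In this work, we introduce STRIVE, a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, such as collisions. To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE. Scenario generation is formulated as an optimization in the latent space of this traffic model, perturbing an initial real-world scene to produce trajectories that collide with a given planner. A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful for improving the given planner. Further analysis clusters the generated scenarios based on collision type. We attack two planners and show that STRIVE successfully generates realistic, challenging scenarios in both cases. We additionally "close the loop" and use these scenarios to optimize the hyperparameters of a rule-based planner.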
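In this work, we propose the world's first closed-loop ML-based planning benchmark for autonomous driving. While there is a growing body of ML-based motion planners, the lack of established datasets and metrics has limited progress in this area. Existing benchmarks for autonomous vehicle motion prediction have focused on short-term motion forecasting rather than long-term planning. This has led previous works to use open-loop evaluation with L2-based metrics, which are not suitable for fairly evaluating long-term planning. Our benchmark overcomes these limitations by introducing a large-scale driving dataset, a lightweight closed-loop simulator, and motion-planning-specific metrics. We provide a high-quality dataset with 1500h of human driving data from 4 cities across the US and Asia with widely varying traffic patterns (Boston, Pittsburgh, Las Vegas, and Singapore). We will provide a closed-loop simulation framework with reactive agents, along with a large set of both general and scenario-specific planning metrics. We plan to release the dataset at NeurIPS 2021 and to organize benchmark challenges starting in early 2022.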
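Trajectory prediction is essential for autonomous vehicles (AVs) to plan correct and safe driving behaviors. While many prior works aim to achieve higher prediction accuracy, few study the adversarial robustness of their methods. To bridge this gap, we propose to study the adversarial robustness of data-driven trajectory prediction systems. We devise an optimization-based adversarial attack framework that leverages a carefully designed differentiable dynamic model to generate realistic adversarial trajectories. Empirically, we benchmark the adversarial robustness of state-of-the-art prediction models and show that our attack increases the prediction error by more than 50% on general metrics and 37% on planning-aware metrics. We also show that our attack can lead an AV to drive off the road or collide with other vehicles in simulation. Finally, we demonstrate how to mitigate the adversarial attacks using an adversarial training scheme.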
The goal of autonomous vehicles is to navigate public roads safely and comfortably. To enforce safety, traditional planning approaches rely on handcrafted rules to generate trajectories. Machine learning-based systems, on the other hand, scale with data and are able to learn more complex behaviors. However, they often ignore that agents and self-driving vehicle trajectory distributions can be leveraged to improve safety. In this paper, we propose modeling a distribution over multiple future trajectories for both the self-driving vehicle and other road agents, using a unified neural network architecture for prediction and planning. During inference, we select the planning trajectory that minimizes a cost taking into account safety and the predicted probabilities. Our approach does not depend on any rule-based planners for trajectory generation or optimization, improves with more training data and is simple to implement. We extensively evaluate our method through a realistic simulator and show that the predicted trajectory distribution corresponds to different driving profiles. We also successfully deploy it on a self-driving vehicle on urban public roads, confirming that it drives safely without compromising comfort. The code for training and testing our model on a public prediction dataset and the video of the road test are available at https://woven.mobi/safepathnet
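The inference-time selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the released SafePathNet code: the cost form (a probability-weighted safety violation minus a weighted log-probability of the ego mode), the function names `min_distance` and `select_plan`, and the parameters `safe_dist` and `prob_weight` are all hypothetical.

```python
import math


def min_distance(traj_a, traj_b):
    """Smallest same-timestep distance between two (x, y) trajectories."""
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


def select_plan(ego_modes, agent_modes, safe_dist=2.0, prob_weight=1.0):
    """Return (index, cost) of the ego trajectory mode with the lowest cost.

    ego_modes and agent_modes are lists of (probability, trajectory) pairs,
    with probabilities assumed to be strictly positive.
    """
    best_idx, best_cost = None, float("inf")
    for idx, (p_ego, ego_traj) in enumerate(ego_modes):
        # Expected safety violation, weighted by each agent mode's probability.
        risk = sum(
            p_agent * max(0.0, safe_dist - min_distance(ego_traj, agent_traj))
            for p_agent, agent_traj in agent_modes
        )
        # Likely ego modes are cheaper; unsafe ones are penalised.
        cost = risk - prob_weight * math.log(p_ego)
        if cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost


if __name__ == "__main__":
    ego = [
        (0.6, [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]),  # keep lane
        (0.4, [(0.0, 0.0), (5.0, 2.0), (10.0, 4.0)]),  # change lane
    ]
    agents = [
        (0.8, [(10.0, 0.0), (10.0, 0.0), (10.0, 0.0)]),  # lead car stays stopped
        (0.2, [(10.0, 0.0), (15.0, 0.0), (20.0, 0.0)]),  # lead car drives off
    ]
    print(select_plan(ego, agents))
```

With these made-up numbers the less likely lane-change mode is selected, because the more likely lane-keeping mode runs into the probable stopped-vehicle future. Raising `prob_weight` shifts the balance back toward the most probable ego mode.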
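In this paper, we present a system that trains driving policies from experiences collected not only from the ego vehicle but from all vehicles that it observes. This system uses the behaviors of other agents to create more diverse driving scenarios without collecting additional data. The main difficulty in learning from other vehicles is that there is no sensor information for them. We use a set of supervisory tasks to learn an intermediate representation that is invariant to the viewpoint of the controlling vehicle. This not only provides a richer signal at training time but also allows more complex reasoning during inference. Learning how all vehicles drive helps predict their behavior at test time and avoid collisions. We evaluate this system in closed-loop driving simulations. Our system outperforms all prior methods on the public CARLA Leaderboard by a wide margin, improving the driving score by 25 and the route completion rate by 24 points. Our method won the 2021 CARLA Autonomous Driving Challenge. Code and data are available at https://github.com/dotchen/lav.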
Accurately predicting interactive road agents' future trajectories and planning a socially compliant and human-like trajectory accordingly are important for autonomous vehicles. In this paper, we propose a planning-centric prediction neural network, which takes surrounding agents' historical states and map context information as input, and outputs the joint multi-modal prediction trajectories for surrounding agents, as well as a sequence of control commands for the ego vehicle by imitation learning. An agent-agent interaction module along the time axis is proposed in our network architecture to better comprehend the relationship among all the other intelligent agents on the road. To incorporate the map's topological information, a Dynamic Graph Convolutional Neural Network (DGCNN) is employed to process the road network topology. Besides, the whole architecture can serve as a backbone for the Differentiable Integrated motion Prediction with Planning (DIPP) method by providing accurate prediction results and initial planning commands. Experiments are conducted on real-world datasets to demonstrate the improvements made by our proposed method in both planning and prediction accuracy compared to the previous state-of-the-art methods.
ML-based motion planning is a promising approach to produce agents that exhibit complex behaviors, and automatically adapt to novel environments. In the context of autonomous driving, it is common to treat all available training data equally. However, this approach produces agents that do not perform robustly in safety-critical settings, an issue that cannot be addressed by simply adding more data to the training set - we show that an agent trained using only a 10% subset of the data performs just as well as an agent trained on the entire dataset. We present a method to predict the inherent difficulty of a driving situation given data collected from a fleet of autonomous vehicles deployed on public roads. We then demonstrate that this difficulty score can be used in a zero-shot transfer to generate curricula for an imitation-learning based planning agent. Compared to training on the entire unbiased training dataset, we show that prioritizing difficult driving scenarios both reduces collisions by 15% and increases route adherence by 14% in closed-loop evaluation, all while using only 10% of the training data.
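In modern autonomy stacks, prediction modules are paramount to planning motions in the presence of other mobile agents. However, failures in prediction modules can mislead the downstream planner into making unsafe decisions. Indeed, the high uncertainty inherent to the task of trajectory forecasting ensures that such mispredictions occur frequently. Motivated by the need to improve the safety of autonomous vehicles without compromising their performance, we develop a probabilistic run-time monitor that detects when a "harmful" prediction failure occurs, i.e., a task-relevant failure detector. We achieve this by propagating trajectory prediction errors to the planning cost in order to reason about their impact on the AV. Furthermore, our detector comes equipped with performance measures on the false-positive and false-negative rates and allows for data-free calibration. In our experiments, we compared our detector with various others and found that it has the highest area under the receiver operating characteristic curve.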
With the development of deep representation learning, the domain of reinforcement learning (RL) has become a powerful learning framework now capable of learning complex policies in high dimensional environments. This review summarises deep reinforcement learning (DRL) algorithms and provides a taxonomy of automated driving tasks where (D)RL methods have been employed, while addressing key computational challenges in real world deployment of autonomous driving agents. It also delineates adjacent domains such as behavior cloning, imitation learning, inverse reinforcement learning that are related but are not classical RL algorithms. The role of simulators in training agents, methods to validate, test and robustify existing solutions in RL are discussed.
Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.
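Neural network-based driving planners have shown great promise in improving the task performance of autonomous driving. However, ensuring the safety of systems with neural network-based components is critical yet very challenging, especially in dense and highly interactive traffic environments. In this work, we propose a safety-driven interactive planning framework for neural network-based lane changing. To prevent overly conservative planning, we identify the driving behavior of surrounding vehicles and assess their aggressiveness, and then adapt the planned trajectory for the ego vehicle accordingly in an interactive manner. The ego vehicle can proceed to change lanes if a safe evasion trajectory exists even in the predicted worst case; otherwise, it can stay around the current lateral position or return to the original lane. We quantitatively demonstrate the effectiveness of our planner design and its advantage over baseline methods through extensive simulations with diverse and comprehensive experimental settings, as well as in real-world scenarios collected by an autonomous vehicle company.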
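We address the problem of ego-vehicle navigation in dense simulated traffic environments populated by road agents with varying driver behaviors. Navigation in such environments is challenging because the agents' heterogeneous behaviors make their actions unpredictable. We present a novel simulation technique that enriches existing traffic simulators with behavior-rich trajectories corresponding to varying levels of aggressiveness. We generate these trajectories with the help of a driver behavior modeling algorithm. We then use the enriched simulator to train a deep reinforcement learning (DRL) policy that consists of a set of high-level vehicle control commands, and use this policy at test time to perform local navigation in dense traffic. Our policy implicitly models the interactions between traffic agents and computes safe trajectories for the ego vehicle that account for aggressive driver maneuvers such as overtaking, over-speeding, weaving, and sudden lane changes. Our enhanced behavior-rich simulator can be used to generate datasets consisting of trajectories that correspond to diverse driver behaviors and traffic densities, and our behavior-based navigation scheme can be combined with state-of-the-art navigation algorithms.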
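Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data and therefore constitutes a promising approach for real-world applications such as automated driving. Self-driving vehicles (SDVs) learn a policy that can potentially even outperform the behavior in the sub-optimal dataset. Especially in safety-critical applications such as automated driving, explainability and transferability are key to success. This motivates the use of model-based offline RL approaches, which leverage planning. However, current state-of-the-art methods often neglect the influence of the uncertainty arising from the stochastic behavior of multi-agent systems. This work proposes a novel approach for uncertainty-aware model-based offline reinforcement learning leveraging planning (UMBRELLA), which solves the prediction, planning, and control problem of the SDV jointly in an interpretable learning-based fashion. A trained action-conditioned stochastic dynamics model captures distinctively different future evolutions of the traffic scene. The analysis provides empirical evidence for the effectiveness of our approach in challenging automated driving simulations and on a real-world public dataset.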
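To ensure user acceptance of autonomous vehicles (AVs), control systems are being developed to mimic the driving behavior of human drivers. Imitation learning (IL) algorithms serve this purpose, but struggle to provide safety guarantees on the resulting closed-loop system trajectories. Model predictive control (MPC), on the other hand, can handle nonlinear systems with safety constraints, but realizing human-like driving with it requires extensive domain knowledge. This work proposes a seamless combination of the two techniques to learn safe AV controllers from demonstrations of desired driving behavior, by using MPC as a differentiable control layer within a hierarchical IL policy. With this strategy, IL is performed in closed loop and end to end, through the parameters of the MPC cost, model, or constraints. Experimental results of this methodology are analyzed for the design of a lane-keeping control system learned via behavioral cloning from observations (BCO), given human demonstrations on a fixed-base driving simulator.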
Modern autonomous driving systems are characterized as modular tasks in sequential order, i.e., perception, prediction, and planning. As sensors and hardware improve, there is a growing trend toward devising a system that can perform a wide diversity of tasks to achieve higher-level intelligence. Contemporary approaches resort to either deploying standalone models for individual tasks or designing a multi-task paradigm with separate heads. These might suffer from accumulative error or negative transfer effects. Instead, we argue that a favorable algorithm framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning of the self-driving car. Oriented at this goal, we revisit the key components within perception and prediction. We analyze each module and prioritize the tasks hierarchically, such that all these tasks contribute to planning (the goal). To this end, we introduce Unified Autonomous Driving (UniAD), the first comprehensive framework to date that incorporates full-stack driving tasks in one network. It is carefully devised to leverage the advantages of each module and to provide complementary feature abstractions for agent interaction from a global perspective. Tasks communicate through a unified query design to facilitate each other toward planning. We instantiate UniAD on the challenging nuScenes benchmark. Extensive ablations show that this design philosophy surpasses previous state-of-the-art methods by a large margin in all aspects. The full codebase and models are to be made available to facilitate future research in the community.
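Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme, called ST-P3, that produces a set of more representative features for perception, prediction, and planning tasks simultaneously. Specifically, an egocentric accumulation technique is proposed to preserve geometric information in 3D space before the bird's-eye-view transformation for perception; a dual-pathway model is designed to take past motion variations into account for future prediction; and a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-art methods on both the open-loop nuScenes dataset and closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model, and protocol details are publicly available at https://github.com/openperceptionx/st-p3.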