In the field of autonomous robots, reinforcement learning (RL) is an increasingly used method to solve the task of dynamic obstacle avoidance for mobile robots, autonomous ships, and drones. A common practice for training such agents is a training environment with random initialization of the agent and obstacles. Such approaches may suffer from low coverage of high-risk scenarios during training, impairing the final obstacle-avoidance performance. This paper proposes a general training environment in which we gain control over the difficulty of the obstacle-avoidance task by using short training episodes and assessing the difficulty with two metrics: the number of obstacles and a collision risk metric. We find that shifting the training towards greater task difficulty can massively increase final performance. A baseline agent, trained in a traditional environment with random initialization of the agent and obstacles and longer episodes, performs significantly worse. To demonstrate the generalizability of the proposed approach, we designed two realistic use cases: a mobile robot and a maritime ship under the threat of approaching obstacles. In both applications, the previous results are confirmed, which emphasizes the general applicability of the proposed approach, detached from a specific application context and independent of the agent's dynamics. We further added Gaussian noise to the sensor signals, which results in only a marginal degradation of performance, indicating solid robustness of the trained agent.
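To make the difficulty control concrete, a minimal sketch of how such an episode initializer could look is given below. The collision-risk proxy (inverse closest-point-of-approach distance) and all function names and parameters are illustrative assumptions; the abstract does not specify the paper's exact metric.

```python
import numpy as np

def sample_episode(n_obstacles, min_risk, rng, max_tries=100):
    """Rejection-sample a short episode whose initial collision risk
    exceeds min_risk, shifting training towards high-risk scenarios.
    The risk proxy (inverse closest-point-of-approach distance) is a
    stand-in for the paper's unspecified collision risk metric."""
    agent_pos = np.zeros(2)
    agent_vel = np.array([1.0, 0.0])           # agent heading along +x
    for _ in range(max_tries):
        obs_pos = rng.uniform(-10.0, 10.0, size=(n_obstacles, 2))
        obs_vel = rng.uniform(-1.0, 1.0, size=(n_obstacles, 2))
        rel_p, rel_v = obs_pos - agent_pos, obs_vel - agent_vel
        # time and distance at the closest point of approach (CPA)
        t_cpa = -np.sum(rel_p * rel_v, axis=1) / np.maximum(
            np.sum(rel_v ** 2, axis=1), 1e-8)
        t_cpa = np.clip(t_cpa, 0.0, None)      # ignore past encounters
        d_cpa = np.linalg.norm(rel_p + rel_v * t_cpa[:, None], axis=1)
        risk = 1.0 / (1.0 + d_cpa.min())       # in (0, 1], higher = riskier
        if risk >= min_risk:
            return agent_pos, agent_vel, obs_pos, obs_vel
    raise RuntimeError("requested difficulty not reached")

rng = np.random.default_rng(0)
episode = sample_episode(n_obstacles=10, min_risk=0.4, rng=rng)
```

Raising `min_risk` and `n_obstacles` shifts the sampled episodes towards the high-risk regime that, as argued above, purely random initialization undersamples.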
While deep reinforcement learning (RL) has been increasingly applied over the past few years, this study investigates the feasibility of RL-based vehicle-following under complex vehicle dynamics and strong environmental disturbances. As a use case, we developed an inland-waterway vessel-following model based on realistic vessel dynamics that accounts for environmental influences such as varying river velocities and river profiles. We extracted natural vessel behavior from anonymized AIS data to formulate a reward function that reflects a realistic driving style alongside comfortable and safe navigation. Aiming at high generalization capability, we propose an RL training environment that uses stochastic processes to model the leader trajectory and the river dynamics. To validate the trained model, we defined scenarios unseen during training, including realistic vessel-following on the Middle Rhine. Our model exhibits safe and comfortable driving in all scenarios, demonstrating excellent generalization. Furthermore, traffic oscillations can be effectively dampened by deploying the trained model on a series of following vessels.
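The abstract states that stochastic processes model the leader trajectory and river dynamics without naming them; a mean-reverting Ornstein-Uhlenbeck process is one common choice for such randomized profiles. The sketch below, including all parameter values, is an assumption for illustration only.

```python
import numpy as np

def ou_profile(n_steps, dt=1.0, mu=3.0, theta=0.05, sigma=0.2, seed=0):
    """Ornstein-Uhlenbeck process: a mean-reverting random profile that
    could serve as a leader-vessel speed or river-velocity trajectory
    (assumed; the paper does not name its stochastic processes)."""
    rng = np.random.default_rng(seed)
    v = np.empty(n_steps)
    v[0] = mu                                   # start at the long-run mean
    for t in range(1, n_steps):
        # drift pulls towards mu; Gaussian diffusion adds variability
        v[t] = v[t - 1] + theta * (mu - v[t - 1]) * dt \
               + sigma * np.sqrt(dt) * rng.standard_normal()
    return np.clip(v, 0.0, None)                # keep speeds non-negative

leader_speed = ou_profile(n_steps=500)          # one randomized episode
```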
Value-based reinforcement learning algorithms have shown strong performance in games, robotics, and other real-world applications. The most popular sample-based method is $Q$-learning. It performs updates by adjusting the current $Q$-estimate towards the observed reward and the $Q$-estimate of the next state. This procedure introduces maximization bias, which Double $Q$-learning addresses. We frame the bias problem statistically and regard it as an instance of estimating the maximum expected value (MEV) of a set of random variables. We propose the $T$-Estimator (TE), based on two-sample tests for the mean, which flexibly interpolates between over- and underestimation by adjusting the significance level of the underlying hypothesis test. A generalization, termed the $K$-Estimator (KE), obeys the same bias and variance bounds as the TE while relying on a nearly arbitrary kernel function. We introduce modifications of $Q$-learning and the Bootstrapped Deep $Q$-Network (BDQN) using the TE and KE. Furthermore, we propose an adaptive variant of the TE-based BDQN that dynamically adjusts the significance level to minimize the absolute estimation bias. All proposed estimators and algorithms are thoroughly tested and validated on diverse tasks and environments, illustrating the bias-control capability and performance potential of the TE and KE.
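For reference, the tabular $Q$-learning update the abstract alludes to, and the source of the maximization bias, can be written out as follows (standard definitions, not specific to this paper):

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \Big( r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Big)
```

Because the target takes a maximum over noisy estimates, Jensen's inequality gives

```latex
\mathbb{E}\Big[ \max_i \hat{\mu}_i \Big] \;\ge\; \max_i \mathbb{E}[\hat{\mu}_i],
```

so the expectation of the sampled maximum overestimates the maximum expected value (MEV), which is precisely the estimation problem the TE and KE target.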
In the field of autonomous driving, fusing human knowledge into deep reinforcement learning (DRL) is usually based on human demonstrations recorded in a simulated environment, which limits the generalizability and feasibility in real-world traffic. We propose a two-stage DRL method that learns from real-world human driving and achieves performance superior to a pure DRL agent. Training of the DRL agent is done within the framework of CARLA with the Robot Operating System (ROS). For evaluation, we designed different real-world driving scenarios in which the proposed two-stage DRL agent can be compared with a pure DRL agent. After extracting "good" behaviors from human drivers, such as anticipation at signalized intersections, the agent becomes more efficient and drives more safely, making such an autonomous agent better suited to human-robot interaction (HRI) traffic.
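The abstract does not spell out the two stages; one plausible reading is supervised pretraining on recorded human driving followed by RL fine-tuning. The sketch below follows that assumption, and every name and hyperparameter in it is hypothetical.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(                        # toy policy: state -> action
    nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def bc_step(states, human_actions):
    """Stage 1 (assumed): behavior cloning on (state, action) pairs
    extracted from real human driving."""
    loss = nn.functional.mse_loss(policy(states), human_actions)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def rl_step(states, actions_taken, returns):
    """Stage 2 (assumed): REINFORCE-style fine-tuning that starts from
    the cloned weights instead of a random initialization."""
    # treat the policy output as the mean of a fixed-variance Gaussian
    logp = -((policy(states) - actions_taken) ** 2).sum(dim=1)
    loss = -(logp * returns).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```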
Based on deep reinforcement learning, we introduce a new approach to dynamic obstacle avoidance by defining a traffic-type-independent environment with variable complexity. Filling a gap in the current literature, we thoroughly investigate the impact of missing velocity information on an agent's performance in the obstacle avoidance task. This is a crucial issue in practice, since several sensors yield only positional information of objects or vehicles. We evaluate approaches frequently applied under partial observability, namely recurrency in deep neural networks and simple frame stacking. For our analysis, we rely on state-of-the-art model-free deep RL algorithms. The lack of velocity information is found to impact the agent's performance. Both approaches, recurrency and frame stacking, cannot consistently replace the missing velocity information in the observation space. However, in simplified scenarios, they can significantly boost performance and stabilize the overall training procedure.
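A minimal version of the frame-stacking baseline discussed above could look as follows; the wrapper interface (reset/step returning the classic four-tuple) is assumed, not taken from the paper.

```python
from collections import deque
import numpy as np

class FrameStack:
    """Stack the last k position-only observations so that, in principle,
    velocities can be inferred from finite differences between frames."""
    def __init__(self, env, k=4):
        self.env, self.k = env, k
        self.frames = deque(maxlen=k)

    def reset(self):
        obs = self.env.reset()
        for _ in range(self.k):                 # pad with the first frame
            self.frames.append(obs)
        return np.concatenate(self.frames)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.frames.append(obs)
        return np.concatenate(self.frames), reward, done, info
```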
With recent advances in CNNs, exceptional improvements have been made in semantic segmentation of high-resolution images in terms of accuracy and latency. However, challenges remain in detecting objects in crowded scenes, under large scale variations, partial occlusion, and distortions, while maintaining mobility and low latency. We introduce a fast and efficient convolutional neural network, ASBU-Net, for semantic segmentation of high-resolution images that addresses these problems and uses no novelty layers for ease of quantization and embedded-hardware support. ASBU-Net is based on a new feature-extraction module, the atrous space bender layer (ASBL), which is efficient in terms of computation and memory. The ASB layers form a building block that is used to construct ASBU-Net. Since this network does not use any special layers, it can be easily implemented, quantized, and deployed on FPGAs and other hardware with limited memory. We present experiments on resource-accuracy trade-offs and show strong performance compared to other popular models.
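The abstract does not detail the ASBL internals, but the atrous (dilated) convolution it builds on can be illustrated with a plain dilated layer, which enlarges the receptive field at no extra parameter cost and without downsampling; the channel counts below are arbitrary.

```python
import torch
import torch.nn as nn

# a 3x3 convolution with dilation 2 covers a 5x5 receptive field while
# keeping the 3x3 parameter count; padding 2 preserves the resolution
atrous = nn.Conv2d(in_channels=64, out_channels=64,
                   kernel_size=3, padding=2, dilation=2)

x = torch.randn(1, 64, 128, 128)
print(atrous(x).shape)   # torch.Size([1, 64, 128, 128])
```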