Smart Paper Notes

Collision probability reduction method for tracking control in automatic docking / berthing using reinforcement learning

Kouki Wakita , Youhei Akimoto , Dimas M. Rachman , Yoshiki Miyauchi , Umeda Naoya , Atsuo Maki

Category: Robotics

2022-12-13

Automation of berthing maneuvers in shipping is a pressing issue, as the berthing maneuver is one of the most stressful tasks seafarers undertake. Berthing control problems are often tackled by tracking a predefined trajectory or path. Maintaining a tracking error of zero in an uncertain environment is impossible; the tracking controller is nonetheless required to bring vessels close to the desired berths. The tracking controller must prioritize the avoidance of tracking errors that may cause collisions with obstacles. This paper proposes a training method based on reinforcement learning for a trajectory tracking controller that reduces the probability of collisions with static obstacles. Via numerical simulations, we show that the proposed method reduces the probability of collisions during berthing maneuvers. Furthermore, this paper shows the tracking performance in a model experiment.

Risk-based implementation of COLREGs for autonomous surface vehicles using deep reinforcement learning

Thomas Nakken Larsen , Amalie Heiberg , Eivind Meyer , Adil Rasheeda , Omer San , Damiano Varagnolo

Categories: Robotics | Artificial Intelligence

2021-11-30

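Autonomous systems are becoming ubiquitous and gaining momentum within the marine sector. Since the electrification of transport is happening simultaneously, autonomous marine vessels can reduce environmental impact, lower costs, and increase efficiency. Although close monitoring is still required to ensure safety, the ultimate goal is full autonomy. One major milestone is to develop a control system that is versatile enough to handle any weather and any encounter while also being robust and reliable. In addition, the control system must adhere to the International Regulations for Preventing Collisions at Sea (COLREGs) in order to interact successfully with human sailors. Since the COLREGs were written for the human mind to interpret, they are written in ambiguous prose and are therefore neither machine-readable nor verifiable. Because of these challenges and the wide variety of situations to be handled, classical model-based approaches prove complicated to implement and computationally heavy. Within machine learning (ML), deep reinforcement learning (DRL) has shown great potential for a wide range of applications. The model-free and self-learning properties of DRL make it a promising candidate for autonomous vessels. In this work, a subset of the COLREGs is incorporated into a DRL-based path following and obstacle avoidance system using collision risk theory. The resulting autonomous agent dynamically interpolates between path following and collision avoidance in the training scenario, in isolated encounter situations, and in AIS-based simulations of real-world scenarios.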

Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning

Jueming Hu , Xuxi Yang , Weichang Wang , Peng Wei , Lei Ying , Yongming Liu

Categories: Robotics | Artificial Intelligence

2021-11-13

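Obstacle avoidance for small unmanned aircraft is vital for the safety of future urban air mobility (UAM) and unmanned aircraft system (UAS) traffic management (UTM). There are many techniques for real-time robust drone guidance, but many of them work in discretized airspace and control, which requires an additional path smoothing step to provide flexible commands for UAS. To provide safe and efficient computational guidance of operations for unmanned aircraft, we explore the use of a deep reinforcement learning algorithm based on Proximal Policy Optimization (PPO) to guide autonomous UAS to their destinations while avoiding obstacles through continuous control. The proposed scenario state representation and reward function can map the continuous state space to continuous control of both heading angle and speed. To verify the performance of the proposed learning framework, we conducted numerical experiments with static and moving obstacles. Uncertainties associated with the environment and with safety operation bounds are investigated in detail. Results show that the proposed model can provide accurate and robust guidance and resolve conflicts with a success rate of over 99%.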
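
As a rough illustration of the PPO algorithm named in the abstract above, the following sketch computes the standard clipped surrogate objective in plain Python. It is not the authors' implementation: the function names, the default clip range `eps=0.2`, and the use of plain lists instead of tensors are assumptions made for illustration only.

```python
def ppo_clip_term(ratio, advantage, eps=0.2):
    """Clipped surrogate term for one sample.

    ratio     -- new_policy_prob / old_policy_prob for the taken action
    advantage -- advantage estimate for that action
    eps       -- clip range (0.2 is a common default, assumed here)
    """
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Taking the minimum keeps the objective pessimistic, so large
    # policy changes are not rewarded beyond the clip range.
    return min(ratio * advantage, clipped * advantage)


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate over a batch (to be maximized)."""
    terms = [ppo_clip_term(r, a, eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

In a continuous-control setting such as heading and speed commands, the probability ratio would come from the densities of a continuous action distribution; that part is omitted here.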

Trajectory Generation and Tracking Control for Aggressive Tail-Sitter Flights

Guozheng Lu , Yixi Cai , Nan Chen , Fanze Kong , Yunfan Ren , Fu Zhang

Category: Robotics

2022-12-22

We address the theoretical and practical problems related to the trajectory generation and tracking control of tail-sitter UAVs. Theoretically, we focus on the differential flatness property with full exploitation of actual UAV aerodynamic models, which lays a foundation for generating dynamically feasible trajectories and achieving high-performance tracking control. We have found that a tail-sitter is differentially flat with accurate aerodynamic models within the entire flight envelope, by specifying the coordinated flight condition and choosing the vehicle position as the flat output. This fundamental property allows us to fully exploit the high-fidelity aerodynamic models in trajectory planning and tracking control to achieve accurate tail-sitter flights. In particular, an optimization-based trajectory planner for tail-sitters is proposed to design high-quality, smooth trajectories with consideration of kinodynamic constraints, singularity-free constraints and actuator saturation. The planned trajectory of the flat output is transformed to a state trajectory in real time, taking wind in the environment into account. To track the state trajectory, a global, singularity-free, and minimally parameterized on-manifold MPC is developed, which fully leverages the accurate aerodynamic model to achieve high-accuracy trajectory tracking within the whole flight envelope. The effectiveness of the proposed framework is demonstrated through extensive real-world experiments in both indoor and outdoor field tests, including agile SE(3) flight through consecutive narrow windows requiring specific attitudes at speeds up to 10 m/s, typical tail-sitter maneuvers (transition, level flight and loiter) at speeds up to 20 m/s, and extremely aggressive aerobatic maneuvers (Wingover, Loop, Vertical Eight and Cuban Eight) with accelerations up to 2.5 g.

Optimization-based Motion Planning for Multirotor Aerial Vehicles: a Review

Geesara Kulathunga , Alexandr Klimchik

Category: Robotics

2022-08-31

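In general, optimal motion planning can be performed both locally and globally. In such planning, the choice of a local or a global planning technique mainly depends on whether the environmental conditions are dynamic or static. Hence, the most adequate choice is to use local planning alone or local planning alongside global planning. When designing optimal motion planning, whether local or global, the key metrics to bear in mind are execution time, asymptotic optimality, and quick reaction to dynamic obstacles. Such planning approaches can address these target metrics more efficiently than other approaches, such as path planning followed by smoothing. Thus, the foremost objective of this study is to analyze the related literature in order to understand how the formulation of the motion planning problem, and of trajectory planning in particular, affects the listed metrics when it is applied to generating optimal trajectories in real time for multirotor aerial vehicles (MAVs). As a result of the research, the trajectory planning problem was broken down into a set of subproblems, and the methods for addressing each subproblem were listed in detail. Subsequently, the most prominent results from 2010 to 2022 were summarized and presented in the form of a timeline.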

Traction Adaptive Motion Planning at the Limits of Handling

Lars Svensson , Monimoy Bujarbaruah , Arpit Karsolia , Christian Berger , Martin Törngren

Category: Robotics

2020-09-09

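In this paper we address the problem of motion planning and control at the limits of handling under locally varying traction conditions. We propose a novel solution method in which traction variations over the prediction horizon are represented by time-varying constraints derived from a predictive friction estimate. A constrained finite-time optimal control problem is solved in a receding horizon fashion, imposing these time-varying constraints. Furthermore, our method features an integrated sampling augmentation procedure that addresses the problems of infeasibility and sensitivity to local minima that arise from abrupt constraint alterations, for example due to sudden friction changes. We validate the proposed algorithm on a Volvo FH16 heavy-duty vehicle in a range of critical scenarios. Experimental results indicate that traction adaptive motion planning and control improves the vehicle's capacity to avoid accidents, both by adapting to low local traction, which ensures dynamic feasibility of the planned motion, and by adapting to high local traction, which realizes high traction utilization.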

Reduced Order Model of a Generic Submarine for Maneuvering Near the Surface

J. Ezequiel Martin , Maxwell Hammond , Nicholas Rober , Yakin Kim , Venanzio Cichella , Pablo Carrica

Category: Robotics

2022-12-19

A reduced order model of a generic submarine is presented. Computational fluid dynamics (CFD) results are used to create and validate a model that includes depth dependence and the effect of waves on the craft. The model and the procedure to obtain its coefficients are discussed, and examples of the data used to obtain the model coefficients are presented. An example of operation following a complex path is presented and results from the reduced order model are compared to those from an equivalent CFD calculation. The controller implemented to complete these maneuvers is also presented.

Enhanced method for reinforcement learning based dynamic obstacle avoidance by assessment of collision risk

Fabian Hart , Ostap Okhrin

Categories: Robotics | Machine Learning

2022-12-08

In the field of autonomous robots, reinforcement learning (RL) is an increasingly used method to solve the task of dynamic obstacle avoidance for mobile robots, autonomous ships, and drones. A common practice to train those agents is to use a training environment with random initialization of agent and obstacles. Such approaches might suffer from a low coverage of high-risk scenarios in training, leading to impaired final performance of obstacle avoidance. This paper proposes a general training environment where we gain control over the difficulty of the obstacle avoidance task by using short training episodes and assessing the difficulty by two metrics: The number of obstacles and a collision risk metric. We found that shifting the training towards a greater task difficulty can massively increase the final performance. A baseline agent, using a traditional training environment based on random initialization of agent and obstacles and longer training episodes, leads to a significantly weaker performance. To prove the generalizability of the proposed approach, we designed two realistic use cases: A mobile robot and a maritime ship under the threat of approaching obstacles. In both applications, the previous results can be confirmed, which emphasizes the general usability of the proposed approach, detached from a specific application context and independent of the agent's dynamics. We further added Gaussian noise to the sensor signals, resulting in only a marginal degradation of performance and thus indicating solid robustness of the trained agent.
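
The abstract above does not define its collision risk metric. As a hedged illustration of how such a metric is often built, the sketch below computes the closest point of approach (CPA) between an agent and one obstacle under a constant-velocity assumption. The function name `cpa` and the choice of CPA itself are assumptions for illustration, not the metric used in the paper.

```python
import math


def cpa(rel_pos, rel_vel):
    """Closest point of approach for one obstacle.

    rel_pos -- (x, y) position of the obstacle relative to the agent
    rel_vel -- (vx, vy) velocity of the obstacle relative to the agent
    Returns (t_cpa, d_cpa): time until closest approach (never negative)
    and the distance at that time, assuming constant velocities.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t_cpa = 0.0  # no relative motion: the distance never changes
    else:
        # Minimize |rel_pos + t * rel_vel| over t, clamped to the future.
        t_cpa = max(0.0, -(px * vx + py * vy) / speed_sq)
    d_cpa = math.hypot(px + vx * t_cpa, py + vy * t_cpa)
    return t_cpa, d_cpa
```

A small `d_cpa` reached at a small `t_cpa` indicates a high-risk encounter, so a training environment could use such values, together with the number of obstacles, to grade episode difficulty.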

Exploration, Path Planning with Obstacle and Collision Avoidance in a Dynamic Environment

Saeid Alirezazadeh , Luís A. Alexandre

Category: Robotics

2022-08-19

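If we give a robot the task of moving an object from its current position to another location in an unknown environment, the robot must explore the map, identify all types of obstacles, and then determine the best route to complete the task. We propose a mathematical model to find an optimal path plan that avoids collisions with all static and moving obstacles and has the minimum completion time and the minimum distance. In this model, bounding boxes around obstacles and the robot are not considered, so the robot can move very close to obstacles without colliding with them. We consider two types of obstacles: deterministic obstacles, which include all static obstacles, such as walls that do not move, and all moving obstacles whose movements follow a fixed pattern; and non-deterministic obstacles, which include all obstacles whose movements can occur in any direction at any time according to some probability distribution. We also consider the acceleration and deceleration of the robot to improve collision avoidance.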

Formation control with connectivity assurance for missile swarm: a natural co-evolutionary strategy approach

Junda Chen

Category: Neural and Evolutionary Computing

2022-08-24

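The formation control problem is one of the most studied topics in the field of swarm intelligence and is usually solved by conventional mathematical approaches. In this paper, however, we present a metaheuristic approach that leverages a natural co-evolutionary strategy to solve the formation control problem for a swarm of missiles. The missile swarm is modeled as second-order systems with heterogeneous reference targets, and an exponential error function is used as the objective function so that the swarm converges to optimal equilibrium states satisfying certain formation requirements. To address the issues of local optima and unstable evolution, we incorporate a novel model-based policy constraint and a population adaptation strategy, which greatly alleviate the performance degradation. By applying the Molloy-Reed criterion from the field of network communication, we develop an adaptive topology method that assures connectivity under node failure, and its effectiveness is validated both theoretically and experimentally. Experimental results validate the effectiveness of the proposed formation control approach. More significantly, we show that it is feasible to treat the generic formation control problem as a Markov decision process (MDP) and solve it through iterative learning.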

Backflipping with Miniature Quadcopters by Gaussian Process Based Control and Planning

Péter Antal , Tamás Péni , Roland Tóth

Category: Robotics

2022-09-29

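This paper proposes two control methods for performing a backflip maneuver with miniature quadcopters. First, an existing feedforward control strategy designed specifically for backflipping is revised and improved. Bayesian optimization with a surrogate Gaussian process model is used to find the optimal sequence of motion primitives by repeatedly performing the flip maneuver in a simulation environment. The second method is based on closed-loop control and consists of two main steps. First, an adaptive controller is designed to provide reliable reference tracking even in the presence of model uncertainties. The controller is constructed by augmenting the nominal model of the drone with a Gaussian process that is trained on measurement data. Second, an efficient trajectory planning algorithm is proposed that designs feasible trajectories for the backflip maneuver using only quadratic programming. The two approaches are analyzed in simulation and in experiments with Bitcraze Crazyflie 2.1 quadcopters.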

Perching on Moving Inclined Surfaces using Uncertainty Tolerant Planner and Thrust Regulation

Sensen Liu , Wenkang Hu , Zhaoying Wang , Wei Dong , Xinjun Sheng

Category: Robotics

2022-12-21

Quadrotors with the ability to perch on moving inclined surfaces can save energy and extend their travel distance by leveraging ground vehicles. Achieving dynamic perching places high demands on the performance of trajectory planning and on terminal state accuracy in SE(3). However, in the perching process, uncertainties in target surface prediction, tracking control and external disturbances may cause trajectory planning failure or lead to unacceptable terminal errors. To address these challenges, we first propose a trajectory planner that considers adaptation to uncertainties in target prediction and tracking control. To facilitate this work, the reachable set of the quadrotor's states is first analyzed. The states whose reachable sets have the largest coverage probability for uncertain targets are defined as optimal waypoints. Subsequently, an approach to seek locally optimal waypoints for static and moving uncertain targets is proposed. A real-time trajectory planner based on optimized waypoints is developed accordingly. Second, thrust regulation is implemented in the terminal attitude tracking stage to handle external disturbances. When a quadrotor's attitude is commanded to align with the target surface, the thrust is optimized to minimize terminal errors. This allows the terminal position and velocity to be controlled in a closed-loop manner. Therefore, the resistance to disturbances and the terminal accuracy are improved. Extensive simulation experiments demonstrate that our methods can improve the accuracy of terminal states under uncertainties. The success rate increases by approximately 50% compared to the two-end planner without thrust regulation. Perching on the rear window of a car is also achieved outdoors using our proposed heterogeneous cooperation system. This validates the feasibility and practicality of our methods.

Incremental Correction in Dynamic Systems Modelled with Neural Networks for Constraint Satisfaction

Namhoon Cho , Hyo-Sang Shin , Antonios Tsourdos , Davide Amato

Category: Machine Learning

2022-09-08

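This study presents incremental correction methods for refining neural network parameters or control functions entering a continuous-time dynamic system, in order to improve solution accuracy in satisfying interim point constraints placed on the performance output variables. The proposed approach is to linearize the dynamics around the baseline values of its arguments and then solve for the corrective input required to transfer the perturbed trajectory to the desired values at specific time points, i.e., the interim points. Depending on the type of decision variable to be adjusted, a parameter correction method and a control function correction method are developed. These incremental correction methods can be used to compensate for the prediction errors of pre-trained neural networks in real-time applications where the dynamic system must be predicted with high accuracy at prescribed time points. In this regard, the online update approach can be useful for enhancing the overall targeting accuracy of finite-horizon control subject to point constraints using a neural policy. A numerical example demonstrates the effectiveness of the proposed approach in an application to a powered descent problem on Mars.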

Dynamic Control Barrier Function-based Model Predictive Control to Safety-Critical Obstacle-Avoidance of Mobile Robot

Zhuozhu Jian , Zihong Yan , Xuanang Lei , Zihong Lu , Bin Lan , Xueqian Wang , Bin Liang

Category: Robotics

2022-09-18

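This paper presents an efficient and safe LiDAR-based method for avoiding static and dynamic obstacles. First, the point cloud is used to generate a real-time local grid map for obstacle detection. Then, obstacles are clustered by the DBSCAN algorithm and enclosed with minimum bounding ellipses (MBEs). In addition, data association is performed to match each MBE with an obstacle in the current frame. Treating the MBE as an observation, a Kalman filter (KF) is used to estimate and predict the motion state of each obstacle. In this way, the trajectory of each obstacle over the forward time horizon can be parameterized as a set of ellipses. Because of the uncertainty of the MBE, the semi-major and semi-minor axes of the parameterized ellipses are extended to ensure safety. We extend the traditional control barrier function (CBF) and propose the dynamic control barrier function (D-CBF). We combine D-CBF with model predictive control (MPC) to implement safety-critical dynamic obstacle avoidance. Experiments in simulated and real scenarios are conducted to verify the effectiveness of our algorithm. The source code is released for the reference of the community.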
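
The pipeline in the abstract above ends with predicted, enlarged ellipses feeding a barrier constraint. The sketch below illustrates that last stage only, under strong simplifying assumptions: ellipses are axis-aligned, the obstacle velocity is given rather than estimated by a Kalman filter, and the axes grow linearly with the prediction step. All names (`Ellipse`, `predict_ellipses`, `barrier`, `cbf_ok`) and the parameter `inflate_per_step` are hypothetical and do not come from the authors' released code.

```python
from dataclasses import dataclass


@dataclass
class Ellipse:
    cx: float  # center x
    cy: float  # center y
    a: float   # semi-major axis (along x, axis-aligned by assumption)
    b: float   # semi-minor axis (along y)


def predict_ellipses(e, vx, vy, dt, steps, inflate_per_step):
    """Constant-velocity prediction of an obstacle ellipse.

    The axes are enlarged a little more at every step to reflect
    growing uncertainty (linear growth is an assumption).
    """
    return [
        Ellipse(e.cx + vx * dt * k, e.cy + vy * dt * k,
                e.a + inflate_per_step * k, e.b + inflate_per_step * k)
        for k in range(1, steps + 1)
    ]


def barrier(px, py, e):
    """h >= 0 when the point (px, py) is outside or on the ellipse."""
    return ((px - e.cx) / e.a) ** 2 + ((py - e.cy) / e.b) ** 2 - 1.0


def cbf_ok(h_now, h_next, gamma):
    """Discrete-time CBF condition: h_next >= (1 - gamma) * h_now, 0 < gamma <= 1."""
    return h_next >= (1.0 - gamma) * h_now
```

Inside an MPC, `cbf_ok` would be imposed as a constraint at each step of the horizon, with `h_next` evaluated against the ellipse predicted for that step, which is what makes the barrier time-varying.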

Intention-Aware Navigation in Crowds with Extended-Space POMDP Planning

Himanshu Gupta , Bradley Hayes , Zachary Sunberg

Categories: Robotics | Artificial Intelligence

2022-06-20

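This paper presents a hybrid online partially observable Markov decision process (POMDP) planning system that addresses the problem of autonomous navigation in the presence of multi-modal uncertainty introduced by other agents in the environment. As a particular example, we consider the problem of autonomous navigation among dense crowds of pedestrians and obstacles. Popular approaches to this problem first generate a path using a complete planner (e.g., Hybrid A*) with ad hoc assumptions about uncertainty, then use an online tree-based POMDP solver to reason about uncertainty while controlling only a limited aspect of the problem (i.e., speed along the path). We present a more capable and responsive real-time approach that enables the POMDP planner to control more degrees of freedom (e.g., both speed and heading) to achieve more flexible and efficient solutions. This modification greatly extends the region of the state space that the POMDP planner must reason over, which significantly increases the importance of finding effective rollout policies within the limited computational budget that real-time control affords. Our key insight is to use multi-query motion planning techniques (e.g., probabilistic roadmaps or the fast marching method) as priors for rapidly generating efficient rollout policies for every state that the POMDP planning tree might reach during its limited-horizon search. Our proposed approach generates trajectories that are safer and more efficient than those of the previous approach, even in densely crowded dynamic environments with long planning horizons.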

Delay-aware Robust Control for Safe Autonomous Driving and Racing

Dvij Kalaria , Qin Lin , John M. Dolan

Category: Robotics

2022-08-29

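Delays endanger the safety of autonomous systems operating in rapidly changing environments, such as those with nondeterministic surrounding traffic participants in autonomous driving and high-speed racing. Unfortunately, delays are often not considered during conventional controller design or before deployment in the physical world. In this paper, the computation delay from nonlinear optimization for motion planning and control, as well as other unavoidable delays caused by actuators, are addressed systematically and in a unified way. To deal with all these delays, in our framework: 1) we propose a new filtering approach that requires no prior knowledge of the dynamics or the disturbance distribution to adaptively and safely estimate the time-varying computation delay; 2) we model the actuation dynamics for steering delay; 3) all the constrained optimization is realized in a robust tube model predictive controller. In terms of application, we demonstrate that our approach is suitable for both autonomous driving and autonomous racing. Our approach is a novel design for a standalone delay compensation controller. In addition, when a learning-enabled controller that assumes no delay serves as the primary controller, our approach acts as the primary controller's safety guard.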

Integrated and Adaptive Guidance and Control for Endoatmospheric Missiles via Reinforcement Learning

Brian Gaudet , Isaac Charcos , Roberto Furfaro

Category: Robotics

2021-09-08

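We apply the meta reinforcement learning framework to optimize an integrated and adaptive guidance and flight control system for an air-to-air missile, implementing the system as a deep recurrent neural network (the policy). The policy maps observations directly to commanded rates of change for the missile's control surface deflections. The observations are derived with minimal processing from the computationally stabilized line-of-sight unit vector measured by a strapdown radar seeker, the rotational velocity estimated from rate gyros, and the control surface deflection angles. The system induces intercept trajectories against a maneuvering target that satisfy control constraints on fin deflection angles and path constraints on look angle and load. We test the optimized system in a six-degrees-of-freedom simulator that includes a nonlinear radome model and a strapdown seeker model. Through extensive simulation, we demonstrate that the system can adapt to a large flight envelope and to off-nominal flight conditions that include perturbations of aerodynamic coefficient parameters and center of pressure locations. Moreover, we find that the system is robust to the parasitic attitude loop induced by radome refraction, imperfect seeker stabilization, and sensor scale factor errors. Importantly, we compare our system's performance to a longitudinal model of proportional navigation coupled with a three-loop autopilot and find that our system outperforms the benchmark. Additional experiments investigate the impact of removing the recurrent layer from the policy and value function networks, as well as performance with an infrared seeker.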

Octocopter Design: Modelling, Control and Motion Planning

Nedim Osmic , Adnan Tahirovic , Bakir Lacevic

Category: Robotics

2022-12-02

This book provides a solution to the control and motion planning design for an octocopter system. It includes a particular choice of control and motion planning algorithms based on the authors' previous research work, so it can be used as a reference design guide for students, researchers, and autonomous vehicle hobbyists. The control is constructed using a fault-tolerant approach that aims to increase the chances of the system detecting and isolating a potential failure, in order to produce feasible control signals for the remaining active motors. The motion planning algorithm used is risk-aware in the sense that it takes into account the constraints related to the fault-dependent and mission-related maneuverability analysis of the octocopter system during the planning stage. Such a planner generates only those reference trajectories along which the octocopter system would be safe and capable of good tracking in the case of a single motor fault and in the majority of double motor fault scenarios. The control and motion planning algorithms presented in the book aim to increase the overall reliability of the system in completing the mission.

Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments

Eivind Bøhn , Erlend M. Coates , Dirk Reinhardt , Tor Arne Johansen

Categories: Machine Learning | Robotics

2021-11-07

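Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem, in part due to nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method that automatically discovers optimal control laws through interaction with the controlled system and can handle complex nonlinear dynamics. We show in this paper that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as three minutes of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating performance comparable to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. To better understand the operation of the learned controller, we present an analysis of its behavior, including a comparison to the existing well-tuned PID controller.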
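
For readers unfamiliar with the PID baseline mentioned in the abstract above, the following is a minimal discrete-time PID sketch with output saturation. It is a textbook form and not the ArduPlane implementation; the class name, gains, and `out_limit` parameter are assumptions for illustration, and anti-windup is deliberately left out.

```python
class PID:
    """Textbook discrete PID controller with symmetric output saturation."""

    def __init__(self, kp, ki, kd, out_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_limit = out_limit  # models an actuator constraint
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the saturated control output for the current error."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.out_limit, min(self.out_limit, u))
```

For example, `PID(2.0, 0.0, 0.0, 1.0).update(0.25, 0.1)` returns `0.5`, while a larger error drives the output to the limit of `1.0`. The fixed linear gains are what limit such a controller on strongly nonlinear dynamics.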

A Hierarchical Approach for Strategic Motion Planning in Autonomous Racing

Rudolf Reiter , Jasper Hoffmann , Joschka Boedecker , Moritz Diehl

Category: Robotics

2022-12-03

We present an approach for safe trajectory planning, where a strategic task related to autonomous racing is learned sample-efficiently within a simulation environment. A high-level policy, represented as a neural network, outputs a reward specification that is used within the cost function of a parametric nonlinear model predictive controller (NMPC). By including constraints and vehicle kinematics in the NLP, we are able to guarantee safe and feasible trajectories with respect to the model used. Compared to classical reinforcement learning (RL), our approach restricts the exploration to safe trajectories, starts with a good prior performance and yields full trajectories that can be passed to a lowest-level tracking controller. We do not address the lowest-level controller in this work and assume perfect tracking of feasible trajectories. We show the superior performance of our algorithm on simulated racing tasks that include high-level decision making. The vehicle learns to efficiently overtake slower vehicles and to avoid being overtaken by blocking faster vehicles.
