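Medical ultrasound has become a routine examination method and is widely adopted across different medical applications, so it is desirable to have a robotic ultrasound system that performs ultrasound scanning autonomously. However, ultrasound scanning skill is considerably complex and depends heavily on the experience of the sonographer. In this paper, we propose a learning-based approach to learn robotic ultrasound scanning skills from human demonstrations. First, the robotic ultrasound scanning skill is encapsulated in a high-dimensional multimodal model that takes the ultrasound images, the pose/position of the probe, and the contact force into account. Second, we leverage the power of imitation learning to train the multimodal model with training data collected from demonstrations by experienced sonographers. Finally, an optimization procedure with guided exploration is proposed to further improve the performance of the learned model. Robotic experiments are conducted to validate the advantages of our proposed framework and the learned models.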
translated by 谷歌翻译
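Ultrasound (US) imaging is commonly used to assist in the diagnosis and intervention of spinal diseases, while standardized US acquisition performed by manually operating the probe requires substantial experience and sonographer training. In this work, we propose a novel dual-agent framework that integrates a reinforcement learning (RL) agent and a deep learning (DL) agent to jointly determine the movement of the US probe based on real-time US images, in order to mimic the decision-making process of an expert sonographer and achieve autonomous standard view acquisition in spinal sonography. Moreover, inspired by the nature of US propagation and the characteristics of the spinal anatomy, we introduce a view-specific acoustic shadow reward that uses shadow information to implicitly guide the navigation of the probe toward different standard views of the spine. Our method is validated in both quantitative and qualitative experiments in a simulation environment built with US data acquired from $17$ volunteers. The average navigation accuracy toward different standard views reaches $5.18\,\mathrm{mm}/5.25^\circ$ and $12.87\,\mathrm{mm}/17.49^\circ$ in the intra- and inter-subject settings, respectively. The results demonstrate that our method can effectively interpret US images and navigate the probe to acquire multiple standard views of the spine.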
Dexterous manipulation with anthropomorphic robot hands remains a challenging problem in robotics because of the high-dimensional state and action spaces and complex contacts. Nevertheless, skillful closed-loop manipulation is required to enable humanoid robots to operate in unstructured real-world environments. Reinforcement learning (RL) has traditionally imposed enormous interaction data requirements for optimizing such complex control problems. We introduce a new framework that leverages recent advances in GPU-based simulation along with the strength of imitation learning in guiding policy search towards promising behaviors to make RL training feasible in these domains. To this end, we present an immersive virtual reality teleoperation interface designed for interactive human-like manipulation on contact-rich tasks and a suite of manipulation environments inspired by tasks of daily living. Finally, we demonstrate the complementary strengths of massively parallel RL and imitation learning, yielding robust and natural behaviors. Videos of trained policies, our source code, and the collected demonstration datasets are available at https://maltemosbach.github.io/interactive_human_like_manipulation/.
Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for many years; however, the field has recently been gaining attention due to advances in computing and sensing as well as rising demand for intelligent applications. The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks. Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations, without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction, such as humanoid robots, self-driving vehicles, human-computer interaction, and computer games, to name a few. However, specialized algorithms are needed to learn models effectively and robustly, as learning by imitation poses its own set of challenges. In this paper, we survey imitation learning methods and present design options in different steps of the learning process. We introduce a background and motivation for the field and highlight challenges specific to the imitation problem. Methods for designing and evaluating imitation learning tasks are categorized and reviewed. Special attention is given to learning methods in robotics and games, as these domains are the most popular in the literature and provide a wide array of problems and methodologies. We extensively discuss combining imitation learning approaches using different sources and methods, as well as incorporating other motion learning methods to enhance imitation. We also discuss the potential impact on industry, present major applications, and highlight current and future research directions.
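To make "learning a mapping between observations and actions" concrete, the following is a minimal behavior-cloning sketch, not a method taken from the survey. It assumes scalar observations and actions and a linear policy fitted by ordinary least squares; the function names and the demonstration data are hypothetical and exist only for illustration.

```python
# Minimal behavior-cloning sketch (illustrative assumption, not the survey's method):
# fit a linear policy a = w * o + b to demonstrated (observation, action) pairs.

def fit_linear_policy(observations, actions):
    """Return (w, b) minimizing the squared error between w*o + b and the demonstrated actions."""
    n = len(observations)
    mean_o = sum(observations) / n
    mean_a = sum(actions) / n
    cov = sum((o - mean_o) * (a - mean_a) for o, a in zip(observations, actions))
    var = sum((o - mean_o) ** 2 for o in observations)
    w = cov / var
    b = mean_a - w * mean_o
    return w, b


def policy(observation, w, b):
    """Apply the learned mapping to a new observation."""
    return w * observation + b


# Made-up demonstrations from a hypothetical expert whose rule is a = 2 * o - 1.
demo_observations = [0.0, 1.0, 2.0, 3.0]
demo_actions = [-1.0, 1.0, 3.0, 5.0]

w, b = fit_linear_policy(demo_observations, demo_actions)
predicted_action = policy(4.0, w, b)  # action for an observation not in the demonstrations
```

Practical imitation learning replaces the linear model with a richer function approximator and has to cope with states the demonstrator never visited, which is one of the challenges the survey highlights.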
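The use of large datasets in machine learning has led to outstanding results, in some cases outperforming humans at tasks that were believed impossible for machines. However, achieving human-level performance when dealing with physically interactive tasks, e.g., in contact-rich robotic manipulation, is still a big challenge. It is well known that regulating the Cartesian impedance for such operations is of utmost importance for their successful execution. Approaches such as reinforcement learning (RL) can be a promising paradigm for solving such problems. More precisely, approaches that use task-agnostic expert demonstrations can leverage large datasets and hold great potential for solving new tasks. However, existing data collection systems are expensive, complex, or do not allow for impedance regulation. This work is a first step towards a data collection framework suitable for collecting large datasets of impedance-based expert demonstrations compatible with an RL problem formulation that uses a novel action space. The framework is designed according to requirements derived from an extensive analysis of available data collection frameworks for robotic manipulation. The result is a low-cost and open tele-impedance framework that enables human experts to demonstrate contact-rich tasks.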
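In this survey, we present the current status of robots performing manipulation tasks that require varying contact with the environment, such that the robot must either implicitly or explicitly control the contact force with the environment to complete the task. Robots can perform more and more manipulation tasks that are still done by humans, and there is a growing number of publications on the topics of 1) performing tasks that always require contact and 2) mitigating uncertainty by leveraging the environment in tasks that, under perfect information, could be performed without contact. Recent trends have seen robots perform tasks previously left to humans, such as massage, and in classical tasks, such as peg-in-hole, there is more efficient generalization to other similar tasks, better error tolerance, and faster planning or learning of the tasks. Thus, in this survey we cover the current state of robots performing such tasks, starting by surveying all the different in-contact tasks robots can perform, observing how these tasks are controlled and represented, and finally presenting the learning and planning of the skills required to complete these tasks.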
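Robots are increasingly expected to manipulate objects in ever more unstructured environments where object properties have high perceptual uncertainty. This directly impacts successful object manipulation. In this work, we propose a reinforcement-learning-based motion planning framework for object manipulation that makes use of both the available multisensory feedback and a learned attention-guided deep affordance model as perceptual states. The affordance model is learned from multiple sensory modalities, including vision and touch (tactile and force/torque), and is designed to predict and indicate the manipulable regions of multiple affordances (i.e., graspability and pushability) for objects with similar appearances but different properties (e.g., mass distribution). A DQN-based deep reinforcement learning algorithm is then trained to select the optimal action for successful object manipulation. To validate the performance of the proposed framework, our method is evaluated and benchmarked using both an open dataset and a collected dataset. The results show that the proposed method and overall framework outperform existing methods and achieve better accuracy and higher efficiency.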
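In this paper, we discuss a framework for teaching bimanual manipulation tasks by imitation. To this end, we present a system and algorithms for learning compliant and contact-rich robot behavior from human demonstrations. The presented system combines insights from admittance control and machine learning to extract control policies that can (a) recover from and adapt to a variety of disturbances in time and space, while also (b) effectively leveraging physical contact with the environment. We demonstrate the effectiveness of our approach using a real-world insertion task involving multiple simultaneous contacts between a manipulated object and insertion pegs. We also investigate efficient means of collecting training data for such bimanual settings. To this end, we conduct a human-subject study and analyze the effort and mental demand reported by the users. Our experiments show that, while harder to provide, the additional force/torque information available in teleoperated demonstrations is crucial for phase estimation and task success. Ultimately, force/torque data substantially improves manipulation robustness, resulting in a 90% success rate in a multipoint insertion task. Code and videos can be found at https://bimanualmanipulation.com/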
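Learned visuomotor policies have shown considerable success as an alternative to traditional, hand-crafted frameworks for robotic manipulation. Surprisingly, the extension of these methods to the multiview domain is relatively unexplored. A successful multiview policy could be deployed on a mobile manipulation platform, allowing the robot to complete a task regardless of its view of the scene. In this work, we demonstrate that a multiview policy can be found through imitation learning by collecting data from a variety of viewpoints. We illustrate the general applicability of the method by learning to complete several challenging multi-stage and contact-rich tasks, both in a simulated environment and on a real mobile manipulation platform. Furthermore, we analyze our policies to determine the benefits of learning from multiview data compared to learning with data collected from a fixed perspective. We show that learning from multiview data incurs little, if any, penalty on performance for a fixed-view task compared to learning with an equivalent amount of fixed-view data. Finally, we examine the visual features learned by the multiview and fixed-view policies. Our results indicate that multiview policies implicitly learn to identify spatially correlated features.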
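While significant progress has been made on understanding hand-object interactions in computer vision, it is still very challenging for robots to perform complex dexterous manipulation. In this paper, we propose a new platform and pipeline, DexMV (Dexterous Manipulation from Videos), for imitation learning. We design a platform with (i) a simulation system for complex dexterous manipulation tasks with a multi-finger robot hand and (ii) a computer vision system to record large-scale demonstrations of a human hand performing the same tasks. In our novel pipeline, we extract 3D hand and object poses from videos, and propose a novel demonstration translation method to convert human motion into robot demonstrations. We then apply multiple imitation learning algorithms with the demonstrations. We show that the demonstrations can indeed improve robot learning by a large margin and solve complex tasks that reinforcement learning alone cannot solve. Project page with video: https://yzqin.github.io/dexmv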
Surgical robot automation has attracted increasing research interest over the past decade, given its huge potential to benefit surgeons, nurses and patients. Recently, the learning paradigm of embodied AI has demonstrated a promising ability to learn good control policies for various complex tasks, where embodied AI simulators play an essential role in facilitating relevant research. However, existing open-source simulators for surgical robots still do not sufficiently support human interaction through physical input devices, which further limits effective investigation of how human demonstrations affect policy learning. In this paper, we study human-in-the-loop embodied intelligence with a new interactive simulation platform for surgical robot learning. Specifically, we establish our platform based on our previously released SurRoL simulator, with several new features co-developed to allow high-quality human interaction via an input device. With these, we further propose to collect human demonstrations and imitate the action patterns to achieve more effective policy learning. We showcase the improvement of our simulation environment with the newly designed features and tasks, and validate state-of-the-art reinforcement learning algorithms using the interactive environment. Promising results are obtained, with which we hope to pave the way for future research on surgical embodied intelligence. Our platform is released and will be continuously updated on the website: https://med-air.github.io/SurRoL/
Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for reactive motion planning. Our pipeline starts by segmenting demonstrations of a complete task into motion primitives via a semi-automated segmentation algorithm. Then, given additional demonstrations of successful adaptation behaviors, we learn initial feedback models through learning from demonstrations. In the final phase, a sample-efficient reinforcement learning algorithm fine-tunes these feedback models for novel task settings through few real system interactions. We evaluate our approach on a real anthropomorphic robot in learning a tactile feedback task.
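Robots are expected to take over menial tasks such as housework. Some of these tasks involve nonprehensile manipulation, which is performed without grasping objects. Nonprehensile manipulation is very difficult because it requires considering the dynamics of the environment and the objects. Therefore, imitating such complex behaviors requires a large number of human demonstrations. In this study, a self-supervised learning method that considers dynamics is proposed to achieve variable-speed nonprehensile manipulation. The proposed method collects only the successful action data obtained during autonomous operation. By fine-tuning on the successful data, the robot learns the dynamics among itself, its environment, and the objects. We experimented with the task of scooping and transporting pancakes using a neural network model trained on 24 human-collected training data samples. The proposed method improved the success rate from 40.2% to 85.7%, and succeeded more than 75% of the time on tasks involving other objects.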
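In modern manufacturing environments, the demand for contact-rich tasks is growing rapidly. However, few traditional robotic assembly skills consider environmental constraints during task execution, and most of them use these constraints only as termination conditions. In this study, we present a pushing-based hybrid position/force assembly skill that can maximize the use of environmental constraints during task execution. To the best of our knowledge, this is the first work that considers using pushing actions during the execution of assembly tasks. We have verified, through assembly task experiments with a mobile manipulator system, that our skill can maximize the utilization of environmental constraints and achieve a 100% success rate in execution.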
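The capability to adapt compliance by varying muscle stiffness is crucial for dexterous manipulation skills in humans. Incorporating compliance in robot motor control is crucial for performing real-world force interaction tasks with human-level dexterity. This work presents a deep model predictive variable impedance controller for compliant robotic manipulation, which combines variable impedance control with model predictive control (MPC). A generalized Cartesian impedance model of a robot manipulator is learned using an exploration strategy that maximizes the information gain. This model is used within an MPC framework to adapt the impedance parameters of a low-level variable impedance controller to achieve the desired compliance behavior for different manipulation tasks without any retraining or fine-tuning. The deep model predictive variable impedance control approach is evaluated using a Franka Emika Panda robotic manipulator on different manipulation tasks in simulation and in real experiments. The proposed approach is compared with model-free and model-based reinforcement learning approaches to variable impedance control in terms of transferability between tasks and performance.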
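As background for the impedance parameters mentioned above, here is a minimal one-dimensional sketch of a spring-damper impedance law driving a point mass toward a setpoint. It is not the controller from the abstract: the gains are fixed rather than adapted by MPC, the system is a single axis rather than a Cartesian manipulator, and all names and numeric values are assumptions chosen for illustration.

```python
# One-dimensional impedance-control sketch (illustrative assumption only).
# The commanded force behaves like a virtual spring and damper attached to the setpoint.

def impedance_force(x, v, x_des, stiffness, damping):
    """Spring-damper impedance law: pull toward x_des and resist velocity."""
    return stiffness * (x_des - x) - damping * v


def simulate(x_des=1.0, mass=1.0, stiffness=100.0, damping=20.0,
             dt=0.001, steps=5000):
    """Integrate a point mass under the impedance law and return its final position."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        f = impedance_force(x, v, x_des, stiffness, damping)
        v += (f / mass) * dt  # semi-implicit Euler: update velocity first
        x += v * dt           # then position, using the new velocity
    return x


final_position = simulate()  # settles close to x_des with these critically damped gains
```

A variable impedance controller would change `stiffness` and `damping` over time, for example lowering stiffness during contact to make the motion more compliant.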
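One of the most important challenges in robotics is producing accurate trajectories and controlling their dynamic parameters so that robots can perform different tasks. The ability to provide such motion control is closely related to how such movements are encoded. Advances in deep learning have had a strong impact on the development of novel approaches for dynamic movement primitives. In this work, we survey the scientific literature related to neural dynamic movement primitives, to complement existing surveys on dynamic movement primitives.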
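For readers unfamiliar with how a dynamic movement primitive encodes a movement, the sketch below rolls out a classical one-dimensional discrete DMP: a critically damped spring-damper system pulled toward a goal and shaped by a phase-dependent forcing term. It illustrates the classical formulation rather than any neural variant covered by the survey; the function name, gains, basis-function layout, and weights are all assumptions.

```python
import math

# Classical 1-D discrete DMP rollout (illustrative assumption, not a neural DMP).
# Transformation system: tau * z' = alpha_z * (beta_z * (goal - y) - z) + f(x)
#                        tau * y' = z
# Canonical system:      tau * x' = -alpha_x * x


def rollout_dmp(y0, goal, weights, tau=1.0, dt=0.001, steps=2000,
                alpha_z=25.0, beta_z=6.25, alpha_x=4.0, width=50.0):
    """Integrate the DMP with explicit Euler and return the position trajectory."""
    n = len(weights)
    # Gaussian basis centers placed along the decay of the phase variable x.
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    x, y, z = 1.0, y0, 0.0
    trajectory = [y]
    for _ in range(steps):
        psi = [math.exp(-width * (x - c) ** 2) for c in centers]
        # Normalized weighted sum of basis functions, gated by the phase x
        # so that the forcing term vanishes as the movement ends.
        forcing = sum(w * p for w, p in zip(weights, psi)) / (sum(psi) + 1e-10)
        forcing *= x * (goal - y0)
        z_dot = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        y_dot = z / tau
        x_dot = -alpha_x * x / tau
        z += z_dot * dt
        y += y_dot * dt
        x += x_dot * dt
        trajectory.append(y)
    return trajectory


# With all weights at zero the forcing term is zero and the DMP moves straight to the goal.
trajectory = rollout_dmp(y0=0.0, goal=1.0, weights=[0.0] * 5)
```

In this formulation the weights determine the shape of the path, and they are what is fitted to a demonstration; the goal and `tau` can then be changed to rescale the same movement in space and time.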
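Seamless integration of robots into human environments requires robots to learn how to use existing human tools. Current approaches for learning tool manipulation skills mostly rely on expert demonstrations provided in the target robot environment, for example, by manually guiding the robot manipulator or by teleoperation. In this work, we introduce an automated approach that replaces an expert demonstration with a YouTube video for learning a tool manipulation strategy. The main contributions are twofold. First, we design an alignment procedure that aligns the simulated environment with the real-world scene observed in the video. This is formulated as an optimization problem that finds a spatial alignment of the tool trajectory to maximize the sparse goal reward given by the environment. Second, we describe an imitation learning approach that focuses on the trajectory of the tool rather than the motion of the human. For this, we combine reinforcement learning with an optimization procedure to find a control policy and the placement of the robot based on the tool motion in the aligned environment. We demonstrate the proposed approach on spade, scythe, and hammer tools in simulation and show the effectiveness of the trained policy in a demonstration on a real Franka Emika Panda robot.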
Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted to simulated settings and relatively simple tasks, due to their apparent high sample complexity. In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots. We demonstrate that the training times can be further reduced by parallelizing the algorithm across multiple robots which pool their policy updates asynchronously. Our experimental evaluation shows that our method can learn a variety of 3D manipulation skills in simulation and a complex door-opening skill on real robots without any prior demonstrations or manually designed representations.
We propose an approach for semantic imitation, which uses demonstrations from a source domain, e.g. human videos, to accelerate reinforcement learning (RL) in a different target domain, e.g. a robotic manipulator in a simulated kitchen. Instead of imitating low-level actions like joint velocities, our approach imitates the sequence of demonstrated semantic skills like "opening the microwave" or "turning on the stove". This allows us to transfer demonstrations across environments (e.g. real-world to simulated kitchen) and agent embodiments (e.g. bimanual human demonstration to robotic arm). We evaluate on three challenging cross-domain learning problems and match the performance of demonstration-accelerated RL approaches that require in-domain demonstrations. In a simulated kitchen environment, our approach learns long-horizon robot manipulation tasks, using less than 3 minutes of human video demonstrations from a real-world kitchen. This enables scaling robot learning via the reuse of demonstrations, e.g. collected as human videos, for learning in any number of target domains.