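Training deep reinforcement learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behavior. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI, referred to as extended random force injection (ERFI), by introducing an episodic actuation offset. We demonstrate that ERFI provides additional robustness to variations in system mass, offering on average a 61% improvement in performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.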
Translated by Google Translate
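The RFI and ERFI strategy summarized in the first abstract can be illustrated with a minimal sketch. The abstract does not state the sampling distribution, the magnitudes, or where exactly the perturbation enters the simulation, so the following assumes uniform noise added to the commanded joint torques. The class `ForceInjector` and the parameters `step_limit` and `offset_limit` are hypothetical names chosen for illustration, not the authors' implementation.

```python
import random


class ForceInjector:
    """Hypothetical sketch of random force injection with an episodic offset.

    Assumptions (not taken from the abstract): perturbations are added to
    joint torques and are drawn from uniform distributions.
    """

    def __init__(self, num_joints, step_limit, offset_limit, rng=None):
        self.num_joints = num_joints
        self.step_limit = step_limit      # bound on the per-step random torque (RFI part)
        self.offset_limit = offset_limit  # bound on the per-episode offset (ERFI part)
        self.rng = rng or random.Random()
        self.offset = [0.0] * num_joints

    def reset(self):
        """Call at the start of each episode: resample the episodic offset."""
        self.offset = [
            self.rng.uniform(-self.offset_limit, self.offset_limit)
            for _ in range(self.num_joints)
        ]

    def perturb(self, torques):
        """Call at every control step: add the fixed offset plus fresh noise."""
        return [
            tau + self.offset[i] + self.rng.uniform(-self.step_limit, self.step_limit)
            for i, tau in enumerate(torques)
        ]
```

Setting `offset_limit` to zero reduces the sketch to plain per-step injection, which corresponds to RFI as described above. The offset stays constant within an episode and changes only on `reset()`.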
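Planning and executing agile locomotion maneuvers has been a longstanding challenge in legged robotics. It requires deriving motion plans and local feedback policies in real time to handle the nonholonomy of the kinetic momenta. To this end, we propose a hybrid predictive controller that considers the robot's actuation limits and full-body dynamics. It combines the feedback policies with tactile information to locally predict future actions. Thanks to a feasibility-driven approach, it converges within a few milliseconds. Our predictive controller enables ANYmal robots to generate agile maneuvers in realistic scenarios. A crucial element is tracking the local feedback policies because, in contrast to whole-body control, they achieve the desired angular momentum. To the best of our knowledge, our predictive controller is the first to handle actuation limits, generate agile locomotion maneuvers, and execute optimal feedback policies for low-level torque control without the use of a separate whole-body controller.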
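Over the years, the separate fields of motion planning, mapping, and human trajectory prediction have advanced considerably. However, the literature is still sparse in providing practical frameworks that enable mobile manipulators to perform whole-body movements and account for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost required to update the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce the computation time compared to calculating distance fields from scratch. We integrate this technique with a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling reactive and pre-emptive motion planning that incorporates predicted motions. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate our resultant framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. The Oxford-IHM dataset is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while they are tracked with a motion-capture system.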
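To make the idea of a composite distance field concrete, here is a small CPU sketch. It assumes that the composite field is the pointwise minimum of a precomputed static-scene field and the analytic fields of predicted obstacles, and that obstacles are modeled as circles on a 2D grid. Both are illustrative assumptions, as are the function names; the abstract describes a GPU-accelerated method whose details are not given here.

```python
import math


def circle_sdf(px, py, cx, cy, radius):
    """Signed distance from point (px, py) to a circle: negative inside."""
    return math.hypot(px - cx, py - cy) - radius


def composite_sdf(static_field, predicted_obstacles, cell_size):
    """Combine a static distance field with predicted obstacle positions.

    static_field: 2D list indexed [row][col] of signed distances to the
        static scene, assumed to be computed once and reused.
    predicted_obstacles: list of (cx, cy, radius) tuples for one predicted
        time step (hypothetical circular obstacle model).
    cell_size: edge length of a grid cell; distances are evaluated at cell centers.
    """
    composite = []
    for row_idx, row in enumerate(static_field):
        out_row = []
        for col_idx, static_dist in enumerate(row):
            x = (col_idx + 0.5) * cell_size
            y = (row_idx + 0.5) * cell_size
            dist = static_dist
            for cx, cy, radius in predicted_obstacles:
                # The minimum over fields gives the distance to the nearest surface
                # outside all obstacles; inside overlapping shapes it is approximate.
                dist = min(dist, circle_sdf(x, y, cx, cy, radius))
            out_row.append(dist)
        composite.append(out_row)
    return composite
```

Because the static field is reused, only the cheap per-obstacle term has to be re-evaluated for each predicted time step, which is the kind of saving over recomputing the whole field from scratch that the abstract refers to.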
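Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can typically be varied by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The synthesis of gaits with unexpected operational characteristics, or even the blending of dynamic manoeuvres, lies beyond the capabilities of the current state of the art. In this work we address this limitation by learning a latent space that captures the key stance phases constituting a particular gait. This is achieved via a generative model trained on a single trot style, which encourages disentanglement such that applying a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, footstep height, and full stance duration. Due to the nature of our approach, these synthesised gaits are continuously variable online during robot operation and robustly capture a richness of movement significantly exceeding the relatively narrow behaviour seen during training. In addition, the use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on a real quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles while being robust and reactive to external perturbations.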
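Path planning and collision avoidance are challenging in complex and highly variable environments due to the limited horizon of events. In the literature, there are multiple model-based and learning-based approaches that require significant computational resources to be deployed effectively and that may have limited generality. We propose a planning algorithm based on a globally stable passive controller that can plan smooth trajectories using limited computational resources in challenging environmental conditions. The architecture combines the recently proposed fractal impedance controller with regions of finite time invariance. As the method is based on an impedance controller, it can also be used directly as a force/torque controller. We validated our method in simulation to analyse its ability to navigate interactively in challenging concave domains via the issuing of via-points, and its robustness to low-bandwidth feedback. A swarm simulation using 11 agents validated the scalability of the proposed method. We have performed hardware experiments on a holonomic wheeled platform to validate the smoothness and robustness of interaction with dynamic agents (i.e., humans and robots). The computational complexity of the proposed local planner enables deployment on low-power microcontrollers, lowering the energy consumption compared to other methods that rely upon numerical optimisation.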
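Differential dynamic programming (DDP) is a direct single shooting method for trajectory optimization. Its efficiency derives from the exploitation of temporal structure (inherent to optimal control problems) and explicit roll-out/integration of the system dynamics. However, it suffers from numerical instability and, when compared to direct multiple shooting methods, it has limited initialization options (it allows initialization of controls, but not of states) and lacks proper handling of control constraints. In this work, we tackle these issues with a feasibility-driven approach that regulates the dynamic feasibility during the numerical optimization and ensures control limits. Our feasibility search emulates the numerical resolution of a direct multiple shooting problem with only dynamics constraints. We show that our approach (named Box-FDDP) has better numerical convergence than Box-DDP+ (a single shooting method), and that its convergence rate and runtime performance are competitive with state-of-the-art direct transcription formulations solved using the interior point and active set algorithms available in Knitro. We further show that Box-FDDP decreases the dynamic feasibility error monotonically, as state-of-the-art nonlinear programming algorithms do. We demonstrate the benefits of our approach by generating complex and athletic motions for quadruped and humanoid robots. Finally, we highlight that Box-FDDP is suitable for model predictive control in legged robots.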
Markowitz mean-variance portfolios with sample mean and covariance as input parameters feature numerous issues in practice. They perform poorly out of sample due to estimation error, and they exhibit extreme weights together with high sensitivity to changes in input parameters. The heavy-tail characteristics of financial time series are in fact the cause of these erratic fluctuations of weights, which consequently create substantial transaction costs. To robustify the weights, we present a toolbox for stabilizing costs and weights for global minimum Markowitz portfolios. Utilizing a projected gradient descent (PGD) technique, we avoid the estimation and inversion of the covariance operator as a whole and concentrate on robust estimation of the gradient descent increment. Using modern tools of robust statistics, we construct a computationally efficient estimator with almost Gaussian properties based on median-of-means uniformly over weights. This robustified Markowitz approach is confirmed by empirical studies on equity markets. We demonstrate that robustified portfolios reach the lowest turnover compared to shrinkage-based and constrained portfolios while preserving or slightly improving out-of-sample performance.
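The combination of projected gradient descent with a median-of-means gradient estimate described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's estimator: returns are assumed to be already centered, the median is taken coordinate-wise over block means, and the only constraint is the budget constraint that weights sum to one. The function names, the block count, the step size, and the iteration count are all hypothetical.

```python
import random
import statistics


def mom_gradient(returns, weights, num_blocks):
    """Median-of-means estimate of the variance gradient 2 * Sigma * w.

    Each block averages the per-sample terms 2 * (r . w) * r, and the
    coordinate-wise median over blocks is returned. Samples left over after
    forming equal-sized blocks are ignored.
    """
    n_assets = len(weights)
    block_size = len(returns) // num_blocks
    block_means = []
    for b in range(num_blocks):
        block = returns[b * block_size:(b + 1) * block_size]
        grad = [0.0] * n_assets
        for r in block:
            port = sum(ri * wi for ri, wi in zip(r, weights))  # portfolio return
            for i in range(n_assets):
                grad[i] += 2.0 * port * r[i] / len(block)
        block_means.append(grad)
    return [statistics.median([g[i] for g in block_means]) for i in range(n_assets)]


def project_to_budget(weights):
    """Euclidean projection onto the hyperplane sum(w) == 1."""
    shift = (sum(weights) - 1.0) / len(weights)
    return [w - shift for w in weights]


def robust_min_variance(returns, num_blocks=10, step=0.05, iters=200):
    """Projected gradient descent from equal weights using the robust gradient."""
    n_assets = len(returns[0])
    weights = [1.0 / n_assets] * n_assets
    for _ in range(iters):
        grad = mom_gradient(returns, weights, num_blocks)
        weights = project_to_budget([w - step * g for w, g in zip(weights, grad)])
    return weights
```

Note that no covariance matrix is formed or inverted at any point: each iteration only needs the gradient at the current weights, which is the quantity the abstract proposes to estimate robustly. The step size must be chosen relative to the scale of the returns, and the value used here is only suitable for the toy scale assumed in this sketch.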
A critical step in sharing semantic content online is to map a structured data source to a public domain ontology. This problem is known as the Relational-To-Ontology Mapping Problem (Rel2Onto). Manually modeling the semantics of data requires substantial effort and expertise. Therefore, an automatic approach for learning the semantics of a data source is desirable. Most of the existing work studies the semantic annotation of source attributes. However, research on automatically inferring the relationships between attributes, although critical, is very limited. In this paper, we propose a novel method for semantically annotating structured data sources using machine learning, graph matching, and modified frequent subgraph mining to amend the candidate model. In our work, a knowledge graph is used as prior knowledge. Our evaluation shows that our approach outperforms two state-of-the-art solutions in difficult cases where only a few semantic models are known.
Using robots in educational contexts has already been shown to be beneficial for a student's learning and social behaviour. To elevate them to the next level of providing more effective and human-like tutoring, the ability to adapt to the user and to express proactivity is fundamental. By acting proactively, intelligent robotic tutors anticipate possible situations where problems for the student may arise and act in advance to prevent negative outcomes. Still, the decisions of when and how to behave proactively are open questions. Therefore, this paper investigates how the student's cognitive-affective states can be used by a robotic tutor to trigger proactive tutoring dialogue, with the aim of improving the learning experience. To this end, a concept learning task scenario was observed in which a robotic assistant proactively helped when negative user states were detected. In a learning task, the user's states of frustration and confusion were deemed to have negative effects on the outcome of the task and were used to trigger proactive behaviour. In an empirical user study with 40 undergraduate and doctoral students, we studied whether the initiation of proactive behaviour after the detection of signs of confusion and frustration improves the student's concentration and trust in the agent. Additionally, we investigated which level of proactive dialogue is useful for promoting the student's concentration and trust. The results show that highly proactive behaviour harms trust, especially when triggered during negative cognitive-affective states, but contributes to keeping the student focused on the task when triggered in these states. Based on our study results, we further discuss future steps for improving the proactive assistance of robotic tutoring systems.
We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages. By leveraging a quantized representation of speech as a target, Mu$^{2}$SLAM trains the speech-text models with a sequence-to-sequence masked denoising objective similar to T5 on the decoder and a masked language modeling (MLM) objective on the encoder, for both unlabeled speech and text, while utilizing the supervised tasks to improve cross-lingual and cross-modal representation alignment within the model. On CoVoST AST, Mu$^{2}$SLAM establishes a new state-of-the-art for models trained on public datasets, improving on xx-en translation over the previous best by 1.9 BLEU points and on en-xx translation by 1.1 BLEU points. On Voxpopuli ASR, our model matches the performance of an mSLAM model fine-tuned with an RNN-T decoder, despite using a relatively weaker sequence-to-sequence architecture. On text understanding tasks, our model improves by more than 6\% over mSLAM on XNLI, getting closer to the performance of mT5 models of comparable capacity on XNLI and TydiQA, paving the way towards a single model for all speech and text understanding tasks.