Fine-grained capturing of 3D human-object interaction (HOI) boosts human activity understanding and facilitates downstream visual tasks, including action recognition, holistic scene reconstruction, and human motion synthesis. Despite its significance, existing works mostly assume that humans interact with rigid objects using only a few body parts, limiting their scope. In this paper, we address the challenging problem of full-body articulated human-object interaction (f-AHOI), wherein whole human bodies interact with articulated objects whose parts are connected by movable joints. We present CHAIRS, a large-scale motion-captured f-AHOI dataset consisting of 16.2 hours of versatile interactions between 46 participants and 81 articulated and rigid sittable objects. CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process, as well as realistic and physically plausible full-body interactions. We show the value of CHAIRS with object pose estimation. By learning the geometrical relationships in HOI, we devise the very first model that leverages human pose estimation to tackle the estimation of articulated object poses and shapes during whole-body interactions. Given an image and an estimated human pose, our model first reconstructs the pose and shape of the object, then optimizes the reconstruction according to a learned interaction prior. Under both evaluation settings (i.e., with or without knowledge of the objects' geometries/structures), our model significantly outperforms baselines. We hope CHAIRS will promote the community towards finer-grained interaction understanding. We will make the data/code publicly available.
If scientific discovery is one of the main driving forces of human progress, insight is the fuel for the engine, and it has long attracted behavior-level research that seeks to understand and model its underlying cognitive process. However, current tasks that abstract scientific discovery mostly focus on the emergence of insight, ignoring the special role played by domain knowledge. In this concept paper, we view scientific discovery as an interplay between $thinking \ out \ of \ the \ box$, which actively seeks insightful solutions, and $thinking \ inside \ the \ box$, which generalizes on conceptual domain knowledge to stay correct. Accordingly, we propose Mindle, a semantic searching game that triggers scientific-discovery-like thinking spontaneously, as infrastructure for exploring scientific discovery on a large scale. On this basis, the meta-strategies for insights and the usage of concepts can be investigated reciprocally. In the pilot studies, several interesting observations inspire elaborated hypotheses on meta-strategies, context, and individual diversity for further investigations.
We study the hidden-action principal-agent problem in an online setting. In each round, the principal posts a contract that specifies the payment to the agent based on each outcome. The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal. The principal observes the outcome and receives utility from the agent's choice of action. Based on past observations, the principal dynamically adjusts the contracts with the goal of maximizing her utility. We introduce an online learning algorithm and provide an upper bound on its Stackelberg regret. We show that when the contract space is $[0,1]^m$, the Stackelberg regret is upper bounded by $\widetilde O(\sqrt{m} \cdot T^{1-C/m})$, and lower bounded by $\Omega(T^{1-1/(m+2)})$. This result shows that exponential-in-$m$ samples are both sufficient and necessary to learn a near-optimal contract, resolving an open problem on the hardness of online contract design. When contracts are restricted to some subset $\mathcal{F} \subset [0,1]^m$, we define an intrinsic dimension of $\mathcal{F}$ that depends on the covering number of the spherical code in the space and bound the regret in terms of this intrinsic dimension. When $\mathcal{F}$ is the family of linear contracts, the Stackelberg regret grows exactly as $\Theta(T^{2/3})$. The contract design problem is challenging because the utility function is discontinuous. Bounding the discretization error in this setting has been an open problem. In this paper, we identify a limited set of directions in which the utility function is continuous, allowing us to design a new discretization method and bound its error. This approach enables the first upper bound with no restrictions on the contract and action space.
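The hidden-action setting and the discontinuity of the principal's utility can be illustrated with a small sketch. This is not the paper's online algorithm: it assumes the outcome distributions and costs are known and simply scans a uniform grid of linear contracts. The toy instance (`OUTCOME_PROBS`, `REWARDS`, `COSTS`) and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def best_response(contract: Sequence[float],
                  outcome_probs: Sequence[Sequence[float]],
                  costs: Sequence[float]) -> int:
    """Index of the action maximizing the agent's expected payment minus cost.

    Ties are broken in favor of the lowest action index (an assumption).
    """
    def agent_utility(action: int) -> float:
        payment = sum(p * t for p, t in zip(outcome_probs[action], contract))
        return payment - costs[action]

    return max(range(len(costs)), key=agent_utility)


def principal_utility(contract: Sequence[float],
                      outcome_probs: Sequence[Sequence[float]],
                      rewards: Sequence[float],
                      costs: Sequence[float]) -> float:
    """Expected reward minus payment, given the agent's best response."""
    action = best_response(contract, outcome_probs, costs)
    return sum(p * (r - t)
               for p, r, t in zip(outcome_probs[action], rewards, contract))


def best_linear_contract_on_grid(outcome_probs: Sequence[Sequence[float]],
                                 rewards: Sequence[float],
                                 costs: Sequence[float],
                                 grid_size: int) -> Tuple[float, float]:
    """Scan linear contracts t_j = alpha * r_j for alpha in {0, 1/K, ..., 1}."""
    best_alpha, best_value = 0.0, float("-inf")
    for k in range(grid_size + 1):
        alpha = k / grid_size
        contract = [alpha * r for r in rewards]
        value = principal_utility(contract, outcome_probs, rewards, costs)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value


# Hypothetical toy instance: two outcomes (failure, success), two actions.
OUTCOME_PROBS = [[1.0, 0.0],   # action 0 ("shirk") always fails
                 [0.0, 1.0]]   # action 1 ("work") always succeeds
REWARDS = [0.0, 1.0]           # principal's reward per outcome
COSTS = [0.0, 0.25]            # agent's cost per action
```

In this toy instance the agent switches from shirking to working once the share `alpha` exceeds 0.25, so the principal's utility jumps from 0 to roughly 0.75 at that point. A coarse grid with `grid_size=4` returns `(0.5, 0.5)`, while finer grids get closer to the threshold, which is the kind of discretization error the abstract refers to.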
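Tracking position and orientation independently affords more agile maneuvers for over-actuated multirotor unmanned aerial vehicles (UAVs), while introducing undesired downwash effects; downwash flows generated by thrust generators may counteract one another due to their close proximity, which significantly threatens the stability of the platform. The complexity of modeling aerodynamic airflow makes it difficult for control algorithms to compensate properly for this side effect. Leveraging the input redundancy in the allocation of over-actuated UAVs, we tackle this issue with a novel control allocation framework that accounts for downwash effects and explores the entire allocation space for an optimal solution. This optimal solution avoids downwash effects while providing high thrust efficiency within the hardware constraints. To the best of our knowledge, ours is the first formal derivation to investigate downwash effects on over-actuated UAVs. We verify our framework on different hardware configurations in both simulation and experiment.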
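We design a 3D scene graph representation, contact graph+ (CG+), for efficient sequential task planning. Augmented with predicate-like attributes, this contact-graph-based representation abstracts scene layouts with succinct geometric information and valid robot-scene interactions. Goal configurations, naturally specified on contact graphs, can be produced by a genetic algorithm with a stochastic optimization method. A task plan is then initialized by computing the graph edit distance (GED) between the initial contact graph and the goal configuration, which yields graph edit operations corresponding to possible robot actions. We finalize the task plan by imposing constraints that regulate the temporal feasibility of the graph edit operations, ensuring valid task and motion correspondences. In a series of simulations and experiments, robots successfully complete complex sequential rearrangement tasks that are difficult to specify in conventional planning languages such as the Planning Domain Definition Language (PDDL), demonstrating the high feasibility and potential of robot sequential task planning on contact graphs.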
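The idea of turning graph edits into an ordered task plan can be sketched as follows. This is a simplified stand-in, not the CG+ planner or a general GED solver: it assumes a scene is a tree stored as a child-to-supporter map, considers only re-parenting edits over a fixed object set, and uses two hypothetical feasibility rules (an object can move only when nothing rests on it, and only onto a supporter that has already reached its own goal).

```python
from typing import Dict, List, Tuple

Graph = Dict[str, str]        # object -> object that supports it
Move = Tuple[str, str, str]   # (object, old supporter, new supporter)


def edit_operations(initial: Graph, goal: Graph) -> List[Move]:
    """Re-parenting edits that turn `initial` into `goal`.

    Assumes both graphs contain exactly the same objects.
    """
    return [(obj, initial[obj], goal[obj])
            for obj in sorted(initial) if initial[obj] != goal[obj]]


def order_operations(initial: Graph, goal: Graph) -> List[Move]:
    """Order the edits so that every step satisfies the simplified rules."""
    current = dict(initial)
    pending = {obj: (src, dst)
               for obj, src, dst in edit_operations(initial, goal)}
    plan: List[Move] = []
    while pending:
        for obj in sorted(pending):
            src, dst = pending[obj]
            is_clear = obj not in current.values()   # nothing rests on obj
            dst_settled = dst not in pending         # supporter is final
            if is_clear and dst_settled:
                current[obj] = dst
                plan.append((obj, src, dst))
                del pending[obj]
                break
        else:
            # No edit was applicable; a real planner would need an
            # intermediate placement, which this sketch does not attempt.
            raise ValueError("no feasible ordering under the simplified rules")
    return plan


# Hypothetical scene: a cup on a book, with the book and a plate on a table.
INITIAL = {"book": "table", "cup": "book", "plate": "table"}
GOAL = {"book": "shelf", "cup": "table", "plate": "book"}
```

For this scene, `order_operations(INITIAL, GOAL)` moves the cup off the book first, then the book to the shelf, then the plate onto the book. The number of edits (three here) plays the role of a restricted edit distance.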
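We present a robot learning and planning framework that produces effective tool-use strategies with the least joint effort and is capable of handling objects different from those seen in training. Leveraging a finite element method (FEM)-based simulator that reproduces fine-grained, continuous visual and physical effects given observed tool-use events, we identify the essential physical properties contributing to those effects through the proposed Iterative Deepening Symbolic Regression (IDSR) algorithm. We further devise an optimal-control-based motion planning scheme that integrates robot- and tool-specific kinematics and dynamics to produce an effective trajectory that realizes the learned properties. In simulation, we demonstrate that the proposed framework can produce more effective tool-use strategies that differ drastically from the observed ones in two exemplar tasks.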
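Evaluation in machine learning is usually informed by past choices, such as which datasets or metrics to use. This standardization enables comparison on an equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make it easier to follow best practices in model evaluation, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark provides a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.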
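Theoretical ideas and empirical research have shown us a seemingly surprising result: children, even very young ones, learn and think in a manner strikingly similar to scientific reasoning in formal research. Encountering a novel phenomenon, children form hypotheses about the data, draw causal inferences from observation, test their theories via experimentation, and correct their propositions if inconsistencies arise. Rounds of such processes continue until the underlying mechanism is found. Towards building machines that can learn and think like people, a natural question to ask is whether the intelligence we achieve today manages to perform such a scientific thinking process and, if so, at what level. In this work, we devise the EST environment for evaluating scientific thinking ability in artificial agents. Motivated by the stream of research on causal discovery, we build our interactive EST environment on Blicket detection. Specifically, in each episode of EST, an agent is presented with novel observations and asked to figure out the Blicketness of all objects. At each time step, the agent proposes new experiments to validate its hypotheses and updates its current belief. By evaluating reinforcement learning (RL) agents on both symbolic and visual versions of this task, we observe a clear failure of today's learning methods to reach a level of intelligence comparable to humans. Such inefficacy of learning in scientific thinking calls for future research on building humanlike intelligence.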
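The hypothesize, experiment, and update loop described above can be sketched with a toy Blicket detector. This is not the EST environment or its API: it assumes a purely disjunctive detector (it lights up if any placed object is a Blicket) and an agent that keeps every hypothesis consistent with the observations so far. All names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Sequence

Hypothesis = FrozenSet[str]   # the set of objects believed to be Blickets


def detector_lights_up(placed: Iterable[str], blickets: Iterable[str]) -> bool:
    """Assumed disjunctive rule: any Blicket among the placed objects."""
    blicket_set = set(blickets)
    return any(obj in blicket_set for obj in placed)


def all_hypotheses(objects: Sequence[str]) -> List[Hypothesis]:
    """Every subset of the objects is a candidate set of Blickets."""
    return [frozenset(subset)
            for size in range(len(objects) + 1)
            for subset in combinations(objects, size)]


def update_belief(hypotheses: List[Hypothesis],
                  placed: Sequence[str],
                  observed: bool) -> List[Hypothesis]:
    """Keep only the hypotheses that predict the observed detector state."""
    return [h for h in hypotheses
            if detector_lights_up(placed, h) == observed]


def run_episode(objects: Sequence[str],
                true_blickets: Iterable[str]) -> List[Hypothesis]:
    """Test each object alone and return the surviving hypotheses."""
    belief = all_hypotheses(objects)
    for obj in objects:
        experiment = (obj,)
        outcome = detector_lights_up(experiment, true_blickets)
        belief = update_belief(belief, experiment, outcome)
    return belief
```

With three objects `("A", "B", "C")` and `"A"` as the only Blicket, the belief shrinks from eight hypotheses to four, then two, then the single hypothesis `{"A"}`. Testing objects one at a time is the simplest experiment policy under the disjunctive assumption; other detector rules would need different experiments.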
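Latent space energy-based models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling. Fueled by their flexibility in formulation and the strong modeling power of the latent space, recent works built upon them have made interesting attempts at interpretable text modeling. However, latent space EBMs also inherit some flaws from EBMs in data space: the degenerate quality of MCMC sampling in practice can lead to poor generation quality and instability in training, especially on data with complex latent structures. Inspired by recent efforts that leverage diffusion recovery likelihood learning as a cure for the sampling issue, we introduce a novel symbiosis between diffusion models and latent space EBMs in a variational learning framework, coined the latent diffusion energy-based model. We develop a geometric clustering-based regularization jointly with the information bottleneck to further improve the quality of the learned latent space. Experiments on several challenging tasks demonstrate the superior performance of our model on interpretable text modeling over strong counterparts.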
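We devise a cooperative planning framework that generates optimal trajectories for a tethered robot duo tasked with gathering scattered objects spread over a large area using a flexible net. Specifically, the proposed planning framework first produces a set of dense waypoints for each robot, which serve as the initialization for optimization. Next, we formulate an iterative optimization scheme to generate smooth and collision-free trajectories while ensuring cooperation within the robot duo, so that it gathers objects efficiently and properly avoids obstacles. We validate the generated trajectories in simulation and implement them on physical robots using a model reference adaptive controller (MRAC) to handle the unknown dynamics of the carried payloads. In a series of studies, we find that (i) a U-shaped cost function is effective in planning for the cooperative robot duo, and (ii) task efficiency is not always proportional to the length of the tethered net. Given an environment configuration, our framework can gauge the optimal net length. To the best of our knowledge, ours is the first to provide such an estimate for a tethered robot duo.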