Fine-grained capturing of 3D HOI boosts human activity understanding and facilitates downstream visual tasks, including action recognition, holistic scene reconstruction, and human motion synthesis. Despite its significance, existing works mostly assume that humans interact with rigid objects using only a few body parts, limiting their scope. In this paper, we address the challenging problem of f-AHOI, wherein the whole human bodies interact with articulated objects, whose parts are connected by movable joints. We present CHAIRS, a large-scale motion-captured f-AHOI dataset, consisting of 16.2 hours of versatile interactions between 46 participants and 81 articulated and rigid sittable objects. CHAIRS provides 3D meshes of both humans and articulated objects during the entire interactive process, as well as realistic and physically plausible full-body interactions. We show the value of CHAIRS with object pose estimation. By learning the geometrical relationships in HOI, we devise the very first model that leverage human pose estimation to tackle the estimation of articulated object poses and shapes during whole-body interactions. Given an image and an estimated human pose, our model first reconstructs the pose and shape of the object, then optimizes the reconstruction according to a learned interaction prior. Under both evaluation settings (e.g., with or without the knowledge of objects' geometries/structures), our model significantly outperforms baselines. We hope CHAIRS will promote the community towards finer-grained interaction understanding. We will make the data/code publicly available.
translated by 谷歌翻译
Dense retrieval aims to map queries and passages into low-dimensional vector space for efficient similarity measuring, showing promising effectiveness in various large-scale retrieval tasks. Since most existing methods commonly adopt pre-trained Transformers (e.g. BERT) for parameter initialization, some work focuses on proposing new pre-training tasks for compressing the useful semantic information from passages into dense vectors, achieving remarkable performances. However, it is still challenging to effectively capture the rich semantic information and relations about passages into the dense vectors via one single particular pre-training task. In this work, we propose a multi-task pre-trained model, MASTER, that unifies and integrates multiple pre-training tasks with different learning objectives under the bottlenecked masked autoencoder architecture. Concretely, MASTER utilizes a multi-decoder architecture to integrate three types of pre-training tasks: corrupted passages recovering, related passage recovering and PLMs outputs recovering. By incorporating a shared deep encoder, we construct a representation bottleneck in our architecture, compressing the abundant semantic information across tasks into dense vectors. The first two types of tasks concentrate on capturing the semantic information of passages and relationships among them within the pre-training corpus. The third one can capture the knowledge beyond the corpus from external PLMs (e.g. GPT-2). Extensive experiments on several large-scale passage retrieval datasets have shown that our approach outperforms the previous state-of-the-art dense retrieval methods. Our code and data are publicly released in https://github.com/microsoft/SimXNS
translated by 谷歌翻译
Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model. Traditional knowledge distillation methods include response-based methods and feature-based methods. Response-based methods are used the most widely but suffer from lower upper limit of model performance, while feature-based methods have constraints on the vocabularies and tokenizers. In this paper, we propose a tokenizer-free method liberal feature-based distillation (LEAD). LEAD aligns the distribution between teacher model and student model, which is effective, extendable, portable and has no requirements on vocabularies, tokenizer, or model architecture. Extensive experiments show the effectiveness of LEAD on several widely-used benchmarks, including MS MARCO Passage, TREC Passage 19, TREC Passage 20, MS MARCO Document, TREC Document 19 and TREC Document 20.
translated by 谷歌翻译
Time-critical control applications typically pose stringent connectivity requirements for communication networks. The imperfections associated with the wireless medium such as packet losses, synchronization errors, and varying delays have a detrimental effect on performance of real-time control, often with safety implications. This paper introduces multi-service edge-intelligence as a new paradigm for realizing time-critical control over wireless. It presents the concept of multi-service edge-intelligence which revolves around tight integration of wireless access, edge-computing and machine learning techniques, in order to provide stability guarantees under wireless imperfections. The paper articulates some of the key system design aspects of multi-service edge-intelligence. It also presents a temporal-adaptive prediction technique to cope with dynamically changing wireless environments. It provides performance results in a robotic teleoperation scenario. Finally, it discusses some open research and design challenges for multi-service edge-intelligence.
translated by 谷歌翻译
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks. The model is trained in a contrastive manner with weak supervision signals from our curated large-scale text pair dataset (called CCPairs). E5 can be readily used as a general-purpose embedding model for any tasks requiring a single-vector representation of texts such as retrieval, clustering, and classification, achieving strong performance in both zero-shot and fine-tuned settings. We conduct extensive evaluations on 56 datasets from the BEIR and MTEB benchmarks. For zero-shot settings, E5 is the first model that outperforms the strong BM25 baseline on the BEIR retrieval benchmark without using any labeled data. When fine-tuned, E5 obtains the best results on the MTEB benchmark, beating existing embedding models with 40x more parameters.
translated by 谷歌翻译
Generative models for learning combinatorial structures have transformative impacts in many applications. However, existing approaches fail to offer efficient and accurate learning results. Because of the highly intractable nature of the gradient estimation of the learning objective subject to combinatorial constraints. Existing gradient estimation methods would easily run into exponential time/memory space, or incur huge estimation errors due to improper approximation. We develop NEural Lovasz Sampler (Nelson), a neural network based on Lov\'asz Local Lemma (LLL). We show it guarantees to generate samples satisfying combinatorial constraints from the distribution of the constrained Markov Random Fields model (MRF) under certain conditions. We further present a fully differentiable contrastive-divergence-based learning framework on constrained MRF (Nelson-CD). Meanwhile, Nelson-CD being fully differentiable allows us to take advantage of the parallel computing power of GPUs, resulting in great efficiency. Experimental results on three real-world combinatorial problems reveal that Nelson learns to generate 100% valid structures. In comparison, baselines either time out on large-size data sets or fail to generate valid structures, whereas Nelson scales much better with problem size. In addition, Nelson outperforms baselines in various learning metrics, such as log-likelihood and MAP scores.
translated by 谷歌翻译
We propose a new model-based offline RL framework, called Adversarial Models for Offline Reinforcement Learning (ARMOR), which can robustly learn policies to improve upon an arbitrary baseline policy regardless of data coverage. Based on the concept of relative pessimism, ARMOR is designed to optimize for the worst-case relative performance when facing uncertainty. In theory, we prove that the learned policy of ARMOR never degrades the performance of the baseline policy with any admissible hyperparameter, and can learn to compete with the best policy within data coverage when the hyperparameter is well tuned, and the baseline policy is supported by the data. Such a robust policy improvement property makes ARMOR especially suitable for building real-world learning systems, because in practice ensuring no performance degradation is imperative before considering any benefit learning can bring.
translated by 谷歌翻译
知识蒸馏是将知识从强大的教师转移到有效的学生模型的有效方法。理想情况下,我们希望老师越好,学生越好。但是,这种期望并不总是成真。通常,由于教师和学生之间的不可忽略的差距,更好的教师模型通过蒸馏导致不良学生。为了弥合差距,我们提出了一种渐进式蒸馏方法,以进行致密检索。产品由教师渐进式蒸馏和数据进行渐进的蒸馏组成,以逐步改善学生。我们对五个广泛使用的基准,MARCO通道,TREC Passage 19,TREC文档19,MARCO文档和自然问题进行了广泛的实验,其中POD在蒸馏方法中实现了密集检索的最新方法。代码和模型将发布。
translated by 谷歌翻译
我们研究了具有一般函数近似的部分可观察的MDP(POMDP)的外部评估(OPE)。现有的方法,例如顺序重要性采样估计器和拟合-Q评估,受POMDP中的地平线的诅咒。为了解决这个问题,我们通过引入将未来代理作为输入的未来依赖性值函数来开发一种新颖的无模型OPE方法。未来依赖性的价值函数在完全可观察的MDP中起着与经典价值函数相似的角色。我们为未来依赖性价值作为条件矩方程提供了一个新的Bellman方程,将历史记录代理用作仪器变量。我们进一步提出了一种最小值学习方法,以使用新的Bellman方程来学习未来依赖的价值函数。我们获得PAC结果,这意味着我们的OPE估计器是一致的,只要期货和历史包含有关潜在状态和Bellman完整性的足够信息。最后,我们将方法扩展到学习动力学,并在POMDP中建立我们的方法与众所周知的光谱学习方法之间的联系。
translated by 谷歌翻译
当前的论文研究在仅假定最佳值函数可线化的设置中,在设置中样本效率增强学习(RL)。最近已经理解,即使在这种看似强大的假设和对生成模型的访问下,最坏情况的样本复杂性也可能是庞大的(即指数)。我们研究了学习者还可以从专家政策中访问交互式演示的设置,并提出一种统计和计算上有效的算法(DELPHI),用于将探索与专家查询融合。特别是,Delphi需要$ \ tilde {\ Mathcal {o}}(d)$专家查询和$ \ texttt {poly} {poly}(d,h,h,| \ mathcal {a} |,1/\ varepsilon)$探索性样本可证明恢复$ \ varepsilon $ -suboptimal策略。与纯RL方法相比,这对应于样品复杂性的指数改善,而专家输入令人惊讶。与先前的模仿学习(IL)方法相比,我们所需的专家演示数量独立于$ h $和$ 1/\ varepsilon $的对数,而所有先前的工作至少需要两者的线性因素,除了对$ $ $ $的依赖性外, D $。为了确定所需的专家查询数量最少,我们表明,在同一环境中,任何其勘探预算是多项式限制的学习者(在$ d,h,$和$ | \ | \ MATHCAL {a} | $方面,需要至少$ \ tilde \ omega(\ sqrt {d})$ oracle调用以恢复与专家的价值函数竞争的策略。在较弱的假设中,专家的政策是线性的,我们表明下限将增加到$ \ tilde \ omega(d)$。
translated by 谷歌翻译