多模式演示为机器人提供了大量信息,以使世界有意义。但是,当从人类示威中学习感觉运动控制政策时,这种丰度可能并不总是会导致良好的表现。无关的数据模式可能导致状态过度规格,在该状态中包含的模式不仅可以在决策中无用,而且可以改变跨环境的数据分布。州过度规格会导致诸如学习的政策之类的问题,而不是在培训数据分布之外推广。在这项工作中,我们提出了掩盖的模仿学习(MIL),以选择性地使用信息方式来解决状态过度指定。具体来说,我们设计了带有二进制掩码的蒙版策略网络,以阻止某些方式。我们开发了一种双层优化算法,该算法可以学习此面具以准确过滤过度指定的模态。我们从经验上证明,使用Robomimic数据集在包括Mujoco和机器人ARM环境在内的模拟域中的基线算法均优于基线算法,并有效地在收集在真实机器人上收集的多模式数据集中有效地恢复了环境不变的模式。我们的项目网站在以下网址介绍了我们的结果的补充详细信息和视频:https://tinyurl.com/masked-il
translated by 谷歌翻译
转移学习旨在利用预先培训模型的知识来受益。先前的转移学习工作主要是从单个模型转移。但是,随着从不同资源预先训练的深层模型的出现,由具有各种体系结构的各种模型组成的模型中心,预先训练的数据集和学习范式可用。直接将单模传输学习方法应用于每种模型,都会浪费对模型中心的丰富知识,并遭受高计算成本。在本文中,我们提出了一个枢纽 - 校园框架,以实现从模型中心的知识转移。该框架生成数据依赖性途径权重,基于我们在输入级别分配路径路由以确定激活哪些预训练模型并通过了哪些预训练的模型,然后在输出级别设置了途径聚集,以从不同做出预测的模型。可以通过针对特定于任务的损失端对端训练所提出的框架,在该损失中,它将学会探索更好的途径配置并利用每个目标基准的预训练模型中的知识。我们利用嘈杂的途径生成器并设计勘探损失,以进一步探索整个模型中心的不同途径。为了充分利用预训练模型中的知识,每个模型都会通过激活它的特定数据进一步培训,从而确保其性能并增强知识传递。计算机视觉和强化学习任务的实验结果表明,所提出的枢纽式框架实现了模型中心传输学习的最新性能。
translated by 谷歌翻译
Few-shot learning aims to fast adapt a deep model from a few examples. While pre-training and meta-training can create deep models powerful for few-shot generalization, we find that pre-training and meta-training focuses respectively on cross-domain transferability and cross-task transferability, which restricts their data efficiency in the entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is a tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability respectively. Omni-Net further coordinates the parallel flows by routing their representations via the joint-flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which introduces a self-distillation strategy separately on the pre-training and meta-training objectives for boosting knowledge transfer throughout different training stages. Omni-Training is a general framework to accommodate many existing algorithms. Evaluations justify that our single framework consistently and clearly outperforms the individual state-of-the-art methods on both cross-task and cross-domain settings in a variety of classification, regression and reinforcement learning problems.
translated by 谷歌翻译
Adversarial learning has been embedded into deep networks to learn disentangled and transferable representations for domain adaptation. Existing adversarial domain adaptation methods may not effectively align different domains of multimodal distributions native in classification problems. In this paper, we present conditional adversarial domain adaptation, a principled framework that conditions the adversarial adaptation models on discriminative information conveyed in the classifier predictions. Conditional domain adversarial networks (CDANs) are designed with two novel conditioning strategies: multilinear conditioning that captures the crosscovariance between feature representations and classifier predictions to improve the discriminability, and entropy conditioning that controls the uncertainty of classifier predictions to guarantee the transferability. With theoretical guarantees and a few lines of codes, the approach has exceeded state-of-the-art results on five datasets.
translated by 谷歌翻译
Decompilation aims to transform a low-level program language (LPL) (eg., binary file) into its functionally-equivalent high-level program language (HPL) (e.g., C/C++). It is a core technology in software security, especially in vulnerability discovery and malware analysis. In recent years, with the successful application of neural machine translation (NMT) models in natural language processing (NLP), researchers have tried to build neural decompilers by borrowing the idea of NMT. They formulate the decompilation process as a translation problem between LPL and HPL, aiming to reduce the human cost required to develop decompilation tools and improve their generalizability. However, state-of-the-art learning-based decompilers do not cope well with compiler-optimized binaries. Since real-world binaries are mostly compiler-optimized, decompilers that do not consider optimized binaries have limited practical significance. In this paper, we propose a novel learning-based approach named NeurDP, that targets compiler-optimized binaries. NeurDP uses a graph neural network (GNN) model to convert LPL to an intermediate representation (IR), which bridges the gap between source code and optimized binary. We also design an Optimized Translation Unit (OTU) to split functions into smaller code fragments for better translation performance. Evaluation results on datasets containing various types of statements show that NeurDP can decompile optimized binaries with 45.21% higher accuracy than state-of-the-art neural decompilation frameworks.
translated by 谷歌翻译
Nearest-Neighbor (NN) classification has been proven as a simple and effective approach for few-shot learning. The query data can be classified efficiently by finding the nearest support class based on features extracted by pretrained deep models. However, NN-based methods are sensitive to the data distribution and may produce false prediction if the samples in the support set happen to lie around the distribution boundary of different classes. To solve this issue, we present P3DC-Shot, an improved nearest-neighbor based few-shot classification method empowered by prior-driven data calibration. Inspired by the distribution calibration technique which utilizes the distribution or statistics of the base classes to calibrate the data for few-shot tasks, we propose a novel discrete data calibration operation which is more suitable for NN-based few-shot classification. Specifically, we treat the prototypes representing each base class as priors and calibrate each support data based on its similarity to different base prototypes. Then, we perform NN classification using these discretely calibrated support data. Results from extensive experiments on various datasets show our efficient non-learning based method can outperform or at least comparable to SOTA methods which need additional learning steps.
translated by 谷歌翻译
In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, a stylized one is hoped that retains the content from the former while catching style patterns from the latter. However, it is difficult to simultaneously keep well the trade-off between the content details and the style features. To stylize the image with sufficient style patterns, the content details may be damaged and sometimes the objects of images can not be distinguished clearly. For this reason, we present a new transformer-based method named STT for image style transfer and an edge loss which can enhance the content details apparently to avoid generating blurred results for excessive rendering on style features. Qualitative and quantitative experiments demonstrate that STT achieves comparable performance to state-of-the-art image style transfer methods while alleviating the content leak problem.
translated by 谷歌翻译
In contrast to the control-theoretic methods, the lack of stability guarantee remains a significant problem for model-free reinforcement learning (RL) methods. Jointly learning a policy and a Lyapunov function has recently become a promising approach to ensuring the whole system with a stability guarantee. However, the classical Lyapunov constraints researchers introduced cannot stabilize the system during the sampling-based optimization. Therefore, we propose the Adaptive Stability Certification (ASC), making the system reach sampling-based stability. Because the ASC condition can search for the optimal policy heuristically, we design the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm based on the ASC condition. Meanwhile, our algorithm avoids the optimization problem that a variety of constraints are coupled into the objective in current approaches. When evaluated on ten robotic tasks, our method achieves lower accumulated cost and fewer stability constraint violations than previous studies.
translated by 谷歌翻译
The surrogate loss of variational autoencoders (VAEs) poses various challenges to their training, inducing the imbalance between task fitting and representation inference. To avert this, the existing strategies for VAEs focus on adjusting the tradeoff by introducing hyperparameters, deriving a tighter bound under some mild assumptions, or decomposing the loss components per certain neural settings. VAEs still suffer from uncertain tradeoff learning.We propose a novel evolutionary variational autoencoder (eVAE) building on the variational information bottleneck (VIB) theory and integrative evolutionary neural learning. eVAE integrates a variational genetic algorithm into VAE with variational evolutionary operators including variational mutation, crossover, and evolution. Its inner-outer-joint training mechanism synergistically and dynamically generates and updates the uncertain tradeoff learning in the evidence lower bound (ELBO) without additional constraints. Apart from learning a lossy compression and representation of data under the VIB assumption, eVAE presents an evolutionary paradigm to tune critical factors of VAEs and deep neural networks and addresses the premature convergence and random search problem by integrating evolutionary optimization into deep learning. Experiments show that eVAE addresses the KL-vanishing problem for text generation with low reconstruction loss, generates all disentangled factors with sharp images, and improves the image generation quality,respectively. eVAE achieves better reconstruction loss, disentanglement, and generation-inference balance than its competitors.
translated by 谷歌翻译
A storyboard is a roadmap for video creation which consists of shot-by-shot images to visualize key plots in a text synopsis. Creating video storyboards however remains challenging which not only requires association between high-level texts and images, but also demands for long-term reasoning to make transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS) which aims to retrieve an ordered sequence of images to visualize the text synopsis. We construct a MovieNet-TeViS benchmark based on the public MovieNet dataset. It contains 10K text synopses each paired with keyframes that are manually selected from corresponding movies by considering both relevance and cinematic coherence. We also present an encoder-decoder baseline for the task. The model uses a pretrained vision-and-language model to improve high-level text-image matching. To improve coherence in long-term shots, we further propose to pre-train the decoder on large-scale movie frames without text. Experimental results demonstrate that our proposed model significantly outperforms other models to create text-relevant and coherent storyboards. Nevertheless, there is still a large gap compared to human performance suggesting room for promising future work.
translated by 谷歌翻译