Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Understanding what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a scene if that scene was observed from another viewpoint. To do this we have created a new version of the CLEVR dataset that we call CLEVR Mental Rotation Tests (CLEVR-MRT). Using CLEVR-MRT we examine standard methods, show how they fall short, then explore novel neural architectures that involve inferring volumetric representations of a scene. These volumes can be manipulated via camera-conditioned transformations to answer the question. We examine the efficacy of different model variants through rigorous ablations and demonstrate the efficacy of volumetric representations.
translated by 谷歌翻译
不依赖虚假相关性的学习预测因素涉及建立因果关系。但是,学习这样的表示非常具有挑战性。因此,我们制定了从高维数据中学习因果表示的问题,并通过合成数据研究因果恢复。这项工作引入了贝叶斯因果发现的潜在变量解码器模型BCD,并在轻度监督和无监督的环境中进行实验。我们提出了一系列合成实验,以表征因果发现的重要因素,并表明将已知的干预靶标用作标签有助于无监督的贝叶斯推断,对线性高斯添加噪声潜在结构性因果模型的结构和参数。
translated by 谷歌翻译
在本文中,我们探讨了基于GAN的少量数据增强用作改善少量分类性能的方法。我们对如何对这样的任务进行微调(其中一项是以课堂开采方式)进行微调的探索,以及对这些模型如何在改善几次分类的情况下进行严格的经验研究。我们确定了与纯粹有监督的制度训练此类生成模型的困难有关的问题,几乎没有例子,以及有关现有作品的评估协议的问题。我们还发现,在这种制度中,分类精度对数据集的类别随机分配方式高度敏感。因此,我们提出了一种半监督的微调方法,作为解决这些问题的更务实的方向。
translated by 谷歌翻译
我们对学习协调的互动代理感兴趣,即$ BUILDER $ - 执行操作但忽略任务的目标 - 以及$架构师$指导建造者以朝着任务的目标指导。我们定义和探索正式的设置,其中人工代理配备了允许它们同时学习任务的机制,同时同时演变共享通信协议。实验符号学领域表明,从先验的未知指示中学习的人类熟练程度。因此,我们从中获取灵感并提出了建筑师构建器问题(ABP):一个不对称的设置,其中建筑师必须学习指导建设者朝构建特定结构。该架构师知道目标结构,但不能在环境中行动,只能向构建器发送任意消息。另一方面的建筑师可以在环境中采取行动,但没有关于手头的任务的知识,必须学会解决它依赖于架构师发送的消息。至关重要的是,消息的含义最初没有在代理商之间定义,而是必须在整个学习中进行协商。在这些约束下,我们建议建筑师构建器迭代(abig),一个解决方案到架构师 - 建筑师的问题,其中建筑师利用Builder的学习模型指导它,同时构建器使用自模仿学习来加强其导游行为。我们分析ABIG的关键学习机制,并在ABP的二维实例化中测试,其中任务涉及抓取立方体,将它们放在给定位置或构建各种形状。在这种环境中,ABIG导致低级,高频,指导通信协议,不仅使建筑师构建器对能够在手头上解决任务,而且还可以概括到未操作任务。
translated by 谷歌翻译
Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. State-of-theart methods typically encode the SDF with a large, fixedsize neural network to approximate complex shapes with implicit surfaces. Rendering with these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making these representations impractical for real-time graphics. We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality. We represent implicit surfaces using an octree-based feature volume which adaptively fits shapes with multiple discrete levels of detail (LODs), and enables continuous LOD with SDF interpolation. We further develop an efficient algorithm to directly render our novel neural SDF representation in real-time by querying only the necessary LODs with sparse octree traversal. We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works. Furthermore, it produces state-of-the-art reconstruction quality for complex shapes under both 3D geometric and 2D image-space metrics.
translated by 谷歌翻译
Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology, $\textit{e.g.}$, to predict protein structure and molecular bioactivity, plan organic synthesis, and design molecules $\textit{de novo}$. While most of the deep learning efforts in drug discovery have focused on ligand-based approaches, structure-based drug discovery has the potential to tackle unsolved challenges, such as affinity prediction for unexplored protein targets, binding-mechanism elucidation, and the rationalization of related chemical kinetic properties. Advances in deep learning methodologies and the availability of accurate predictions for protein tertiary structure advocate for a $\textit{renaissance}$ in structure-based approaches for drug discovery guided by AI. This review summarizes the most prominent algorithmic concepts in structure-based deep learning for drug discovery, and forecasts opportunities, applications, and challenges ahead.
translated by 谷歌翻译
Managing novelty in perception-based human activity recognition (HAR) is critical in realistic settings to improve task performance over time and ensure solution generalization outside of prior seen samples. Novelty manifests in HAR as unseen samples, activities, objects, environments, and sensor changes, among other ways. Novelty may be task-relevant, such as a new class or new features, or task-irrelevant resulting in nuisance novelty, such as never before seen noise, blur, or distorted video recordings. To perform HAR optimally, algorithmic solutions must be tolerant to nuisance novelty, and learn over time in the face of novelty. This paper 1) formalizes the definition of novelty in HAR building upon the prior definition of novelty in classification tasks, 2) proposes an incremental open world learning (OWL) protocol and applies it to the Kinetics datasets to generate a new benchmark KOWL-718, 3) analyzes the performance of current state-of-the-art HAR models when novelty is introduced over time, 4) provides a containerized and packaged pipeline for reproducing the OWL protocol and for modifying for any future updates to Kinetics. The experimental analysis includes an ablation study of how the different models perform under various conditions as annotated by Kinetics-AVA. The protocol as an algorithm for reproducing experiments using the KOWL-718 benchmark will be publicly released with code and containers at https://github.com/prijatelj/human-activity-recognition-in-an-open-world. The code may be used to analyze different annotations and subsets of the Kinetics datasets in an incremental open world fashion, as well as be extended as further updates to Kinetics are released.
translated by 谷歌翻译
Relation Extraction (RE) has been extended to cross-document scenarios because many relations are not simply described in a single document. This inevitably brings the challenge of efficient open-space evidence retrieval to support the inference of cross-document relations, along with the challenge of multi-hop reasoning on top of entities and evidence scattered in an open set of documents. To combat these challenges, we propose Mr.CoD, a multi-hop evidence retrieval method based on evidence path mining and ranking with adapted dense retrievers. We explore multiple variants of retrievers to show evidence retrieval is an essential part in cross-document RE. Experiments on CodRED show that evidence retrieval with Mr.Cod effectively acquires cross-document evidence that essentially supports open-setting cross-document RE. Additionally, we show that Mr.CoD facilitates evidence retrieval and boosts end-to-end RE performance with effective multi-hop reasoning in both closed and open settings of RE.
translated by 谷歌翻译
Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often result in poor generalization in low-resource settings and do not have the ability to make selective prediction on unknown cases but give a guess from seen relations, hindering the applicability of those approaches. We present NBR, which converts biomedical RE as natural language inference formulation through indirect supervision. By converting relations to natural language hypotheses, NBR is capable of exploiting semantic cues to alleviate annotation scarcity. By incorporating a ranking-based loss that implicitly calibrates abstinent instances, NBR learns a clearer decision boundary and is instructed to abstain on uncertain instances. Extensive experiments on three widely-used biomedical RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in both full-set and low-resource regimes. Our analysis demonstrates that indirect supervision benefits biomedical RE even when a domain gap exists, and combining NLI knowledge with biomedical knowledge leads to the best performance gains.
translated by 谷歌翻译
The state-of-the-art language model-based automatic metrics, e.g. BARTScore, benefiting from large-scale contextualized pre-training, have been successfully used in a wide range of natural language generation (NLG) tasks, including machine translation, text summarization, and data-to-text. Recent studies show that considering both major errors (e.g. mistranslated tokens) and minor errors (e.g. imperfections in fluency) can produce high-quality human judgments. This inspires us to approach the final goal of the evaluation metrics (human-like evaluations) by automatic error analysis. To this end, we augment BARTScore by incorporating the human-like error analysis strategies, namely BARTScore++, where the final score consists of both the evaluations of major errors and minor errors. Experimental results show that BARTScore++ can consistently improve the performance of vanilla BARTScore and outperform existing top-scoring metrics in 20 out of 25 test settings. We hope our technique can also be extended to other pre-trained model-based metrics. We will release our code and scripts to facilitate the community.
translated by 谷歌翻译