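Artificial intelligence has profoundly changed medicinal chemistry through many impressive applications, but the success of these applications depends on large numbers of training samples with high-quality annotations, which seriously limits the wide use of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which helps chemists select high-yield reactions in a new chemical space using only a few experimental trials. To address this challenge, we first propose MetaRF, an attention-based random forest model designed specifically for few-shot yield prediction, in which the attention weights of the random forest are optimized automatically by a meta-learning framework and can be adapted quickly to predict the performance of new reagents given a few additional samples. To improve few-shot learning performance, we further introduce a dimension-reduction-based sampling method that determines which valuable samples should be tested experimentally and then learned from. Our method is evaluated on three different datasets and achieves satisfactory few-shot prediction performance. On high-throughput experimentation (HTE) datasets, the average yield of the top 10 high-yield reactions selected by our method is relatively close to that of the ideal yield selection.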
translated by 谷歌翻译
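As a rough illustration of the attention-weighted aggregation described in the MetaRF abstract, the sketch below combines per-tree predictions with softmax weights. The function names, the softmax parameterization, and the example numbers are assumptions made for illustration; the abstract does not specify how the attention weights are parameterized or how meta-learning updates them.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_forest_predict(tree_predictions: List[float],
                             attention_logits: List[float]) -> float:
    """Combine per-tree yield predictions with softmax attention weights.

    Hypothetical helper: one logit per tree; in a meta-learning setup these
    logits would be the quantities adapted on a few new samples.
    """
    if len(tree_predictions) != len(attention_logits):
        raise ValueError("need exactly one attention logit per tree")
    weights = softmax(attention_logits)
    return sum(w * p for w, p in zip(weights, tree_predictions))


# Three hypothetical trees predicting a reaction yield (percent).
tree_preds = [40.0, 60.0, 80.0]
uniform = attention_forest_predict(tree_preds, [0.0, 0.0, 0.0])
focused = attention_forest_predict(tree_preds, [0.0, 0.0, 10.0])
```

With equal logits the result reduces to the plain forest average; raising one tree's logit shifts the prediction toward that tree.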
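Owing to their low cost and high mobility, unmanned aerial vehicles (UAVs) are now widely used for data acquisition. As the volume of aerial video grows, the demand for parsing these videos automatically is surging. To this end, current research mainly focuses on extracting a holistic feature with convolutions along both the spatial and temporal dimensions. However, these methods are limited by small temporal receptive fields and cannot adequately capture long-term temporal dependencies, which are important for describing complex dynamics. In this paper, we propose a novel deep neural network, termed FuTH-Net, that models not only holistic features but also temporal relations for aerial video classification. Furthermore, the holistic features are refined by multi-scale temporal relations in a novel fusion module to yield more discriminative video representations. More specifically, FuTH-Net employs a two-pathway architecture: (1) a holistic representation pathway that learns a general feature of both frame appearance and short-term temporal variations, and (2) a temporal relation pathway that captures multi-scale temporal relations across arbitrary frames, providing long-term temporal dependencies. A novel fusion module then integrates the two features learned by these pathways spatiotemporally. Our model is evaluated on two aerial video classification datasets, ERA and Drone-Action, and achieves state-of-the-art results. This demonstrates its effectiveness and good generalization across different recognition tasks (event classification and human action recognition). To facilitate further research, we release the code at https://gitlab.lrz.de/ai4eo/reasoning/futh-net.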
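Open-set video anomaly detection (OpenVAD) aims to identify abnormal events in video data when both known anomalies and novel ones are present at test time. Unsupervised models learned solely from normal videos are applicable to any test anomaly but suffer from a high false-positive rate. In contrast, weakly supervised methods are effective at detecting known anomalies but may fail in an open world. We develop a novel weakly supervised method for the OpenVAD problem by integrating evidential deep learning (EDL) and normalizing flows (NFs) into a multiple instance learning (MIL) framework. Specifically, we propose to use graph neural networks and a triplet loss to learn discriminative features for training the EDL classifier, where EDL can identify unknown anomalies by quantifying uncertainty. Moreover, we develop an uncertainty-aware selection strategy to obtain clean anomaly instances and an NFs module to generate pseudo-anomalies. Our method outperforms existing approaches by inheriting the advantages of both the unsupervised NFs and the weakly supervised MIL framework. Experimental results on multiple real-world video datasets show the effectiveness of our method.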
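To make "quantifying uncertainty" concrete, the sketch below uses the standard subjective-logic formulation commonly used in evidential deep learning: non-negative per-class evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, belief masses e_k / S, and uncertainty K / S, where S is the sum of the alpha_k. It is an assumption that the OpenVAD method uses exactly this form; the function name is hypothetical.

```python
from typing import List, Tuple


def edl_opinion(evidence: List[float]) -> Tuple[List[float], float]:
    """Map non-negative per-class evidence to belief masses and uncertainty.

    Standard subjective-logic mapping (assumed here, not taken from the
    abstract): alpha_k = e_k + 1, S = sum(alpha), b_k = e_k / S, u = K / S.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty


# No evidence for either class: maximal uncertainty.
_, u_unknown = edl_opinion([0.0, 0.0])
# Strong evidence for class 0 (e.g. a known anomaly type): low uncertainty.
b_known, u_known = edl_opinion([8.0, 0.0])
```

Belief masses and uncertainty always sum to one, so an input that produces little evidence for every class is flagged by a high uncertainty value rather than by a confident wrong label.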
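Artificial intelligence, particularly through recent advances in deep learning, has achieved exceptional performance on many tasks in fields such as natural language processing and computer vision. Beyond desirable evaluation metrics, these models are often also required to be highly interpretable. Explanations of the process by which a model maps its inputs to its outputs are therefore much sought after. Unfortunately, the black-box nature of current machine learning models remains an unresolved issue, and it prevents researchers from learning about, and providing explanatory descriptions of, a model's behavior and final predictions. In this work, we propose a novel framework based on adversarial inverse reinforcement learning that provides global explanations for decisions made by a reinforcement learning model and captures the intuitive tendencies the model follows by summarizing its decision-making process.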
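We propose a novel framework for classifying large-scale time series data of long duration. Long time series classification (L-TSC) is a challenging problem because the data often contain a large amount of information that is irrelevant to the classification target. Irrelevant periods degrade classification performance, while the relevance of each period is unknown to the system. This paper proposes an uncertainty-aware multiple instance learning (MIL) framework that identifies the most relevant period automatically. Predictive uncertainty makes it possible to design an attention mechanism that forces the MIL model to learn from the periods most likely to be discriminative. Moreover, the predicted uncertainty yields a principled estimator of whether a prediction is trustworthy. To accommodate unreliable predictions, we further incorporate another modality by training a separate model when that modality is available and performing uncertainty-aware fusion to produce the final prediction. A systematic evaluation is conducted on Automatic Identification System (AIS) data, which are collected to identify and track real-world vessels. Empirical results demonstrate that the proposed method can effectively detect vessel types from trajectories, and that uncertainty-aware fusion with another available data modality (synthetic-aperture radar, or SAR, imagery in our experiments) further improves detection accuracy.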
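Most publicly available datasets for image classification are single-labeled, whereas images in everyday life are inherently multi-labeled. This annotation gap causes many pre-trained single-label classification models to fail in practical scenarios. The issue is of greater concern for aerial images: aerial data collected from sensors naturally cover a relatively large land area with multiple labels, while the widely available annotated aerial datasets (e.g., UCM, AID) are single-labeled. Because manually annotating multi-label aerial images would be time- and labor-consuming, we propose a novel self-correction integrated domain adaptation (SCIDA) method for automatic multi-label learning. SCIDA is weakly supervised, i.e., it automatically learns a multi-label image classification model from massive, publicly available single-label images. To achieve this goal, we propose a novel label-wise self-correction (LWC) module to better explore the underlying label correlations. This module also makes unsupervised domain adaptation (UDA) from single-label to multi-label data possible. For training, the proposed model uses only single-label information and requires no prior knowledge of multi-labeled data; it then predicts labels for multi-label aerial images. In our experiments, the proposed model is trained on the single-labeled MAI-AID-S and MAI-UCM-S datasets and tested directly on our collected Multi-scene Aerial Image (MAI) dataset.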
This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Although CNN-based lightweight methods have achieved strong results after years of research, the trade-off between model accuracy and constrained resources still needs improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of a Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of EMO over state-of-the-art methods; e.g., EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.
Supervised question answering (QA) systems rely on domain-specific, human-labeled data for training. Unsupervised QA systems generate their own question-answer training pairs, typically using secondary knowledge sources to do so. Our approach, called PIE-QG, uses Open Information Extraction (OpenIE) to generate synthetic training questions from paraphrased passages and uses the resulting question-answer pairs to train the language model of a state-of-the-art BERT-based QA system. Triples of the form <subject, predicate, object> are extracted from each passage, and questions are formed from subjects (or objects) and predicates, while objects (or subjects) are treated as answers. Experiments on five extractive QA datasets demonstrate that our technique performs on par with existing state-of-the-art QA systems, with the benefit of being trained on an order of magnitude fewer documents and without any recourse to external reference data sources.
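The triple-to-question step can be sketched as below. This is a minimal illustration assuming the triples have already been extracted; the `Triple` type, the `triple_to_qa` helper, and the fixed "what" templates are hypothetical and are not taken from PIE-QG, whose actual question templates the abstract does not describe.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A <subject, predicate, object> triple, e.g. as produced by OpenIE."""
    subject: str
    predicate: str
    obj: str


def triple_to_qa(triple: Triple) -> List[Tuple[str, str]]:
    """Build two synthetic (question, answer) pairs from one triple.

    Hypothetical templates: the first pair keeps subject and predicate and
    asks for the object; the second keeps predicate and object and asks for
    the subject.
    """
    ask_object = f"{triple.subject} {triple.predicate} what?"
    ask_subject = f"What {triple.predicate} {triple.obj}?"
    return [(ask_object, triple.obj), (ask_subject, triple.subject)]


# Hypothetical triple used only for illustration.
pairs = triple_to_qa(Triple("Marie Curie", "discovered", "polonium"))
```

A real pipeline would choose the question word from the answer's entity type and would filter malformed triples; the fixed template here only shows how one triple yields answerable question-answer pairs in both directions.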
Transformers have achieved impressive success on various computer vision tasks. However, most existing studies require pretraining the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) to reach satisfactory performance, and such a dataset is usually unavailable for medical images. Additionally, because of the gap between medical and natural images, the improvement brought by ImageNet-pretrained weights degrades significantly when the weights are transferred to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach designed specifically for medical image classification with a Transformer backbone. BOLT consists of two networks, the online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network's representation of the same patch-embedding tokens under a different perturbation. To make the most of the Transformer on limited medical data, we propose an auxiliary difficulty-ranking task, in which the Transformer must identify which branch (i.e., online or target) is processing the more difficult perturbed tokens. Overall, the Transformer learns to distill transformation-invariant features from the perturbed tokens so that it can simultaneously measure difficulty and maintain the consistency of the self-supervised representations. BOLT is evaluated on three medical image processing tasks: skin lesion classification, knee fatigue fracture grading, and diabetic retinopathy grading. The experimental results validate the superiority of BOLT for medical image classification over ImageNet-pretrained weights and state-of-the-art self-supervised learning approaches.
Knowledge graph embedding (KGE), which maps the entities and relations of a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGE models to infer inductively. To address this challenge, we turn to analogical inference and propose AnKGE, a novel and general self-supervised framework that enhances KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity level, relation level, and triple level. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights into the score function used for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.
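The interpolation of the two scores can be illustrated with a small sketch. The convex-combination form, the function names, and the example scores are assumptions chosen for illustration; the abstract does not give AnKGE's exact score function or say how the adaptive weights are computed.

```python
from typing import Dict, List, Tuple


def interpolate_score(base_score: float, analogy_score: float,
                      weight: float) -> float:
    """Convex combination of a base KGE score and an analogy score.

    Assumed form: (1 - weight) * base + weight * analogy, with weight in
    [0, 1]. A weight of 0 recovers the base model's score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * base_score + weight * analogy_score


def rank_candidates(scores: Dict[str, Tuple[float, float]],
                    weight: float) -> List[str]:
    """Rank candidate entities by combined score, highest first.

    `scores` maps a candidate name to (base_score, analogy_score).
    """
    return sorted(
        scores,
        key=lambda name: interpolate_score(*scores[name], weight),
        reverse=True,
    )


# Hypothetical candidate tails for one (head, relation) query.
candidates = {"a": (0.9, 0.1), "b": (0.5, 0.8)}
base_only = rank_candidates(candidates, weight=0.0)
analogy_only = rank_candidates(candidates, weight=1.0)
```

Sweeping the weight from 0 to 1 shows how the analogy score can change the ranking produced by the base model alone.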