Answering complex logical queries on incomplete knowledge graphs is a challenging task that has been widely studied. Embedding-based methods require training on complex queries and cannot generalize well to out-of-distribution query structures. Recent work frames this task as an end-to-end optimization problem that requires only a pretrained link predictor. However, due to the exponentially large combinatorial search space, the optimal solution can only be approximated, limiting the final accuracy. In this work, we propose QTO (Query Tree Optimization), which can efficiently find the exact optimal solution. QTO finds the optimal solution by a forward-backward propagation on the tree-like computation graph, i.e., the query tree. In particular, QTO utilizes the independence encoded in the query tree to reduce the search space, so that only local computations are involved during the optimization procedure. Experiments on 3 datasets show that QTO obtains state-of-the-art performance on complex query answering, outperforming previous best results by an average of 22%. Moreover, QTO can interpret the intermediate solutions for each of the one-hop atoms in the query with over 90% accuracy.
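The forward-backward idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that the link predictor's scores are given as dense matrices in [0, 1], that conjunction is scored with a product, and that a query tree is written as nested tuples. The names `forward`, `backward`, and `rel` are hypothetical.

```python
import numpy as np

# Hypothetical query-tree encoding (an assumption, not from the abstract):
#   ("anchor", entity_id)          a known entity
#   ("proj", relation, child)      follow `relation` from the child's variable
#   ("and", child_1, child_2, ...) conjunction of sub-queries on one variable

def forward(node, rel):
    """Return t where t[e] is the best truth value of the sub-query at `node`
    when its variable is bound to entity e."""
    kind = node[0]
    if kind == "anchor":
        n = next(iter(rel.values())).shape[0]
        t = np.zeros(n)
        t[node[1]] = 1.0
        return t
    if kind == "proj":
        _, r, child = node
        t_child = forward(child, rel)
        # Local maximisation: best source entity u for every target entity v.
        return (t_child[:, None] * rel[r]).max(axis=0)
    if kind == "and":
        out = forward(node[1], rel)
        for child in node[2:]:
            out = out * forward(child, rel)  # product used as conjunction
        return out
    raise ValueError(f"unknown node kind: {kind}")

def backward(node, entity, rel, trace):
    """Recover the intermediate assignments that achieve the best value.
    For brevity this recomputes forward() instead of caching it."""
    kind = node[0]
    if kind == "proj":
        _, r, child = node
        t_child = forward(child, rel)
        u = int(np.argmax(t_child * rel[r][:, entity]))
        trace.append((r, u, entity))
        backward(child, u, rel, trace)
    elif kind == "and":
        for child in node[1:]:
            backward(child, entity, rel, trace)

# Toy scores over 4 entities for the query: a(e0, x) AND b(x, y), answer y.
rel = {"a": np.zeros((4, 4)), "b": np.zeros((4, 4))}
rel["a"][0, 1], rel["a"][0, 2] = 0.9, 0.5
rel["b"][1, 3], rel["b"][2, 3] = 0.4, 1.0

query = ("proj", "b", ("proj", "a", ("anchor", 0)))
scores = forward(query, rel)
best = int(scores.argmax())
trace = []
backward(query, best, rel, trace)
```

In this toy case the greedy choice x = 1 (score 0.9 on the first hop) is not optimal. The forward pass keeps a value per entity, so the backward pass can pick x = 2, which gives the higher overall score.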
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline that replaces the standard mobile ISPs and can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs and can process Full HD photos in less than 20-50 milliseconds while achieving high-fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
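Multi-Object Tracking (MOT) is one of the most fundamental computer vision tasks and contributes to a variety of video analysis applications. Despite promising recent progress, current MOT research is still limited to a fixed sampling frame rate of the input stream. In fact, we empirically find that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. For a more intelligent tracking solution, we shift the attention of our research to the problem of Frame Rate Agnostic MOT (FRAMOT). In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FRAMOT problem for the first time. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes the frame rate information to aid identity matching across multi-frame-rate inputs, improving the capability of the learned model in handling complex motion-appearance relations in FRAMOT. In addition, the association gap between training and inference is enlarged in FRAMOT, because the post-processing steps not included in training have a larger impact in lower frame rate scenarios. To address this, we propose the Periodic Training Scheme (PTS), which reflects all post-processing steps in training via tracking pattern matching and fusion. Along with the proposed approaches, we make the first attempt to establish an evaluation method for this new task in two different modes, i.e., known frame rate and unknown frame rate, aiming to handle more complex situations. Quantitative experiments on the challenging MOT datasets (FRAMOT version) clearly demonstrate that the proposed approaches handle different frame rates better and thus improve robustness in complicated scenarios.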
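Road network and trajectory representation learning are essential for traffic systems, since the learned representations can be directly used in various downstream tasks (e.g., traffic speed inference and travel time estimation). However, most existing methods only contrast within the same scale, i.e., they treat road networks and trajectories separately and ignore valuable inter-relations. In this paper, we aim to propose a unified framework that jointly learns road network and trajectory representations end-to-end. We design domain-specific augmentations for road-road contrast and trajectory-trajectory contrast respectively, i.e., road segments with their contextual neighbors, and trajectories with replaced and dropped alternatives. On top of that, we further introduce road-trajectory cross-scale contrast to bridge the two scales by maximizing the total mutual information. Unlike existing cross-scale contrastive learning methods on graphs, which only contrast a graph with the nodes that belong to it, the contrast between road segments and trajectories is carefully tailored via novel positive sampling and adaptive weighting strategies. We conduct careful experiments on two real-world datasets with four downstream tasks, demonstrating improved performance and effectiveness. The code is available at https://github.com/mzy94/jclrnt.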
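Integrating multiple online social networks (OSNs) has important implications for many downstream social mining tasks, such as user preference modelling, recommendation, and link prediction. Unfortunately, it is accompanied by growing privacy concerns about leaking sensitive user information. How to fully utilize the data from different online social networks while preserving user privacy remains unsolved. To this end, we propose a cross-network social user embedding framework, namely DP-Crosue, to learn comprehensive representations of users in a privacy-preserving way. We jointly consider information from partially aligned social networks with differential privacy guarantees. In particular, for each heterogeneous social network, we first introduce a hybrid differential privacy notion to capture the variation of privacy expectations for heterogeneous data types. Next, to find user linkages across social networks, we perform unsupervised user-embedding-based alignment, in which the user embeddings are obtained by heterogeneous network embedding techniques. To further enhance user embeddings, a novel cross-network GCN embedding model is designed to transfer knowledge across networks through the aligned users. Extensive experiments on three real-world datasets demonstrate that our approach significantly improves the user interest prediction task as well as the defense of the embeddings against user attribute inference attacks.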
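This paper proposes Salenet, an end-to-end convolutional neural network (CNN) for sustained attention level evaluation using prefrontal electroencephalogram (EEG). A bias-driven pruning method is proposed together with group convolution, global average pooling (GAP), near-zero pruning, weight clustering, and quantization for model compression, achieving a total compression ratio of 183.11x. In this work, the compressed Salenet obtains a state-of-the-art subject-independent sustained attention classification accuracy of 84.2% on the recorded 6-subject EEG database. Salenet is implemented on an Artix-7 FPGA with a competitive power consumption of 0.11 W and an energy efficiency of 8.19 GOPS/W.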
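Image-based 3D detection is an indispensable component of the perception system for autonomous driving. However, it still suffers from unsatisfactory performance, and limited training data is one of the main reasons. Unfortunately, annotating objects in 3D space is extremely time- and resource-consuming, which makes it hard to extend the training set arbitrarily. In this work, we focus on the semi-supervised setting and explore the feasibility of a cheaper alternative, i.e., pseudo-labeling, to leverage unlabeled data. To this end, we conduct extensive experiments to investigate whether pseudo-labels can provide effective supervision for baseline models under varying settings. The experimental results not only demonstrate the effectiveness of the pseudo-labeling mechanism for image-based 3D detection (e.g., under the monocular setting, we achieve 20.23 AP for the moderate level on the KITTI-3D testing set without bells and whistles, from 6.03 AP), but also show several interesting and noteworthy findings (e.g., models trained with pseudo-labels perform better than those trained with ground-truth annotations on the same training data). We hope this work can provide insights for the image-based 3D detection community under the semi-supervised setting. The code, pseudo-labels, and pre-trained models will be made publicly available.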
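Contrastive-based self-supervised learning methods have achieved great success in recent years. However, self-supervision requires extremely long training (e.g., 800 epochs for MoCo v3) to achieve promising results, which is unacceptable for the general academic community and hinders the development of this topic. This work revisits momentum-based contrastive learning frameworks and identifies the inefficiency that two augmented views generate only one positive pair. We propose Fast-MoCo, a novel framework that utilizes combinatorial patches to construct multiple positive pairs from two augmented views, providing abundant supervision signals that bring significant acceleration at negligible extra computational cost. Fast-MoCo trained for 100 epochs achieves 73.5% linear evaluation accuracy, similar to MoCo v3 (ResNet-50 backbone) trained for 800 epochs. Extra training (200 epochs) further improves the result to 75.1%, which is on par with state-of-the-art methods. Experiments on several downstream tasks also confirm the effectiveness of Fast-MoCo.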
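To make the idea of combinatorial patches more concrete, the sketch below shows one way a single augmented view could yield several embeddings instead of one. The details are assumptions for illustration only and are not stated in the abstract: a 2x2 patch grid, combining patch embeddings by averaging, and a stand-in `toy_encoder` in place of a real backbone. All function names are hypothetical.

```python
from itertools import combinations

import numpy as np

def split_patches(img, grid):
    """Split an (H, W, C) image into grid * grid non-overlapping patches."""
    h, w = img.shape[:2]
    ph, pw = h // grid, w // grid
    return [img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            for i in range(grid) for j in range(grid)]

def toy_encoder(patch):
    """Stand-in for a real backbone (assumption): per-channel mean."""
    return patch.reshape(-1, patch.shape[-1]).mean(axis=0)

def combined_embeddings(img, grid=2, k=2):
    """Encode each patch separately, then average every k-subset of patch
    embeddings, giving C(grid * grid, k) embeddings for one view."""
    embs = [toy_encoder(p) for p in split_patches(img, grid)]
    return np.stack([np.mean([embs[i] for i in idx], axis=0)
                     for idx in combinations(range(len(embs)), k)])

# One toy "augmented view": a 4x4 image with 3 channels.
img = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
combos = combined_embeddings(img)  # 2x2 grid, pairs of patches
```

Under these assumptions, each of the resulting embeddings could be paired with the embedding of the other augmented view, so two views produce several positive pairs where the usual setup produces one.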
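Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these graph-based datasets suffer from limited dataset scale, incomplete graph structure, and low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of graph-based bot detection approaches. In this paper, we propose Twibot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including Twibot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented code and datasets into the Twibot-22 evaluation framework, where researchers can consistently evaluate new models and datasets. The Twibot-22 Twitter bot detection benchmark and evaluation framework are publicly available at https://twibot22.github.io/.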