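High-confidence overlap prediction and accurate correspondences are critical for aligning paired point clouds in a partial-to-partial manner. However, there is inherent uncertainty between the overlapping and non-overlapping regions, which has always been neglected and significantly affects registration performance. Going beyond the current wisdom, we propose a novel uncertainty-aware overlap prediction network, dubbed UTOPIC, to tackle the ambiguous overlap prediction problem. To our knowledge, this is the first work to explicitly introduce overlap uncertainty to point cloud registration. Moreover, we induce the feature extractor to implicitly perceive shape knowledge through a completion decoder, and present a geometric relation embedding for the Transformer to obtain transformation-invariant, geometry-aware feature representations. With more reliable overlap scores and more precise dense correspondences, UTOPIC achieves stable and accurate registration results, even for inputs with limited overlapping areas. Extensive quantitative and qualitative experiments on synthetic and real benchmarks demonstrate the superiority of our approach over state-of-the-art methods. Code is available at https://github.com/zhileichen99/utopic.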
Translated by Google Translate
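This paper presents a novel deep neural network framework for RGB-D salient object detection. It controls the message passing between RGB images and depth maps at the feature level, and explores long-range semantic contexts and geometric information in both RGB and depth features to infer salient objects. To achieve this, we formulate a dynamic message propagation (DMP) module with graph neural networks and deformable convolutions to dynamically learn context information and to automatically predict the filter weights and affinity matrices that control message propagation. We further embed this module into a Siamese-based network that processes the RGB image and the depth map separately, and design a multi-level feature fusion (MFF) module to explore cross-level information between the refined RGB and depth features. Compared with 17 state-of-the-art methods on six benchmark datasets for RGB-D salient object detection, experimental results show that our method outperforms all the others, both quantitatively and visually.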
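Convolution on 3D point clouds has been widely studied, yet it remains far from perfect in geometric deep learning. The traditional wisdom of convolution characterizes feature correspondences among 3D points indistinguishably, which is an intrinsic limitation that leads to poorly distinctive feature learning. In this paper, we propose Adaptive Graph Convolution (AGConv) for a wide range of point cloud analysis applications. AGConv generates adaptive kernels for points according to their dynamically learned features. Compared with solutions that use fixed/isotropic kernels, AGConv improves the flexibility of point cloud convolutions, effectively and precisely capturing the diverse relations between points from different semantic parts. Unlike the popular attentional weight schemes, AGConv implements the adaptiveness inside the convolution operation instead of simply assigning different weights to the neighboring points. Extensive evaluations clearly show that our method outperforms state-of-the-art methods for point cloud classification and segmentation on various benchmark datasets. Meanwhile, AGConv can be flexibly adopted by other point cloud analysis approaches to boost their performance. To validate its flexibility and effectiveness, we explore AGConv-based paradigms for completion, denoising, upsampling, registration, and circle extraction, which are comparable or even superior to their competitors. Our code is available at https://github.com/hrzhou2/adaptconv-master.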
An approach to evolutionary ensemble learning for classification is proposed in which boosting is used to construct a stack of programs. Each application of boosting identifies a single champion and a residual dataset, i.e. the training records that have not yet been correctly classified. The next program is trained only against the residual, and the process iterates until some maximum ensemble size is reached or no residual remains. Training against a residual dataset actively reduces the cost of training. Deploying the ensemble as a stack also means that a single classifier may be enough to make a prediction, which improves interpretability. Benchmarking studies are conducted to illustrate competitiveness with the prediction accuracy of current state-of-the-art evolutionary ensemble learning algorithms, while providing solutions that are orders of magnitude simpler. Further benchmarking with a high cardinality dataset indicates that the proposed method is also more accurate and efficient than XGBoost.
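The residual-training loop described in this abstract can be illustrated with a minimal sketch. This is not the authors' method: the evolved programs are replaced here by single-feature threshold rules, and the names `fit_rule`, `fit_stack`, and `predict`, as well as the "first rule that fires answers" deployment, are assumptions made purely for illustration.

```python
from collections import Counter

def fires(rule, row):
    """A rule is (feature, threshold, side, label); it fires on one side of the threshold."""
    f, t, side, _ = rule
    return (row[f] <= t) == (side == "le")

def fit_rule(X, y):
    """Stand-in 'champion': the threshold rule with the best (hits - errors) score."""
    best, best_score = None, None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for side in ("le", "gt"):
                fired = [i for i, row in enumerate(X) if (row[f] <= t) == (side == "le")]
                if not fired:
                    continue
                label, hits = Counter(y[i] for i in fired).most_common(1)[0]
                score = hits - (len(fired) - hits)
                if best_score is None or score > best_score:
                    best, best_score = (f, t, side, label), score
    return best

def fit_stack(X, y, max_size=5):
    """Train each new rule only on the residual: records not yet correctly classified."""
    stack, Xr, yr = [], list(X), list(y)
    default = Counter(y).most_common(1)[0][0]  # fallback label if no rule fires
    while Xr and len(stack) < max_size:
        rule = fit_rule(Xr, yr)
        keep = [i for i, row in enumerate(Xr)
                if not (fires(rule, row) and rule[3] == yr[i])]
        if len(keep) == len(Xr):  # the rule removed nothing, so stop
            break
        stack.append(rule)
        Xr, yr = [Xr[i] for i in keep], [yr[i] for i in keep]
    return stack, default

def predict(stack, default, row):
    """Walk the stack; the first rule that fires makes the prediction."""
    for rule in stack:
        if fires(rule, row):
            return rule[3]
    return default

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]
stack, default = fit_stack(X, y)
predictions = [predict(stack, default, row) for row in X]
```

Because each later rule sees only the shrinking residual, the training set gets smaller at every stage, and many inputs are answered by the first rule alone.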
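We develop a new approach to drifting games, a class of two-person games with many applications to boosting and online learning settings, including prediction with expert advice and the Hedge game. Our approach involves (a) guessing an asymptotically optimal potential by solving an associated partial differential equation (PDE); then (b) justifying the guess by proving upper and lower bounds on the final-time loss whose difference scales like a negative power of the number of time steps. The proofs of our potential-based upper bounds are elementary, using little more than Taylor expansion. The proofs of our potential-based lower bounds are also rather elementary, combining Taylor expansion with probabilistic or combinatorial arguments. Most previous work on asymptotically optimal strategies has used potentials obtained by solving a discrete dynamic programming principle, and those arguments are complicated by their discrete nature. The potentials we use are explicit solutions of PDEs, so our arguments rest on basic calculus. Our approach is not only more elementary, but also yields new potentials and derives corresponding upper and lower bounds that match each other in the asymptotic regime.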
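We consider a batch active learning scenario in which the learner adaptively issues batches of points to a labeling oracle. Sampling labels in batches is highly desirable in practice because it requires fewer rounds of interaction with the labeling oracle (often a human). However, batch active learning typically pays the price of reduced adaptivity, leading to suboptimal results. In this paper, we propose a solution that requires a careful trade-off between the informativeness of the queried points and their diversity. We theoretically investigate batch active learning in the practically relevant scenario where the unlabeled pool of data is available beforehand (pool-based active learning). We analyze a novel stage-wise greedy algorithm and show that, as a function of the label complexity, the excess risk of this algorithm matches the known minimax rates in standard statistical learning settings. Our results also exhibit a mild dependence on the batch size. These are the first theoretical results that use a careful trade-off between informativeness and diversity to rigorously quantify the statistical performance of batch active learning in the pool-based scenario.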
Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB is built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient to build an importance map. We also conduct experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames in the demonstrations.
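The masking-and-retraining idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the R2RISE implementation: `frame_importance` and `toy_return` are hypothetical names, the per-frame normalization (dividing by how often a frame was kept) is one plausible choice, and the expensive "retrain the IL model and measure its environment return" step is replaced by a toy function.

```python
import random

def frame_importance(num_frames, train_and_evaluate, num_masks=500,
                     keep_prob=0.5, seed=0):
    """Average the return over all random masks that kept a given frame."""
    rng = random.Random(seed)
    weighted = [0.0] * num_frames  # sum of returns over masks keeping frame i
    counts = [0] * num_frames      # number of masks keeping frame i
    for _ in range(num_masks):
        mask = [1 if rng.random() < keep_prob else 0 for _ in range(num_frames)]
        # Assumed interface: retrain on the masked demonstration, return a scalar.
        ret = train_and_evaluate(mask)
        for i, kept in enumerate(mask):
            weighted[i] += ret * kept
            counts[i] += kept
    return [w / c if c else 0.0 for w, c in zip(weighted, counts)]

def toy_return(mask):
    """Toy stand-in for retraining: only frames 2 and 5 affect the return."""
    return float(mask[2] + mask[5])

importance = frame_importance(8, toy_return)
top_two = sorted(range(8), key=lambda i: importance[i], reverse=True)[:2]
```

In this toy setting the frames that actually drive the return receive the highest scores, which is the behaviour an importance map is meant to expose.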
Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.