Graph Neural Networks (GNNs), originally proposed for node classification, have also motivated many recent works on edge prediction (a.k.a. link prediction). However, existing methods lack an elaborate design that accounts for two frequently overlooked distinctions between the tasks: (i) edges constitute only the topology in node classification, but can serve as both the topology and the supervision (i.e., labels) in edge prediction; (ii) node classification makes a prediction for each individual node, while edge prediction is determined by each pair of nodes. To this end, we propose a novel edge prediction paradigm named Edge-aware Message PassIng neuRal nEtworks (EMPIRE). Concretely, we first introduce an edge splitting technique that specifies the use of each edge, so that each edge serves solely as either topology or supervision (named a topology edge or a supervision edge). We then develop a new message passing mechanism that generates the messages to source nodes (through topology edges) while being aware of target nodes (through supervision edges). To emphasize the differences between pairs connected by supervision edges and unconnected pairs, we further weight the messages to highlight the relative ones that can reflect the differences. In addition, we design a novel negative node-pair sampling trick that efficiently samples 'hard' negative instances among the supervision instances and can significantly improve performance. Experimental results verify that the proposed method significantly outperforms existing state-of-the-art models on the edge prediction task across multiple homogeneous and heterogeneous graph datasets.
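The edge splitting idea above can be illustrated with a minimal sketch. The abstract does not say how edges are assigned to each role, so the uniform random split, the function name `split_edges`, the `supervision_ratio` parameter, and the toy edge list below are all assumptions made for illustration, not the EMPIRE implementation.

```python
import random


def split_edges(edges, supervision_ratio=0.3, seed=0):
    """Partition edges into disjoint topology and supervision sets.

    Assumption: a uniform random split; the source does not specify the rule.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_sup = int(round(len(shuffled) * supervision_ratio))
    supervision = shuffled[:n_sup]  # used only as labels
    topology = shuffled[n_sup:]     # used only for message passing
    return topology, supervision


# Hypothetical toy graph with 10 undirected edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4),
         (3, 4), (3, 5), (4, 5), (0, 5), (2, 5)]
topology_edges, supervision_edges = split_edges(edges)
```

Each edge ends up in exactly one of the two lists, which is the property the abstract describes: no edge is used as both topology and supervision.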
Translated by Google Translate
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
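Explaining deep convolutional neural networks has recently attracted growing attention, because it helps to understand the networks' internal operations and why they make certain decisions. Saliency maps, which emphasize the salient regions most closely connected to a network's decision, are among the most common ways to visualize and analyze deep networks in the computer vision community. However, the saliency maps generated by existing methods cannot represent the authentic information in images: they rely on unproven proposals for the weights of activation maps, which lack a solid theoretical foundation and fail to consider the relations between pixels. In this paper, we develop a novel post-hoc visual explanation method based on class activation mapping, called Shap-CAM. Unlike previous gradient-based approaches, Shap-CAM removes the dependence on gradients by obtaining the importance of each pixel through Shapley values. We demonstrate that Shap-CAM achieves better visual performance and fairness in interpreting the decision-making process. Our approach outperforms previous methods on both recognition and localization tasks.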
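For readers unfamiliar with Shapley values, the following minimal sketch computes them exactly for a tiny cooperative game. It is not Shap-CAM: the three "players" merely stand in for pixels or regions, and the value function `toy_value`, its weights, and its interaction bonus are hypothetical choices made so the result can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value_fn):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of player i to coalition s.
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


def toy_value(coalition):
    """Hypothetical game: additive weights plus a bonus when 0 and 1 cooperate."""
    weights = {0: 1.0, 1: 2.0, 2: 3.0}
    bonus = 2.0 if {0, 1} <= coalition else 0.0
    return sum(weights[p] for p in coalition) + bonus


phi = shapley_values([0, 1, 2], toy_value)
# The interaction bonus of 2.0 is shared equally by players 0 and 1,
# so phi is approximately {0: 2.0, 1: 3.0, 2: 3.0}.
```

The values sum to the value of the full coalition (8.0 here), the efficiency property that makes Shapley values attractive for attributing a prediction to its inputs. Exact enumeration is only feasible for a handful of players; practical methods approximate it.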
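Prediction over tabular data is an essential and fundamental problem in many important downstream tasks. However, existing methods either take each data instance of the table independently as input, or do not fully exploit multi-row features and labels to directly change and enhance the target data representations. In this paper, we propose to 1) construct a hypergraph from retrieved relevant data instances to model the cross-row and cross-column patterns of those instances, and 2) perform message propagation to enhance the target data instance representation for tabular prediction tasks. Specifically, our specially designed message propagation step benefits from 1) the fusion of labels and features during propagation, and 2) locality-aware high-order feature interactions. Experiments on two important tabular data prediction tasks validate the superiority of the proposed PET model over other baselines. In addition, we demonstrate the effectiveness of the model components and the feature enhancement ability of PET through various ablation studies and visualizations. The code is available at https://github.com/kounianhuadu/pet.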
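High-fidelity reconstruction of fluids from sparse multi-view RGB videos remains a formidable challenge, owing to the complexity of the underlying physics as well as the complex occlusion and lighting in captures. Existing solutions either assume knowledge of obstacles and lighting, or focus only on simple fluid scenes without obstacles or complex lighting, and are therefore unsuitable for real-world scenes with unknown lighting or arbitrary obstacles. We present the first method to reconstruct dynamic fluids by leveraging the governing physics (i.e., the Navier-Stokes equations) in an end-to-end optimization from sparse videos, without taking lighting conditions, geometry information, or boundary conditions as input. We provide a continuous spatio-temporal scene representation, using neural networks as the density and velocity solution functions for the fluid and as the radiance field for static objects. With a hybrid architecture that separates static and dynamic content, fluid interactions with static obstacles are reconstructed for the first time without additional geometry input or human labeling. By augmenting time-varying neural radiance fields with physics-informed deep learning, our method benefits from the supervision of both images and physical priors. To achieve robust optimization from sparse views, we introduce a layer-by-layer growing strategy that progressively increases the network capacity. Using progressively growing models with a new regularization term, we manage to disentangle the density-color ambiguity in radiance fields without overfitting. In addition, a pretrained density-to-velocity fluid model is leveraged as a data prior to avoid suboptimal velocities that underestimate vorticity yet trivially satisfy the physical equations. Our method exhibits high-quality results with relaxed constraints and strong flexibility on a representative set of synthetic and real flow captures.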
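This paper aims to unify spatial dependency and temporal dependency in a non-Euclidean space while capturing the inner spatial dependencies of traffic data. For spatial-temporal attribute entities with a topological structure, space-time is continuous and unified, and the current state of each node is influenced by the past states of its neighbors over periods that vary from neighbor to neighbor. Most spatial-temporal neural networks for traffic forecasting handle spatial dependency and temporal correlation separately, which impairs spatial-temporal integrity, and they ignore the fact that the temporal dependency period of a node's neighbors can be delayed and dynamic. To model this actual condition, we propose TraverseNet, a novel spatial-temporal graph neural network that views space and time as an inseparable whole in order to mine spatial-temporal graphs, while exploiting the evolving spatial-temporal dependencies of each node through a message propagation mechanism. Experiments with ablation and parameter studies have validated the effectiveness of the proposed TraverseNet, and the detailed implementation can be found at https://github.com/nnzhan/traversenet.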
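Despite the recent success of graph neural networks (GNNs), common architectures often exhibit significant limitations, including sensitivity to oversmoothing, long-range dependencies, and spurious edges, e.g., as a result of graph heterophily or adversarial attacks. To at least partially address these issues within a simple, transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely proximal gradient descent and iteratively reweighted least squares (IRLS). The former defines an extensible base GNN architecture that is immune to oversmoothing while still capturing long-range dependencies by allowing arbitrary propagation steps. The latter, in contrast, produces a novel attention mechanism that is explicitly anchored to an underlying end-to-end energy function and contributes stability with respect to edge uncertainty. Combining the two, we obtain an extremely simple yet robust model, which we evaluate across disparate scenarios including standardized benchmarks, adversarially perturbed graphs, graphs with heterophily, and graphs involving long-range dependencies. In doing so, we compare against SOTA GNN approaches that have been explicitly designed for the respective tasks, achieving competitive or superior node classification accuracy. Our code is available at https://github.com/fftyyy/twirls.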
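As background on IRLS, the sketch below applies the classical algorithm to a one-dimensional robust location problem: minimizing the sum of absolute residuals by repeatedly solving a weighted least-squares problem. It is not the GNN layer or attention mechanism from the abstract; the function name `irls_location`, the clamp `eps`, and the data are assumptions chosen for illustration.

```python
def irls_location(xs, n_iters=100, eps=1e-8):
    """Estimate a robust center of xs by iteratively reweighted least squares.

    Each step solves a weighted least-squares problem whose weights
    1 / max(|residual|, eps) down-weight points far from the current estimate,
    which corresponds to minimizing the sum of absolute residuals.
    """
    mu = sum(xs) / len(xs)  # start from the ordinary least-squares solution
    for _ in range(n_iters):
        weights = [1.0 / max(abs(x - mu), eps) for x in xs]
        # Closed-form weighted least-squares solution: the weighted mean.
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical data with one outlier
mean_estimate = sum(data) / len(data)
robust_estimate = irls_location(data)
```

The plain mean (22.0) is pulled far toward the outlier, while the reweighted estimate settles near the median of the data. Down-weighting terms with large residuals is the same mechanism that, per the abstract, lends stability against uncertain edges when the residuals are defined over graph edges.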
Leveraging well-established MCMC strategies, we propose MCMC-interactive variational inference (MIVI) to not only estimate the posterior in a time-constrained manner, but also facilitate the design of MCMC transitions. Constructing a variational distribution followed by a short Markov chain that has parameters to learn, MIVI takes advantage of the complementary properties of variational inference and MCMC to encourage mutual improvement. On the one hand, with the variational distribution locating high posterior density regions, the Markov chain is optimized within the variational inference framework to efficiently target the posterior despite a small number of transitions. On the other hand, the optimized Markov chain, with its considerable flexibility, guides the variational distribution towards the posterior and alleviates its underestimation of uncertainty. Furthermore, we prove that the optimized Markov chain in MIVI admits extrapolation, which means its marginal distribution gets closer to the true posterior as the chain grows. Therefore, the Markov chain can be used separately as an efficient MCMC scheme. Experiments show that MIVI not only accurately and efficiently approximates the posteriors but also facilitates designs of stochastic gradient MCMC and Gibbs sampling transitions.
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, if at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, sequential distillation, etc., revealing that: 1) Distilling token relations is more effective than CLS token- and feature-based distillation; 2) An intermediate layer of the teacher network as the target performs better than the last layer when the depth of the student mismatches that of the teacher; 3) Weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over the scratch MIM pre-training on ImageNet-1K classification, using all the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU in ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.