In contrast to fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, which has recently attracted increasing research attention. This paper presents a novel single-shot instance segmentation approach, namely Box2Mask, which integrates the classical level-set evolution model into deep neural network learning to achieve accurate mask prediction with only bounding box supervision. Specifically, both the input image and its deep features are employed to evolve the level-set curves implicitly, and a local consistency module based on a pixel affinity kernel is used to mine the local context and spatial relations. Two types of single-stage frameworks, i.e., CNN-based and transformer-based frameworks, are developed to empower the level-set evolution for box-supervised instance segmentation, and each framework consists of three essential components: instance-aware decoder, box-level matching assignment and level-set evolution. By minimizing the level-set energy function, the mask map of each instance can be iteratively optimized within its bounding box annotation. The experimental results on five challenging testbeds, covering general scenes, remote sensing, medical and scene text images, demonstrate the outstanding performance of our proposed Box2Mask approach for box-supervised instance segmentation. In particular, with the Swin-Transformer large backbone, our Box2Mask obtains 42.4% mask AP on COCO, which is on par with the recently developed fully mask-supervised methods. The code is available at: https://github.com/LiWentomng/boxlevelset.
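In this paper, we present the details of the Women in Computer Vision Workshop - WiCV 2022, organized alongside the hybrid CVPR 2022 in New Orleans, Louisiana. It provides a voice to a minority (female) group in the computer vision community and focuses on increasing the visibility of these researchers, both in academia and in industry. WiCV believes that such an event can play an important role in lowering the gender imbalance in the field of computer vision. WiCV is organized each year and provides a) opportunities for collaboration between researchers from minority groups, b) mentorship for female junior researchers, c) financial support to presenters to overcome monetary burdens, and d) a large choice of role models who can serve as examples to younger researchers at the beginning of their careers. In this paper, we present a report on the workshop program, trends over the past years, and a summary of statistics regarding presenters, attendees, and sponsorship for the WiCV 2022 workshop.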
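In contrast to fully supervised methods that use pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, and it has recently attracted considerable research attention. In this paper, we propose a novel single-shot box-supervised instance segmentation approach that carefully integrates the classical level-set model with a deep neural network. Specifically, our proposed method iteratively learns a series of level sets through a continuous Chan-Vese energy-based function in an end-to-end fashion. A simple mask-supervised SOLOv2 model is adapted to predict the instance-aware mask map as the level set for each instance. Both the input image and its deep features are employed as input data to evolve the level-set curves, where a box projection function is used to obtain the initial boundary. By minimizing the fully differentiable energy function, the level set for each instance is iteratively optimized within its corresponding bounding box annotation. Experimental results on four challenging benchmarks demonstrate the leading performance of our proposed approach to robust instance segmentation in various scenarios. The code is available at: https://github.com/liwentomng/boxlevelset.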
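The region-based energy that the abstract above minimizes can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified Chan-Vese-style energy that omits the curve-length regularizer, works on a single-channel input, and uses plain Python lists. The function name `chan_vese_energy` and the half-open `(x0, y0, x1, y1)` box convention are assumptions made for illustration.

```python
def chan_vese_energy(image, mask, box):
    """Simplified region-based Chan-Vese energy of a soft mask inside a box.

    image: 2D list of floats (one channel of an image or feature map)
    mask:  2D list of floats in [0, 1] (soft foreground membership)
    box:   (x0, y0, x1, y1) half-open pixel bounds (assumed convention)
    """
    x0, y0, x1, y1 = box
    eps = 1e-8  # avoids division by zero when a region is empty
    pixels = [(image[y][x], mask[y][x])
              for y in range(y0, y1) for x in range(x0, x1)]

    fg_weight = sum(m for _, m in pixels)
    bg_weight = sum(1.0 - m for _, m in pixels)
    # Mean intensity inside and outside the current soft mask.
    c_in = sum(v * m for v, m in pixels) / (fg_weight + eps)
    c_out = sum(v * (1.0 - m) for v, m in pixels) / (bg_weight + eps)

    # Each pixel is penalized by its squared deviation from the mean of the
    # region it is assigned to, weighted by its soft membership.
    return sum(m * (v - c_in) ** 2 + (1.0 - m) * (v - c_out) ** 2
               for v, m in pixels)


# A bright 2x2 square on a dark background.
image = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 1.0, 0.0],
         [0.0, 1.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
good_mask = [row[:] for row in image]                   # matches the square
bad_mask = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]     # covers the left half
full_box = (0, 0, 4, 4)
good_energy = chan_vese_energy(image, good_mask, full_box)
bad_energy = chan_vese_energy(image, bad_mask, full_box)
```

The mask that separates the two homogeneous regions gets a lower energy than the misaligned one, which is the property a gradient-based optimizer would exploit when the energy is written with differentiable tensor operations.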
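Semantic segmentation of driving-scene images is vital for autonomous driving. Although encouraging performance has been achieved on daytime images, performance on nighttime images is less satisfactory because of insufficient exposure and the lack of labeled data. To address these issues, we present an add-on module called dual image-adaptive learnable filters (DIAL-Filters) to improve semantic segmentation in nighttime driving conditions, aiming to exploit the intrinsic features of driving-scene images under different illuminations. DIAL-Filters consist of two parts: an image-adaptive processing module (IAPM) and a learnable guided filter (LGF). With DIAL-Filters, we design both unsupervised and supervised frameworks for nighttime driving-scene segmentation, which can be trained in an end-to-end manner. Specifically, the IAPM consists of a small convolutional neural network with a set of differentiable image filters, so that each image can be adaptively enhanced for better segmentation under different illuminations. The LGF is used to enhance the output of the segmentation network to obtain the final segmentation result. DIAL-Filters are lightweight and efficient, and they can be readily applied to both daytime and nighttime images. Our experiments show that DIAL-Filters significantly improve supervised segmentation performance on the ACDC_Night and NightCity datasets, and achieve state-of-the-art performance for unsupervised nighttime semantic segmentation on the Dark Zurich and Nighttime Driving testbeds.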
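This paper addresses the problem of unsupervised parts-aware point cloud generation with learned parts-based self-similarity. Our SPA-VAE infers a set of latent canonical candidate shapes for any given object, along with a set of rigid body transformations that map each such candidate shape to one or more locations within the assembled object. In this way, noisy samples on the surface of, say, each leg of a table are effectively combined to estimate a single leg prototype. When parts-based self-similarity exists in the raw data, sharing data among parts in this way confers numerous advantages: modeling accuracy, appropriately self-similar generative outputs, precise in-filling of occlusions, and model parsimony. SPA-VAE is trained end-to-end using a variational Bayesian approach that uses the Gumbel-softmax trick for the shared part assignments, along with various novel losses that provide appropriate inductive biases. Quantitative and qualitative analyses on ShapeNet demonstrate the advantages of SPA-VAE.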
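The Gumbel-softmax trick mentioned above can be sketched in a few lines. This is a generic, standard-library illustration of the trick for a single categorical variable, not SPA-VAE's code; the function name, the `temperature` default, and the example logits are assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample for a categorical variable."""
    gumbels = []
    for _ in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise by inverse transform sampling.
        gumbels.append(-math.log(-math.log(u)))

    scores = [(logit + g) / temperature for logit, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for assigning one part to one of three shared prototypes.
rng = random.Random(0)
logits = [2.0, 0.5, -1.0]
sample = gumbel_softmax(logits, temperature=0.5, rng=rng)

# The argmax of a sample follows softmax(logits), independent of temperature.
draws = [gumbel_softmax(logits, temperature=0.5, rng=rng) for _ in range(2000)]
freq_first = sum(1 for d in draws if d.index(max(d)) == 0) / len(draws)
```

Lower temperatures push samples toward one-hot vectors while keeping the sample a smooth function of the logits, which is what lets discrete assignments be trained with backpropagation in an autodiff framework.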
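University evaluation and ranking is an extremely complex activity. Major universities are struggling with the increasingly complex indicator systems of world university rankings. Can we find meta-indicators of these indicator systems by simplifying the complexity? This study identifies three meta-indicators based on interpretable machine learning. The first is time: be a friend of time, believe in the power of time, and accumulate historical heritage. The second is space: be a friend of the city and grow through joint development. The third is relationships: be a friend of alumni and strive for more alumni donations without a ceiling.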
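Recent progress in stochastic motion prediction, i.e., predicting multiple possible future human motions given a single past pose sequence, has led to producing truly diverse future motions and even to providing control over the motion of some body parts. However, to achieve this, the state-of-the-art method requires learning several mappings for diversity and a dedicated model for controllable motion prediction. In this paper, we introduce a unified deep generative network for both diverse and controllable motion prediction. To this end, we leverage the intuition that realistic human motions consist of smooth sequences of valid poses and that, given limited data, learning a pose prior is much more tractable than learning a motion prior. We therefore design a generator that predicts the motion of different body parts sequentially, and we introduce a normalizing-flow-based pose prior, together with a joint angle loss, to achieve motion realism. Our experiments on two standard benchmark datasets, Human3.6M and HumanEva-I, demonstrate that our approach outperforms the state-of-the-art baselines in terms of both sample diversity and accuracy. The code is available at https://github.com/wei-mao-2019/gsps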
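We tackle the problem of object-centric learning on point clouds, which is crucial for high-level relational reasoning and scalable machine intelligence. In particular, we introduce a framework, SPAIR3D, that factorizes a 3D point cloud into a spatial mixture model in which each component corresponds to one object. To model the spatial mixture model on point clouds, we derive the Chamfer Mixture Loss, which fits naturally into our variational training pipeline. Moreover, we adopt an object-specification scheme that describes each object's location relative to its local voxel grid cell. Such a scheme allows SPAIR3D to model scenes with an arbitrary number of objects. We evaluate our method on the task of unsupervised scene decomposition. Experimental results demonstrate that SPAIR3D scales well and is capable of detecting and segmenting an unknown number of objects from a point cloud in an unsupervised manner.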
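For orientation, the plain symmetric Chamfer distance between two point sets can be written as below. This is only the standard building block, using squared Euclidean distances and a brute-force nearest-neighbor search; it is not the Chamfer Mixture Loss derived in the paper, whose exact form the abstract does not give. The function name and the example points are assumptions made for illustration.

```python
def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Each point is a tuple of coordinates. For every point in one set, the
    squared distance to its nearest neighbor in the other set is averaged,
    and the two directed averages are summed.
    """
    def squared_distance(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def directed(source, target):
        return sum(min(squared_distance(p, q) for q in target)
                   for p in source) / len(source)

    return directed(points_a, points_b) + directed(points_b, points_a)


# Two small 3D point sets; the second has one extra point at distance 1.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
distance_ab = chamfer_distance(cloud_a, cloud_b)
```

The brute-force search costs O(|A| |B|) distance evaluations, which is fine for a sketch but would normally be replaced by a batched tensor computation or a spatial index for real point clouds.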
Automated identification of myocardial scar from late gadolinium enhancement cardiac magnetic resonance (LGE-CMR) images is limited by image noise and by artifacts such as those related to motion and the partial volume effect. This paper presents a novel joint deep learning (JDL) framework that improves scar identification by using simultaneously learned myocardium segmentations to eliminate negative effects from areas outside the region of interest. In contrast to previous approaches that treat scar detection and myocardium segmentation as separate or parallel tasks, our proposed method introduces a message passing module in which myocardium segmentation information is passed directly to guide the scar detectors. This newly designed network efficiently exploits joint information from the two related tasks and uses all available sources of myocardium segmentation to benefit scar identification. We demonstrate the effectiveness of JDL on LGE-CMR images for automated left ventricular (LV) scar detection, which has great potential to improve risk prediction in patients with both ischemic and non-ischemic heart disease and to improve response rates to cardiac resynchronization therapy (CRT) for heart failure patients. Experimental results show that our proposed approach outperforms multiple state-of-the-art methods, including commonly used two-step segmentation-classification networks and multitask learning schemes in which subtasks interact only indirectly.
The selection of an optimal pacing site, which is ideally scar-free and late activated, is critical to the response to cardiac resynchronization therapy (CRT). Despite the success of current approaches that formulate the detection of such late mechanical activation (LMA) regions as an activation time regression problem, their accuracy remains unsatisfactory, particularly in cases where myocardial scar exists. To address this issue, this paper introduces a multi-task deep learning framework that simultaneously estimates the LMA amount and classifies the scar-free LMA regions based on cine displacement encoding with stimulated echoes (DENSE) magnetic resonance imaging (MRI). With a newly introduced auxiliary LMA region classification sub-network, our proposed model is more robust to the complex patterns caused by myocardial scar, largely eliminates their negative effects on LMA detection, and in turn improves the performance of scar classification. To evaluate the effectiveness of our method, we test our model on real cardiac MR images and compare the predicted LMA with that of state-of-the-art approaches. The results show that our approach achieves substantially increased accuracy. In addition, we employ gradient-weighted class activation mapping (Grad-CAM) to visualize the feature maps learned by all methods. Experimental results suggest that our proposed model better recognizes the LMA region pattern.