This paper presents SaleNet, an end-to-end convolutional neural network (CNN) for sustained attention level evaluation using prefrontal electroencephalography (EEG). A bias-driven pruning method is proposed and combined with group convolution, global average pooling (GAP), near-zero pruning, weight clustering, and quantization for model compression, achieving a total compression ratio of 183.11x. The compressed SaleNet obtains a state-of-the-art subject-independent sustained attention classification accuracy of 84.2% on a recorded 6-subject EEG database. SaleNet is implemented on an Artix-7 FPGA with a competitive power consumption of 0.11 W and an energy efficiency of 8.19 GOPS/W.
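As a rough illustration of the weight-clustering and quantization steps this abstract names, the sketch below is a toy stand-in, not the paper's pipeline: `cluster_weights` and `compression_ratio` are hypothetical helpers showing how snapping weights to a 16-entry shared codebook with 4-bit indices alone already yields roughly an 8x storage reduction, before the other techniques are stacked on top.

```python
import numpy as np

def cluster_weights(w, n_clusters=16, iters=20):
    """Simple 1-D k-means weight clustering: every weight is snapped to
    the nearest of n_clusters shared centroids."""
    flat = w.ravel()
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(iters):
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            members = flat[idx == k]
            if members.size:
                centroids[k] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[idx].reshape(w.shape), idx

def compression_ratio(w, n_clusters=16, index_bits=4, weight_bits=32):
    """Storage ratio: dense float32 weights vs. per-weight cluster
    indices plus a small shared codebook."""
    original = w.size * weight_bits
    compressed = w.size * index_bits + n_clusters * weight_bits
    return original / compressed

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
wq, idx = cluster_weights(w, n_clusters=16)
print(round(compression_ratio(w), 2))  # 7.76
```

The paper's much larger 183.11x ratio comes from combining this kind of clustering with pruning, group convolution, GAP, and quantization.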
Cardiovascular disease, a leading cause of death worldwide, is an age-related condition. Understanding the morphological and functional changes of the heart during ageing is a key scientific question, the answer to which will help us define important risk factors of cardiovascular disease and monitor disease progression. In this work, we propose a novel conditional generative model to describe the changes of 3D cardiac anatomy during the ageing process. The proposed model is flexible and can integrate multiple clinical factors (e.g. age, sex) into the generating process. We train the model on a large-scale cross-sectional dataset of cardiac anatomies and evaluate it on both cross-sectional and longitudinal datasets. The model demonstrates excellent performance in predicting the longitudinal evolution of the ageing heart and modelling its data distribution.
Probabilistic calibration of deep models is highly desirable in safety-critical applications such as medical imaging. It makes the output probabilities of a deep network interpretable by aligning prediction probabilities with the actual accuracy on test data. In image segmentation, well-calibrated probabilities allow radiologists to identify regions where the model's predicted segmentation is unreliable. Such unreliable predictions often stem from out-of-distribution (OOD) images caused by imaging artefacts or unseen imaging protocols. Unfortunately, most previous calibration methods for image segmentation perform poorly on OOD images. To reduce calibration error when confronted with OOD images, we propose a novel post-hoc calibration model. Our model leverages pixel susceptibility against perturbations at the local level and shape prior information at the global level. The model is tested on cardiac MRI segmentation datasets containing unseen imaging artefacts and images from unseen imaging protocols. We demonstrate reduced calibration errors compared with state-of-the-art calibration algorithms.
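The miscalibration this abstract targets is usually measured by the standard expected calibration error (ECE). The sketch below implements that common metric for a binary classifier, not the paper's own calibration model; `expected_calibration_error` is an illustrative name of my own.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Standard ECE: bin predictions by confidence, then average the
    |accuracy - confidence| gap weighted by each bin's population."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = np.maximum(probs, 1.0 - probs)        # binary-case confidence
    pred = (probs >= 0.5).astype(int)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            acc = (pred[mask] == labels[mask]).mean()
            ece += mask.mean() * abs(acc - conf[mask].mean())
    return ece

# Fully confident and correct predictions have zero calibration error.
print(expected_calibration_error([1.0, 0.0, 1.0], [1, 0, 1]))  # 0.0
```

A "well-calibrated" segmentation model in the abstract's sense is one whose per-pixel probabilities keep this gap small even on OOD inputs.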
Myocardial motion and deformation are rich descriptors for characterising cardiac function. Image registration, the most commonly used technique for myocardial motion tracking, is an ill-posed inverse problem that often requires prior assumptions about the solution space. In contrast to most existing approaches, which impose explicit generic regularisation such as smoothness, in this work we propose a novel method that implicitly learns application-specific biomechanical knowledge and embeds it into a neural-network-parameterised transformation model. In particular, the proposed method leverages a variational-autoencoder-based generative model to learn a manifold of biomechanically plausible deformations. The optimal transformation can then be found by traversing the learnt manifold, taking sequence information into account during the search. The method is validated on three public cardiac cine MRI datasets with comprehensive evaluations. The results demonstrate that the proposed method outperforms the other approaches, achieving higher motion tracking accuracy with reasonable volume preservation and better generalisability to varying data distributions. It also enables better estimation of myocardial strain, indicating the method's potential for characterising spatio-temporal features to understand cardiovascular disease.
Deep learning (DL) models have provided state-of-the-art performance on various medical imaging benchmark challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g. tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way towards clinical translation. Recently, a number of uncertainty estimation methods have been introduced for DL medical image segmentation tasks. Developing metrics to evaluate and compare the performance of uncertainty measures will assist end-users in making more informed decisions. In this study, we explore and evaluate a metric developed during the BraTS 2019-2020 task on uncertainty quantification (QU-BraTS), designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This metric (1) rewards uncertainty estimates that produce high confidence in correct assertions and assign low confidence to incorrect assertions, and (2) penalises uncertainty measures that lead to a higher percentage of correct assertions being filtered out. We further benchmark the segmentation uncertainties generated by 14 independent teams participating in QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, thereby highlighting the need for uncertainty quantification in medical image analysis. Our evaluation code is made publicly available at https://github.com/RagMeh11/QU-BraTS.
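The core idea of such an uncertainty-aware score can be illustrated with a simplified filtered-Dice computation: at a threshold tau, keep only voxels the model is confident about and score the remainder. This is my own toy illustration of the principle, not the official QU-BraTS metric; `filtered_dice` and `filtered_tn_ratio` are hypothetical names.

```python
import numpy as np

def filtered_dice(pred, gt, uncertainty, tau):
    """Dice computed only on voxels with uncertainty <= tau. A good
    uncertainty map filters out wrong voxels first, so Dice should rise
    (or hold) as tau decreases."""
    keep = uncertainty <= tau
    p, g = pred[keep].astype(bool), gt[keep].astype(bool)
    denom = p.sum() + g.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(p, g).sum() / denom

def filtered_tn_ratio(pred, gt, uncertainty, tau):
    """Fraction of true negatives discarded at tau; a benchmark in this
    spirit penalises maps that throw away many correct assertions."""
    tn = np.logical_and(pred == 0, gt == 0)
    removed = np.logical_and(tn, uncertainty > tau)
    return removed.sum() / max(tn.sum(), 1)

pred = np.array([1, 1, 0, 0])
gt   = np.array([1, 0, 0, 0])
unc  = np.array([0.1, 0.9, 0.1, 0.1])  # high uncertainty on the one error
print(filtered_dice(pred, gt, unc, tau=1.0))  # keeps all voxels: 2/3
print(filtered_dice(pred, gt, unc, tau=0.5))  # wrong voxel filtered: 1.0
```

Here the uncertainty map is "good": lowering tau removes only the erroneous voxel, so the filtered Dice improves without discarding correct assertions.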
The success of neural networks on medical image segmentation tasks typically relies on large labelled datasets for model training. However, acquiring and manually labelling large medical image sets is resource-intensive, expensive, and sometimes impractical due to data-sharing and privacy concerns. To address this challenge, we propose AdvChain, a generic adversarial data augmentation framework aimed at improving the diversity and effectiveness of training data for medical image segmentation tasks. AdvChain augments data through dynamic data augmentation, generating randomly chained photometric and geometric transformations that resemble realistic yet challenging imaging variations to expand the training data. By jointly optimising the data augmentation model and the segmentation network during training, challenging examples are generated to enhance network generalisability for the downstream task. The proposed adversarial data augmentation does not rely on generative networks and can be used as a plug-in module in general segmentation networks. It is computationally efficient and applicable to both low-shot supervised and semi-supervised learning. We analyse and evaluate the method on two MR image segmentation tasks, cardiac segmentation and prostate segmentation, with limited labelled data. The results show that the proposed approach can alleviate the need for labelled data while improving model generalisation ability, indicating its practical value in medical imaging applications.
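To make "randomly chained photometric and geometric transformations" concrete, here is a minimal random (not adversarially optimised) chain; the real AdvChain optimises the transformation parameters jointly with the segmentation network, whereas `random_photometric`, `random_geometric`, and `augmentation_chain` below are hypothetical helpers of my own.

```python
import numpy as np

def random_photometric(img, rng, max_gamma=1.5):
    """Random gamma perturbation: one plausible 'photometric' link."""
    gamma = rng.uniform(1.0 / max_gamma, max_gamma)
    lo, hi = img.min(), img.max()
    norm = (img - lo) / (hi - lo + 1e-8)
    return norm ** gamma * (hi - lo) + lo

def random_geometric(img, rng, max_shift=3):
    """Random integer translation as a stand-in geometric transform."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def augmentation_chain(img, rng, n_links=3):
    """Randomly chain photometric and geometric transforms."""
    ops = [random_photometric, random_geometric]
    for _ in range(n_links):
        img = ops[rng.integers(len(ops))](img, rng)
    return img

rng = np.random.default_rng(0)
img = rng.random((8, 8))
aug = augmentation_chain(img, rng)
print(aug.shape)  # (8, 8)
```

In the adversarial setting, the chain's parameters (here drawn at random) would instead be tuned to maximise the segmentation loss within realistic bounds.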
Dynamic Graph Neural Networks (DGNNs) have been broadly applied in various real-life applications, such as link prediction and pandemic forecasting, to capture both static structural information and temporal characteristics from dynamic graphs. Combining both time-dependent and -independent components, DGNNs manifest substantial parallel-computation and data-reuse potential, but suffer from severe memory-access inefficiency and data-transfer overhead under the canonical one-graph-at-a-time training pattern. To tackle these challenges, we propose PiPAD, a $\underline{\textbf{Pi}}pelined$ and $\underline{\textbf{PA}}rallel$ $\underline{\textbf{D}}GNN$ training framework for end-to-end performance optimization on GPUs. At both the algorithm and runtime levels, PiPAD holistically reconstructs the overall training paradigm, from data organization to computation manner. Capable of processing multiple graph snapshots in parallel, PiPAD eliminates unnecessary data transmission and alleviates memory-access inefficiency to improve overall performance. Our evaluation across various datasets shows that PiPAD achieves $1.22\times$-$9.57\times$ speedup over state-of-the-art DGNN frameworks on three representative models.
We propose a novel teacher-student model for semi-supervised multi-organ segmentation. In the teacher-student model, data augmentation is usually adopted on unlabeled data to regularize consistent training between teacher and student. We start from the key observation that the fixed relative locations and variable sizes of different organs provide distribution information about where a multi-organ CT scan is drawn from. Thus, we treat the prior anatomy as a strong tool to guide the data augmentation and reduce the mismatch between labeled and unlabeled images for semi-supervised learning. More specifically, we propose a data augmentation strategy based on partition-and-recovery of N$^3$ cubes, applied both across and within labeled and unlabeled images. Our strategy encourages unlabeled images to learn organ semantics in relative locations from the labeled images (cross-branch) and enhances the learning ability for small organs (within-branch). For the within-branch, we further propose to refine the quality of pseudo labels by blending the learned representations from small cubes to incorporate local attributes. Our method is termed MagicNet, since it treats the CT volume as a magic cube and the $N^3$-cube partition-and-recovery process matches the rule of playing a magic cube. Extensive experiments on two public CT multi-organ datasets demonstrate the effectiveness of MagicNet, which noticeably outperforms state-of-the-art semi-supervised medical image segmentation approaches, with a +7% DSC improvement on the MACT dataset with 10% labeled images.
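The $N^3$ partition-and-recovery operation itself is straightforward to sketch. The helpers below (`partition_cubes` and `recover_volume`, my own illustrative names, assuming dimensions divisible by N) split a CT volume into N^3 sub-cubes in a fixed scan order and reassemble it exactly; MagicNet's contribution lies in how the cubes are shuffled across labeled and unlabeled images before recovery.

```python
import numpy as np

def partition_cubes(vol, n):
    """Split an (H, W, D) volume into n^3 equal sub-cubes, listed in
    i-major, then j, then k scan order."""
    h, w, d = (s // n for s in vol.shape)
    return [vol[i*h:(i+1)*h, j*w:(j+1)*w, k*d:(k+1)*d]
            for i in range(n) for j in range(n) for k in range(n)]

def recover_volume(cubes, n):
    """Invert partition_cubes: reassemble n^3 cubes into the volume."""
    it = iter(cubes)
    rows = []
    for _ in range(n):
        planes = [np.concatenate([next(it) for _ in range(n)], axis=2)
                  for _ in range(n)]
        rows.append(np.concatenate(planes, axis=1))
    return np.concatenate(rows, axis=0)

vol = np.arange(4 ** 3).reshape(4, 4, 4)
cubes = partition_cubes(vol, 2)
assert np.array_equal(recover_volume(cubes, 2), vol)  # round-trip is exact
```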
The task of referring video object segmentation (RVOS) aims to segment the object in the frames of a given video to which a referring expression refers. Previous methods adopt multi-stage approaches and design complex pipelines to obtain promising results. Recently, end-to-end methods based on Transformers have proved their superiority. In this work, we draw on the advantages of the above methods to provide a simple and effective pipeline for RVOS. First, we improve the state-of-the-art one-stage method ReferFormer to obtain mask sequences that are strongly correlated with the language descriptions. Second, based on a reliable and high-quality keyframe, we leverage the superior performance of a video object segmentation model to further enhance the quality and temporal consistency of the mask results. Our single model reaches 70.3 J&F on the Referring Youtube-VOS validation set and 63.0 on the test set. After ensembling, we achieve 64.1 on the final leaderboard, ranking 1st place in the CVPR 2022 Referring Youtube-VOS challenge. Code will be available at https://github.com/Zhiweihhh/cvpr2022-rvos-challenge.git.
Referring image segmentation aims to segment the target object described by a given natural language expression. Typically, referring expressions contain complex relationships between the target and its surrounding objects. The main challenge of this task is to understand the visual and linguistic content simultaneously and to find the referred object accurately among all instances in the image. Currently, the most effective way to solve the above problem is to obtain aligned multi-modal features by computing the correlation between visual and linguistic feature modalities under the supervision of the ground-truth mask. However, existing paradigms have difficulty in thoroughly understanding visual and linguistic content due to the inability to directly perceive information about the surrounding objects related to the target. This prevents them from learning aligned multi-modal features, which leads to inaccurate segmentation. To address this issue, we present a position-aware contrastive alignment network (PCAN) to enhance the alignment of multi-modal features by guiding the interaction between vision and language through prior position information. Our PCAN consists of two modules: 1) Position Aware Module (PAM), which provides position information of all objects related to natural language descriptions, and 2) Contrastive Language Understanding Module (CLUM), which enhances multi-modal alignment by comparing the features of the referred object with those of related objects. Extensive experiments on three benchmarks demonstrate that our PCAN performs favorably against the state-of-the-art methods. Our code will be made publicly available.
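The "comparing the referred object's features with those of related objects" idea in CLUM can be sketched as a generic InfoNCE-style contrastive loss. The snippet below is my own minimal illustration of that contrastive principle, not the paper's exact formulation; `contrastive_alignment_loss` is a hypothetical name.

```python
import numpy as np

def contrastive_alignment_loss(lang, target, related, temperature=0.1):
    """InfoNCE-style loss: pull the language feature toward the referred
    object's visual feature while pushing it away from the features of
    related (distractor) objects."""
    def unit(x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)
    lang, target, related = unit(lang), unit(target), unit(related)
    pos = lang @ target / temperature      # similarity to the referred object
    negs = related @ lang / temperature    # similarities to distractors
    logits = np.concatenate(([pos], negs))
    m = logits.max()                       # stabilised log-sum-exp
    return float(-pos + m + np.log(np.exp(logits - m).sum()))

lang = np.array([1.0, 0.0])
target = np.array([1.0, 0.0])          # aligned with the expression
distractors = np.array([[0.0, 1.0]])   # orthogonal related object
print(contrastive_alignment_loss(lang, target, distractors))  # near 0
```

Minimising this loss encourages exactly the multi-modal alignment the abstract describes: the language embedding ends up closer to the referred object than to any related distractor.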