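U-Net has been the go-to architecture for medical image segmentation tasks, but computational challenges arise when the U-Net architecture is extended to 3D images. We propose the Implicit U-Net architecture, which adapts the efficient implicit representation paradigm to supervised image segmentation tasks. By combining a convolutional feature extractor with an implicit localization network, our Implicit U-Net has 40% fewer parameters than the equivalent U-Net. Moreover, we propose training and inference procedures that capitalize on sparse predictions. Compared with an equivalent fully convolutional U-Net, Implicit U-Net reduces inference time, training time, and training memory footprint by approximately 30%, while achieving comparable results in our experiments on two different abdominal CT scan datasets.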
translated by 谷歌翻译
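Deep learning models for medical imaging are often large and complex, requiring specialized hardware to train and evaluate. To address this issue, we propose the PocketNet paradigm, which reduces the size of deep learning models by limiting the growth in the number of channels in convolutional neural networks. We demonstrate that, for a range of segmentation and classification tasks, PocketNet architectures produce results comparable to those of conventional neural networks while reducing the number of parameters by multiple orders of magnitude, using up to 90% less GPU memory, and speeding up training by up to 40%, thereby allowing such models to be trained and deployed in resource-constrained settings.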
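To make the effect of channel growth concrete, the following is a minimal sketch that counts the parameters of a plain 3D convolutional encoder under two channel schedules: the usual doubling at every resolution level, and a constant width. The layer layout (two 3x3x3 convolutions per level, five levels, 32 base channels) and the helper names `conv3d_params` and `encoder_params` are illustrative assumptions, not the configuration used in the paper.

```python
def conv3d_params(c_in, c_out, k=3):
    """Weights plus biases of one 3D convolution with a k*k*k kernel."""
    return c_in * c_out * k ** 3 + c_out


def encoder_params(channels, in_channels=1, convs_per_level=2):
    """Total parameters of a plain encoder; channels[i] is the width at level i.

    Hypothetical layout: `convs_per_level` convolutions at each level,
    the first of which changes the channel count.
    """
    total, c_prev = 0, in_channels
    for c in channels:
        for _ in range(convs_per_level):
            total += conv3d_params(c_prev, c)
            c_prev = c
    return total


levels, base = 5, 32                                   # assumed sizes
doubling = [base * 2 ** i for i in range(levels)]      # 32, 64, 128, 256, 512
constant = [base] * levels                             # 32 at every level

p_doubling = encoder_params(doubling)
p_constant = encoder_params(constant)
print("doubling channels:", p_doubling)
print("constant channels:", p_constant)
print("ratio:", round(p_doubling / p_constant, 1))
```

Because the cost of a convolution scales with the product of its input and output channels, doubling the width at each level makes the deepest layers dominate the parameter count; holding the width fixed removes that growth.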
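Kidney structure segmentation is a crucial yet challenging task in the computer-aided diagnosis of surgery-based renal cancer. Although numerous deep learning models have achieved remarkable success in many medical image segmentation tasks, accurate segmentation of kidney structures on computed tomography angiography (CTA) images remains challenging because of the variable sizes of kidney tumors and the ambiguous boundaries between kidney tumors and their surroundings. In this paper, we propose a boundary-aware network (BA-Net) to segment kidneys, kidney tumors, arteries, and veins on CTA scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. A multi-scale deep supervision strategy is adopted on both decoders, which alleviates the issues caused by variable tumor sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Kidney Parsing (KiPA) Challenge dataset and achieved an average Dice score of 89.65% for kidney structure segmentation on CTA scans using 4-fold cross-validation. The results demonstrate the effectiveness of BA-Net.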
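As a rough illustration of using a boundary probability map as attention, the sketch below rescales segmentation features voxel by voxel. The residual form `f * (1 + p)` and the function name `boundary_attention` are assumptions made for this example; the abstract does not specify how the attention is applied.

```python
def boundary_attention(features, boundary_prob):
    """Rescale each feature value by (1 + p), where p is the boundary
    probability at the same voxel (assumed residual attention form).

    Both inputs are flat lists covering the same voxels at one scale.
    """
    if len(features) != len(boundary_prob):
        raise ValueError("features and boundary_prob must have equal length")
    if any(p < 0.0 or p > 1.0 for p in boundary_prob):
        raise ValueError("boundary probabilities must lie in [0, 1]")
    return [f * (1.0 + p) for f, p in zip(features, boundary_prob)]


# Features near a predicted boundary (p close to 1) are amplified,
# features far from any boundary (p close to 0) pass through unchanged.
enhanced = boundary_attention([0.5, 0.5, 0.5], [0.0, 0.5, 1.0])
print(enhanced)  # [0.5, 0.75, 1.0]
```

The residual form keeps the original features intact where the boundary decoder is silent, so a poor boundary prediction cannot zero out the segmentation features.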
Fully Convolutional Neural Networks (FCNNs) with contracting and expanding paths have been prominent in the majority of medical image segmentation applications over the past decade. In FCNNs, the encoder plays an integral role by learning both global and local features and contextual representations, which the decoder can use for semantic output prediction. Despite their success, the locality of convolutional layers in FCNNs limits their ability to learn long-range spatial dependencies. Inspired by the recent success of transformers for Natural Language Processing (NLP) in long-range sequence learning, we reformulate the task of volumetric (3D) medical image segmentation as a sequence-to-sequence prediction problem. We introduce a novel architecture, dubbed UNEt TRansformers (UNETR), that utilizes a transformer as the encoder to learn sequence representations of the input volume and effectively capture global multi-scale information, while also following the successful "U-shaped" network design for the encoder and decoder. The transformer encoder is directly connected to a decoder via skip connections at different resolutions to compute the final semantic segmentation output. We have validated the performance of our method on the Multi Atlas Labeling Beyond The Cranial Vault (BTCV) dataset for multi-organ segmentation and the Medical Segmentation Decathlon (MSD) dataset for brain tumor and spleen segmentation tasks. Our benchmarks demonstrate new state-of-the-art performance on the BTCV leaderboard. Code: https://monai.io/research/unetr
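The sequence-to-sequence reformulation can be pictured as cutting the volume into non-overlapping cubic patches and flattening each patch into one token. The sketch below only computes the resulting sequence shape; the volume size, patch size, and the helper name `patch_sequence_shape` are illustrative assumptions rather than values taken from the paper.

```python
def patch_sequence_shape(volume_shape, patch_size, channels=1):
    """Return (number of tokens, values per token) for a 3D volume split
    into non-overlapping cubic patches of side `patch_size`."""
    if any(s % patch_size != 0 for s in volume_shape):
        raise ValueError("each volume dimension must be divisible by patch_size")
    h, w, d = volume_shape
    n_tokens = (h // patch_size) * (w // patch_size) * (d // patch_size)
    token_dim = patch_size ** 3 * channels
    return n_tokens, token_dim


# Assumed example: a single-channel 96 x 96 x 96 crop with 16^3 patches.
print(patch_sequence_shape((96, 96, 96), 16))  # (216, 4096)
```

Each flattened token would then typically be mapped by a learned linear projection to the transformer's embedding width before entering the encoder.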
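Automatic segmentation of abdominal organs in computed tomography (CT) images can support radiation therapy and image-guided surgery workflows. Developing such automatic solutions remains challenging, mainly because of complex organ interactions and blurry boundaries in CT images. To address these issues, we focus on effective spatial context modeling and explicit edge segmentation priors. Accordingly, we propose a 3D network with four main components trained end-to-end: a shared encoder, an edge detector, a decoder with edge skip-connections (ESCs), and a recurrent feature propagation head (RFP-Head). To capture wide-range spatial dependencies, the RFP-Head propagates and harvests local features through directed acyclic graphs (DAGs) formulated in an efficient slice-wise manner with respect to the spatial arrangement of image units. To leverage edge information, the edge detector learns edge prior knowledge specifically tuned for semantic segmentation by exploiting edge supervision. The ESCs then aggregate this edge knowledge with multi-level decoder features to learn a hierarchy of discriminative features that explicitly models the complementarity between organ interiors and edges for segmentation. We conduct extensive experiments on two challenging abdominal CT datasets with eight annotated organs. Experimental results show that the proposed network outperforms several state-of-the-art models, especially for the segmentation of small and complicated structures (gallbladder, esophagus, stomach, pancreas, and duodenum). The code will be made publicly available.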
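Automated abdominal multi-organ segmentation is a crucial yet challenging task in the computer-aided diagnosis of abdominal organ-related diseases. Although numerous deep learning models have achieved remarkable success in many medical image segmentation tasks, accurate segmentation of abdominal organs remains challenging because of the varying sizes of abdominal organs and the ambiguous boundaries among them. In this paper, we propose a boundary-aware network (BA-Net) to segment abdominal organs on CT scans and MRI scans. The model contains a shared encoder, a boundary decoder, and a segmentation decoder. A multi-scale deep supervision strategy is adopted on both decoders, which alleviates the issues caused by variable organ sizes. The boundary probability maps produced by the boundary decoder at each scale are used as attention to enhance the segmentation feature maps. We evaluated BA-Net on the Abdominal Multi-Organ Segmentation (AMOS) Challenge dataset and achieved an average Dice score of 89.29% for multi-organ segmentation on CT scans and an average Dice score of 71.92% on MRI scans. The results demonstrate that BA-Net is superior to nnUNet on both segmentation tasks.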
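Lesion segmentation in medical imaging is an important topic in clinical research. Researchers have proposed various detection and segmentation algorithms to address this task. Recently, deep learning-based approaches have significantly improved performance over conventional methods. However, most state-of-the-art deep learning methods require the manual design of multiple network components and training strategies. In this paper, we propose a new automated machine learning algorithm, T-AutoML, which not only searches for the best neural architecture but also simultaneously finds the best combination of hyper-parameters and data augmentation strategies. The proposed method uses a modern transformer model, introduced to adapt to the dynamic length of the search space embedding, which can significantly improve the search capability. We validate T-AutoML on several large-scale public lesion segmentation datasets and achieve state-of-the-art performance.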
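We propose an optimized U-Net architecture for the brain tumor segmentation task in the BraTS21 challenge. To find the optimal model architecture and learning schedule, we ran an extensive ablation study to test deep supervision loss, focal loss, decoder attention, drop block, and residual connections. Additionally, we searched for the optimal depth of the U-Net encoder, the number of convolutional channels, and the post-processing strategy. Our method won the validation phase and took third place in the test phase. We have open-sourced the code to reproduce our BraTS21 submission in the NVIDIA Deep Learning Examples GitHub repository.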
Convolutional Neural Networks (CNNs) have been recently employed to solve problems from both the computer vision and medical image analysis fields. Despite their popularity, most approaches are only able to process 2D images while most medical data used in clinical practice consists of 3D volumes. In this work we propose an approach to 3D image segmentation based on a volumetric, fully convolutional, neural network. Our CNN is trained end-to-end on MRI volumes depicting prostate, and learns to predict segmentation for the whole volume at once. We introduce a novel objective function, that we optimise during training, based on Dice coefficient. In this way we can deal with situations where there is a strong imbalance between the number of foreground and background voxels. To cope with the limited number of annotated volumes available for training, we augment the data applying random non-linear transformations and histogram matching. We show in our experimental evaluation that our approach achieves good performances on challenging test data while requiring only a fraction of the processing time needed by other previous methods.
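To show why a Dice-based objective copes with foreground/background imbalance, here is a minimal sketch of a soft Dice coefficient over flattened voxel probabilities. It uses the variant with squared terms in the denominator; treat that exact form, the smoothing constant `eps`, and the function names as assumptions for illustration.

```python
def soft_dice(pred, target, eps=1e-7):
    """Soft Dice coefficient between predicted foreground probabilities
    and a binary ground-truth mask, both given as flat lists."""
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(p * p for p in pred) + sum(g * g for g in target)
    return (2.0 * intersection + eps) / (denominator + eps)


def dice_loss(pred, target):
    """Loss to minimize: 0 for a perfect overlap, close to 1 for none."""
    return 1.0 - soft_dice(pred, target)


# Strong imbalance: 1 foreground voxel among 1000.
target = [0.0] * 999 + [1.0]
all_background = [0.0] * 1000          # 99.9% voxel accuracy, misses the object
perfect = list(target)

print(dice_loss(all_background, target))  # close to 1
print(dice_loss(perfect, target))         # close to 0
```

A prediction that labels everything as background is correct on almost every voxel, yet its Dice loss stays near 1, because the score depends only on the overlap with the foreground.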
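Vision Transformers (ViTs) have shown strong performance in self-supervised learning of global and local representations that can be transferred to downstream applications. Inspired by these results, we introduce a novel self-supervised learning framework with tailored proxy tasks for medical image analysis. Specifically, we propose: (i) a new 3D transformer-based model, dubbed Swin UNEt TRansformers (Swin UNETR), with a hierarchical encoder for self-supervised pre-training; (ii) tailored proxy tasks for learning the underlying patterns of human anatomy. We demonstrate successful pre-training of the proposed model on 5,050 publicly available computed tomography (CT) images from various body organs. The effectiveness of our approach is validated by fine-tuning the pre-trained models on the Beyond the Cranial Vault (BTCV) Segmentation Challenge and on segmentation tasks from the Medical Segmentation Decathlon (MSD) dataset. Our model is currently the state of the art (i.e., ranked first) on the public test leaderboards of both the MSD and BTCV datasets. Code: https://monai.io/research/swin-unetr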
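We implemented two distinct three-dimensional deep learning neural networks and evaluated their ability to segment intracranial hemorrhage (ICH) seen on non-contrast computed tomography (CT). One model, referred to as "Voxels-Intersecting along Orthogonal Levels of Attention U-Net" (Viola-Unet), has architecture elements that are adaptable to the data challenge of INSTANCE 2022. The second, comparison model was derived from the no-new U-Net (nnU-Net). Input images and ground-truth segmentation maps were used to train the two networks separately in a supervised manner; validation data were subsequently used for semi-supervised training. Model predictions were compared during 5-fold cross-validation. Viola-Unet outperformed the comparison network on two of the four performance metrics (i.e., NSD and RVD). An ensemble model that combined the Viola-Unet and nnU-Net networks had the highest performance for DSC and HD. We demonstrate that ICH segmentation performance advantages can be achieved with a 3D U-Net that efficiently incorporates spatially orthogonal features in the decoding branch of the U-Net. The code base, pretrained weights, and Docker image of the Viola-Unet AI tool will be publicly available at https://github.com/samleoqh/viola-unet.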
Brain tumor imaging has been part of the clinical routine for many years to perform non-invasive detection and grading of tumors. Tumor segmentation is a crucial step for managing primary brain tumors because it allows a volumetric analysis to have a longitudinal follow-up of tumor growth or shrinkage to monitor disease progression and therapy response. In addition, it facilitates further quantitative analysis such as radiomics. Deep learning models, in particular CNNs, have been a methodology of choice in many applications of medical image analysis including brain tumor segmentation. In this study, we investigated the main design aspects of CNN models for the specific task of MRI-based brain tumor segmentation. Two commonly used CNN architectures (i.e. DeepMedic and U-Net) were used to evaluate the impact of the essential parameters such as learning rate, batch size, loss function, and optimizer. The performance of CNN models using different configurations was assessed with the BraTS 2018 dataset to determine the most performant model. Then, the generalization ability of the model was assessed using our in-house dataset. For all experiments, U-Net achieved a higher DSC compared to the DeepMedic. However, the difference was only statistically significant for whole tumor segmentation using FLAIR sequence data and tumor core segmentation using T1w sequence data. Adam and SGD both with the initial learning rate set to 0.001 provided the highest segmentation DSC when training the CNN model using U-Net and DeepMedic architectures, respectively. No significant difference was observed when using different normalization approaches. In terms of loss functions, a weighted combination of soft Dice and cross-entropy loss with the weighting term set to 0.5 resulted in an improved segmentation performance and training stability for both DeepMedic and U-Net models.
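The weighted combination of soft Dice and cross-entropy mentioned above can be sketched for the binary case as follows. The convex form `w * dice + (1 - w) * ce` with `w = 0.5`, the binary (rather than multi-class) setting, and all function names are assumptions for illustration; the abstract states only that the weighting term was set to 0.5.

```python
import math


def soft_dice_loss(pred, target, eps=1e-7):
    """1 - soft Dice over flat lists of probabilities and binary labels."""
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped away from 0 and 1."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total / len(pred)


def combined_loss(pred, target, w=0.5):
    """Assumed convex combination of the two terms with weight w on Dice."""
    return w * soft_dice_loss(pred, target) + (1.0 - w) * binary_cross_entropy(pred, target)


target = [1.0, 0.0, 1.0, 0.0]
print(combined_loss([0.9, 0.1, 0.8, 0.2], target))  # confident and correct: small
print(combined_loss([0.4, 0.6, 0.5, 0.5], target))  # uncertain: larger
```

The cross-entropy term gives a per-voxel gradient signal, while the Dice term directly targets region overlap, which is one commonly cited reason the two are combined.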
Owing to the success of transformer models, recent works study their applicability in 3D medical segmentation tasks. Within transformer models, the self-attention mechanism is one of the main building blocks and strives to capture long-range dependencies, in contrast to the local convolution-based design. However, the self-attention operation has quadratic complexity, which proves to be a computational bottleneck, especially in volumetric medical imaging, where the inputs are 3D with numerous slices. In this paper, we propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks and efficiency in terms of parameters and compute cost. The core of our design is a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features using a pair of inter-dependent branches based on spatial and channel attention. Our spatial attention formulation is efficient, having linear complexity with respect to the input sequence length. To enable communication between the spatial and channel-focused branches, we share the weights of the query and key mapping functions, which provides a complementary benefit (paired attention) while also reducing the overall network parameters. Our extensive evaluations on three benchmarks, Synapse, BTCV and ACDC, reveal the effectiveness of the proposed contributions in terms of both efficiency and accuracy. On the Synapse dataset, our UNETR++ sets a new state of the art with a Dice Similarity Score of 87.2%, while being significantly more efficient, with a reduction of over 71% in both parameters and FLOPs compared to the best existing method in the literature. Code: https://github.com/Amshaker/unetr_plus_plus.
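The gap between quadratic and linear attention cost can be made tangible by counting query-key score entries. In the sketch below, full self-attention compares every token with every other token, while the linear case compares each token with a fixed number of projected keys. The volume size, patch size, projected length of 64, and the helper name `attention_score_entries` are illustrative assumptions, not figures from the paper.

```python
def attention_score_entries(n_tokens, n_keys=None):
    """Number of query-key scores computed by one attention head.

    With n_keys=None every token attends to every token (quadratic in
    n_tokens); with a fixed n_keys the count grows linearly in n_tokens.
    """
    if n_keys is None:
        n_keys = n_tokens
    return n_tokens * n_keys


# Assumed example: a 128^3 volume tokenized with 4^3 patches gives 32^3 tokens.
n = (128 // 4) ** 3
print("tokens:", n)                                        # 32768
print("full attention:", attention_score_entries(n))       # 1073741824
print("projected keys:", attention_score_entries(n, 64))   # 2097152
```

Doubling the number of tokens quadruples the first count but only doubles the second, which is why fixed-length key projections are attractive for volumetric inputs.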
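Convolutional neural networks (CNNs) have achieved remarkable success in automatically segmenting organs or lesions on 3D medical images. Recently, vision transformer networks have exhibited exceptional performance in 2D image classification tasks. Compared with CNNs, transformer networks have the appealing advantage of extracting long-range features thanks to their self-attention algorithm. We therefore propose a CNN-Transformer combined model, called BiTr-Unet, with specific modifications for brain tumor segmentation on multi-modal MRI scans. Our BiTr-Unet achieves good performance on the BraTS2021 validation dataset, with median Dice scores of 0.9335, 0.9304, and 0.8899 and median Hausdorff distances of 2.8284, 2.2361, and 1.4142 for the whole tumor, tumor core, and enhancing tumor, respectively. On the BraTS2021 testing dataset, the corresponding results are 0.9257, 0.9350, and 0.8874 for Dice score and 3, 2.2361, and 1.4142 for Hausdorff distance. The code is publicly available at https://github.com/justatinydot/bitr-unet.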
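The reported Hausdorff distances 2.8284, 2.2361, and 1.4142 equal the square roots of 8, 5, and 2 to four decimals, which is consistent with Euclidean distances between integer voxel coordinates at unit spacing. The sketch below computes the plain (maximum) symmetric Hausdorff distance between two small point sets; challenge evaluations often use a percentile variant instead, so the choice of the plain form, the toy coordinates, and the function names are assumptions for illustration.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy voxel coordinates (hypothetical, unit spacing).
prediction = [(0, 0, 0)]
print(round(hausdorff(prediction, [(2, 2, 0)]), 4))  # 2.8284 = sqrt(8)
print(round(hausdorff(prediction, [(2, 1, 0)]), 4))  # 2.2361 = sqrt(5)
print(round(hausdorff(prediction, [(1, 1, 0)]), 4))  # 1.4142 = sqrt(2)
```

Unlike the Dice score, which measures volume overlap, the Hausdorff distance is driven by the worst boundary mismatch, so a single stray voxel can dominate it.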
Data scarcity is common in deep learning models for medical image segmentation. Previous works proposed multi-dataset learning, either simultaneously or via transfer learning, to expand training sets. However, medical image datasets have images and features of diverse sizes, and developing a model simultaneously for multiple datasets is challenging. This work proposes the Fabric Image Representation Encoding Network (FIRENet), a universal architecture for simultaneous multi-dataset segmentation and transfer learning involving arbitrary numbers of datasets. To handle images and features of different sizes, a 3D fabric module is used to encapsulate many multi-scale sub-architectures. An optimal combination of these sub-architectures can be implicitly learnt to best suit the target dataset(s). For diverse-scale feature extraction, a 3D extension of atrous spatial pyramid pooling (ASPP3D) is used in each fabric node for fine-grained coverage of rich-scale image features. In the first experiment, FIRENet performed 3D universal bone segmentation of multiple musculoskeletal datasets of the human knee, shoulder and hip joints and exhibited excellent simultaneous multi-dataset segmentation performance. When tested for transfer learning, FIRENet further exhibited excellent single-dataset performance (when pre-training on a prostate dataset), as well as significantly improved universal bone segmentation performance. The second experiment involves the simultaneous segmentation of the 10 Medical Segmentation Decathlon (MSD) challenge datasets. FIRENet demonstrated good multi-dataset segmentation results and inter-dataset adaptability across highly diverse image sizes. In both experiments, FIRENet streamlines multi-dataset learning with one unified network that requires no hyper-parameter tuning.
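In the last decade, convolutional neural networks (ConvNets) have dominated the field of medical image analysis. However, the performance of ConvNets may still be limited by their inability to model long-range spatial relations between voxels in an image. Numerous vision Transformers have recently been proposed to address this shortcoming of ConvNets, demonstrating state-of-the-art performance in many medical imaging applications. Transformers may be a strong candidate for image registration because their self-attention mechanism enables a more precise comprehension of the spatial correspondence between moving and fixed images. In this paper, we present TransMorph, a hybrid Transformer-ConvNet model for volumetric medical image registration. We also introduce three variants of TransMorph: two diffeomorphic variants that ensure topology-preserving deformations, and a Bayesian variant that produces a well-calibrated registration uncertainty estimate. The proposed models are extensively validated against a variety of existing registration methods and Transformer architectures using volumetric medical images from two applications: inter-patient brain MRI registration and phantom-to-CT registration. Qualitative and quantitative results demonstrate that TransMorph and its variants lead to a substantial performance improvement over the baseline methods, demonstrating the effectiveness of Transformers for medical image registration.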
Achieving accurate and automated tumor segmentation plays an important role in both clinical practice and radiomics research. Segmentation in medicine is still often performed manually by experts, which is a laborious, expensive and error-prone task. Manual annotation relies heavily on the experience and knowledge of these experts. In addition, there is considerable intra- and inter-observer variation. Therefore, it is of great significance to develop a method that can automatically segment tumor target regions. In this paper, we propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT), which combines the high sensitivity of PET and the precise anatomical information of CT. We design an improved spatial attention network (ISA-Net) to increase the accuracy of PET or CT in detecting tumors; it uses a multi-scale convolution operation to extract feature information and can highlight tumor region location information while suppressing non-tumor region location information. In addition, our network uses dual-channel inputs in the coding stage and fuses them in the decoding stage, which takes advantage of the differences and complementarities between PET and CT. We validated the proposed ISA-Net method on two clinical datasets, a soft tissue sarcoma (STS) dataset and a head and neck tumor (HECKTOR) dataset, and compared it with other attention methods for tumor segmentation. DSC scores of 0.8378 on the STS dataset and 0.8076 on the HECKTOR dataset show that the ISA-Net method achieves better segmentation performance and generalizes better. Conclusions: The method proposed in this paper performs tumor segmentation on multi-modal medical images and can effectively utilize the differences and complementarity of different modalities. The method can also be applied to other multi-modal or single-modal data with proper adjustment.
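Deep learning has been widely used for medical image segmentation, and a large number of papers have been published documenting its success in the field. In this paper, we present a comprehensive thematic survey of medical image segmentation using deep learning techniques. This paper makes two original contributions. First, whereas traditional surveys directly divide the literature on deep learning for medical image segmentation into many groups and describe each group in detail, we classify the currently popular literature according to a multi-level structure from coarse to fine. Second, this paper focuses on supervised and weakly supervised learning approaches and does not include unsupervised approaches, since they have been covered in many older surveys and are not currently popular. For supervised learning approaches, we analyze the literature in three respects: the selection of backbone networks, the design of network blocks, and the improvement of loss functions. For weakly supervised learning approaches, we investigate the literature according to data augmentation, transfer learning, and interactive segmentation. Compared with existing surveys, this survey classifies the literature very differently, makes it easier for readers to understand the relevant rationale, and will guide them toward appropriate improvements in medical image segmentation based on deep learning approaches.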
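Machine learning and computer vision techniques have grown rapidly in recent years owing to their automation, suitability, and ability to generate astounding results. In this paper, we therefore survey the key studies published between 2014 and 2022, showcasing the different machine learning algorithms researchers have used to segment the liver, hepatic tumors, and hepatic vasculature structures. We divide the surveyed studies by the tissue of interest (hepatic parenchyma, hepatic tumors, or hepatic vessels), highlighting the studies that tackle more than one task simultaneously. Additionally, the machine learning algorithms are classified as either supervised or unsupervised, and they are further partitioned when the amount of work that falls under a certain scheme is significant. Moreover, the different datasets and challenges found in the literature and on websites containing masks of the aforementioned tissues are thoroughly discussed, highlighting the organizers' original contributions and those of other researchers. The metrics used most heavily in the literature are also covered in our review, with emphasis on their relevance to the task at hand. Finally, we highlight critical challenges and future directions for innovative researchers to tackle, exposing gaps that need addressing, such as the scarcity of studies on the vessel segmentation challenge and why that gap needs to be dealt with sooner rather than later.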
Recently, implicit neural representations have gained popularity for learning-based 3D reconstruction. While demonstrating promising results, most implicit approaches are limited to the comparably simple geometry of single objects and do not scale to more complicated or large-scale scenes. The key limiting factor of implicit methods is their simple fully-connected network architecture, which does not allow for integrating local information in the observations or incorporating inductive biases such as translational equivariance. In this paper, we propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes. By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space. We investigate the effectiveness of the proposed representation by reconstructing complex geometry from noisy point clouds and low-resolution voxel representations. We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.