Quantifying the perceptual similarity of two images is a long-standing problem in low-level computer vision. The natural image domain commonly relies on supervised learning, e.g., a pre-trained VGG, to obtain a latent representation. However, due to domain shift, pre-trained models from the natural image domain might not apply to other image domains, such as medical imaging. Notably, in medical imaging, perceptual similarity is evaluated exclusively by specialists trained extensively in diverse medical fields. Thus, medical imaging remains devoid of task-specific, objective perceptual measures. This work answers the question: is it necessary to rely on supervised learning to obtain an effective representation that can measure perceptual similarity, or is self-supervision sufficient? To understand whether recent contrastive self-supervised representations (CSR) may come to the rescue, we start with natural images and systematically evaluate CSR as a metric across numerous contemporary architectures and tasks, comparing it with existing methods. We find that in the natural image domain, CSR performs on par with supervised representations as a metric on several perceptual tests, and in the medical domain, CSR better quantifies perceptual similarity with respect to the experts' ratings. We also demonstrate that CSR can significantly improve image quality in two image synthesis tasks. Finally, our extensive results suggest that perceptuality is an emergent property of CSR, which can be adapted to many image domains without requiring annotations.
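As a rough illustration of using a learned representation as a perceptual metric, the sketch below scores two images by the cosine distance between their feature embeddings. The encoder is not shown, and the embedding values, the function name `cosine_distance`, and the choice of cosine distance itself are assumptions made for this example; the abstract does not specify which distance the authors use.

```python
import math

def cosine_distance(feat_a, feat_b):
    """Return 1 - cosine similarity between two equal-length feature vectors."""
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(a * a for a in feat_a))
    norm_b = math.sqrt(sum(b * b for b in feat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical embeddings that a (pre-trained or self-supervised) encoder
# might produce for a reference image and two candidate images.
reference = [0.2, 0.9, 0.1]
candidate_close = [0.25, 0.85, 0.12]
candidate_far = [0.9, 0.1, 0.4]

# A smaller distance is read as "perceptually more similar" under this sketch.
d_close = cosine_distance(reference, candidate_close)
d_far = cosine_distance(reference, candidate_far)
```

In practice the embeddings would come from intermediate layers of the chosen network, and such a metric would be validated against human or expert ratings, as the abstract describes.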
Translated by Google Translate
Completely aligned and paired multi-modal neuroimaging data have proved effective in the diagnosis of brain diseases. However, collecting a full set of well-aligned and paired data is expensive or even impractical, owing to practical difficulties such as high cost, long acquisition time, image corruption, and privacy issues. A realistic solution is to explore unsupervised or semi-supervised learning to synthesize the absent neuroimaging data. In this paper, we are the first to comprehensively approach the cross-modality neuroimage synthesis task from different perspectives, which include the level of supervision (especially weakly-supervised and unsupervised), loss functions, evaluation metrics, the range of modality synthesis, datasets (aligned, private and public), and synthesis-based downstream tasks. To begin with, we highlight several open challenges for cross-modality neuroimage synthesis. Then we summarize the architectures of cross-modality synthesis under various levels of supervision. In addition, we provide an in-depth analysis of how cross-modality neuroimage synthesis can improve the performance of different downstream tasks. Finally, we re-evaluate the open challenges and point out future directions for the remaining challenges. All resources are available at https://github.com/M-3LAB/awesome-multimodal-brain-image-systhesis
Segmenting the fine structure of the mouse brain on magnetic resonance (MR) images is critical for delineating morphological regions, analyzing brain function, and understanding their relationships. Compared to a single MRI modality, multimodal MRI data provide complementary tissue features that can be exploited by deep learning models, resulting in better segmentation results. However, multimodal mouse brain MRI data is often lacking, making automatic segmentation of mouse brain fine structure a very challenging task. To address this issue, it is necessary to fuse multimodal MRI data to produce distinguished contrasts in different brain structures. Hence, we propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from single ones in a structure-preserving manner, thus improving the segmentation performance by imputing missing modalities and multi-modality fusion. Our results demonstrate that the translation performance of our method outperforms the state-of-the-art methods. Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with averaged dice coefficients of 90.0% (T2w) and 87.9% (T1w), respectively, achieving around +10% performance improvement compared to the state-of-the-art algorithms. Our results demonstrate that MouseGAN++, as a simultaneous image synthesis and segmentation method, can be used to fuse cross-modality information in an unpaired manner and yield more robust performance in the absence of multimodal data. We release our method as a mouse brain structural segmentation tool for free academic usage at https://github.com/yu02019.
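The Dice coefficient quoted above is the standard overlap measure between a predicted and a reference segmentation mask. A minimal sketch follows, assuming binary masks flattened to lists; the function name and the toy masks are illustrative and not taken from the MouseGAN++ code.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total

# Toy 4-voxel example: one voxel overlaps, three foreground voxels in total.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
score = dice_coefficient(pred_mask, true_mask)
```

For multi-structure segmentation, the score is typically computed per label and then averaged, which is presumably what the reported averages refer to.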
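Objective: Gadolinium-based contrast agents (GBCAs) have been widely used to better visualize disease in brain magnetic resonance imaging (MRI). However, gadolinium accumulation within the brain and body has raised safety concerns about the use of GBCAs. Therefore, the development of novel approaches that can decrease or even eliminate GBCA exposure while providing similar contrast information would be of significant clinical use. Methods: In this work, we present a deep learning based approach for contrast-enhanced T1 synthesis in brain tumor patients. A 3D high-resolution fully convolutional network (FCN), which maintains high-resolution information throughout processing and aggregates multi-scale information in parallel, is designed to map pre-contrast MRI sequences to contrast-enhanced MRI sequences. Specifically, three pre-contrast MRI sequences, T1, T2 and the apparent diffusion coefficient (ADC) map, are used as inputs, and the post-contrast T1 sequence is used as the target output. To alleviate the data imbalance between normal tissue and tumor regions, we introduce a local loss that increases the contribution of the tumor regions, which leads to better enhancement results on tumors. Results: Extensive quantitative and visual assessments are performed, with our proposed model achieving a PSNR of 28.24 dB in the brain and 21.2 dB in tumor regions. Conclusion and Significance: Our results suggest the potential of substituting GBCAs with synthetic contrast images generated via deep learning. Code is available at https://github.com/chenchao666/contrast-enhanced-mri-synthesis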
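PSNR figures of this kind, reported separately for the whole brain and for tumor regions, can be computed by restricting the mean squared error to a region mask. The sketch below assumes intensities normalized to [0, 1] and flattened to lists; the function name, the `mask` parameter, and the toy values are assumptions for illustration, not the authors' evaluation code.

```python
import math

def psnr(reference, synthesized, data_range=1.0, mask=None):
    """Peak signal-to-noise ratio in dB, optionally restricted to masked voxels."""
    if len(reference) != len(synthesized):
        raise ValueError("images must have the same number of voxels")
    if mask is None:
        mask = [True] * len(reference)
    pairs = [(r, s) for r, s, m in zip(reference, synthesized, mask) if m]
    if not pairs:
        raise ValueError("mask selects no voxels")
    mse = sum((r - s) ** 2 for r, s in pairs) / len(pairs)
    if mse == 0.0:
        return math.inf  # identical images inside the region
    return 10.0 * math.log10(data_range ** 2 / mse)

# Toy example: the last voxel stands in for a poorly synthesized tumor region.
reference = [0.0, 0.5, 1.0, 1.0]
synthesized = [0.1, 0.4, 0.9, 0.5]
tumor_mask = [False, False, False, True]

whole_psnr = psnr(reference, synthesized)
tumor_psnr = psnr(reference, synthesized, mask=tumor_mask)
```

A lower PSNR inside the tumor mask than over the whole image, as in this toy case, mirrors the gap between the two reported values.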
Cross-modality magnetic resonance (MR) image synthesis aims to produce missing modalities from existing ones. Currently, several methods based on deep neural networks have been developed using both source- and target-modalities in a supervised learning manner. However, it remains challenging to obtain a large amount of completely paired multi-modal training data, which inhibits the effectiveness of existing methods. In this paper, we propose a novel Self-supervised Learning-based Multi-scale Transformer Network (SLMT-Net) for cross-modality MR image synthesis, consisting of two stages, i.e., a pre-training stage and a fine-tuning stage. During the pre-training stage, we propose an Edge-preserving Masked AutoEncoder (Edge-MAE), which preserves the contextual and edge information by simultaneously conducting the image reconstruction and the edge generation. Besides, a patch-wise loss is proposed to treat the input patches differently regarding their reconstruction difficulty, by measuring the difference between the reconstructed image and the ground-truth. In this case, our Edge-MAE can fully leverage a large amount of unpaired multi-modal data to learn effective feature representations. During the fine-tuning stage, we present a Multi-scale Transformer U-Net (MT-UNet) to synthesize the target-modality images, in which a Dual-scale Selective Fusion (DSF) module is proposed to fully integrate multi-scale features extracted from the encoder of the pre-trained Edge-MAE. Moreover, we use the pre-trained encoder as a feature consistency module to measure the difference between high-level features of the synthesized image and the ground truth one. Experimental results show the effectiveness of the proposed SLMT-Net, and our model can reliably synthesize high-quality images when the training set is partially unpaired. Our code will be publicly available at https://github.com/lyhkevin/SLMT-Net.
Magnetic Resonance Fingerprinting (MRF) is an efficient quantitative MRI technique that can extract important tissue and system parameters such as T1, T2, B0, and B1 from a single scan. This property also makes it attractive for retrospectively synthesizing contrast-weighted images. In general, contrast-weighted images like T1-weighted, T2-weighted, etc., can be synthesized directly from parameter maps through spin-dynamics simulation (i.e., Bloch or Extended Phase Graph models). However, these approaches often exhibit artifacts due to imperfections in the mapping, the sequence modeling, and the data acquisition. Here we propose a supervised learning-based method that directly synthesizes contrast-weighted images from the MRF data without going through the quantitative mapping and spin-dynamics simulation. To implement our direct contrast synthesis (DCS) method, we deploy a conditional Generative Adversarial Network (GAN) framework and propose a multi-branch U-Net as the generator. The input MRF data are used to directly synthesize T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images through supervised training on paired MRF and target spin echo-based contrast-weighted scans. In-vivo experiments demonstrate excellent image quality compared to simulation-based contrast synthesis and previous DCS methods, both visually as well as by quantitative metrics. We also demonstrate cases where our trained model is able to mitigate in-flow and spiral off-resonance artifacts that are typically seen in MRF reconstructions and thus more faithfully represent conventional spin echo-based contrast-weighted images.
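For context, the simulation-based route that the abstract contrasts with DCS can be approximated, in its simplest form, by the textbook spin-echo signal equation S = PD · (1 − exp(−TR/T1)) · exp(−TE/T2). The sketch below applies it to a single voxel. It is a simplified stand-in for the Bloch or Extended Phase Graph simulation mentioned above, and the tissue values and sequence timings are illustrative assumptions, not data from the paper.

```python
import math

def spin_echo_signal(pd, t1_ms, t2_ms, tr_ms, te_ms):
    """Idealized spin-echo signal for one voxel (assumes TE much shorter than TR)."""
    return pd * (1.0 - math.exp(-tr_ms / t1_ms)) * math.exp(-te_ms / t2_ms)

# Illustrative (not measured) parameter-map values for two tissue types.
white_matter = {"pd": 1.0, "t1_ms": 800.0, "t2_ms": 80.0}
csf = {"pd": 1.0, "t1_ms": 4000.0, "t2_ms": 2000.0}

# Hypothetical sequence timings: short TR/TE for T1 weighting,
# long TR/TE for T2 weighting.
t1w = {"tr_ms": 500.0, "te_ms": 15.0}
t2w = {"tr_ms": 4000.0, "te_ms": 100.0}

wm_t1w = spin_echo_signal(**white_matter, **t1w)
csf_t1w = spin_echo_signal(**csf, **t1w)
wm_t2w = spin_echo_signal(**white_matter, **t2w)
csf_t2w = spin_echo_signal(**csf, **t2w)
```

Under these assumed values the white-matter voxel is brighter than CSF with T1 weighting and darker with T2 weighting. Errors in the T1 and T2 maps propagate directly into such synthesized contrasts, which is the weakness the direct synthesis approach aims to avoid.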
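State-of-the-art magnetic resonance (MR) image super-resolution (ISR) methods based on convolutional neural networks (CNNs) exploit only limited contextual information because of the limited spatial coverage of CNNs. Vision transformers (ViTs) learn better global context, which helps in generating higher-quality HR images. We combine local information from CNNs and global information from ViTs for image super-resolution, and output super-resolved images of higher quality than those produced by state-of-the-art methods. We include extra constraints through multiple novel loss functions that preserve structure and texture information from the low-resolution to the high-resolution images.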
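Brain network analysis for traumatic brain injury (TBI) patients is critical for assessing their level of consciousness and evaluating their prognosis, which requires the segmentation of certain consciousness-related brain regions. However, it is difficult to construct a TBI segmentation model because manually annotated MR scans of TBI patients are hard to collect. Data augmentation techniques can be applied to alleviate the data scarcity. However, conventional data augmentation strategies, such as spatial and intensity transformations, cannot mimic the deformation and lesions in traumatic brains, which limits the performance of the subsequent segmentation task. To address these issues, we propose a novel medical image inpainting model named TBIGAN to synthesize TBI MR scans with paired brain label maps. The main strength of our TBIGAN method is that it can generate TBI images and the corresponding label maps simultaneously, which has not been achieved by previous inpainting methods for medical images. We first generate the inpainted image under the guidance of edge information in a coarse-to-fine manner, and then use the synthesized intensity image as the prior for label inpainting. Furthermore, we introduce a registration-based template augmentation pipeline to increase the diversity of the synthesized image pairs and enhance the capacity of data augmentation. Experimental results show that the proposed TBIGAN method can produce sufficient synthesized TBI images with high quality and valid label maps, which can greatly improve 2D and 3D traumatic brain segmentation performance compared with the alternatives.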
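The scarcity of high-quality annotated medical imaging datasets is a major problem that collides with machine learning applications in the field of medical imaging analysis and impedes its advancement. Self-supervised learning is a recent training paradigm that enables learning robust representations without the need for human annotation, which can be considered an effective solution to the scarcity of annotated medical data. This article reviews the state-of-the-art research directions in self-supervised learning approaches for image data, with a focus on their applications in medical imaging analysis. The article covers the most recent self-supervised learning methods from the computer vision field as they apply to medical imaging analysis and categorizes them as predictive, generative, and contrastive approaches. Moreover, the article covers 40 of the most recent research papers on self-supervised learning in medical imaging analysis, aiming to shed light on recent innovation in the field. Finally, the article concludes with possible future research directions in the field.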
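Masked image modeling (MIM) has recently gained considerable attention due to its capacity to learn from vast amounts of unlabeled data, and has been shown to be effective on a wide variety of vision tasks involving natural images. Meanwhile, the potential of self-supervised learning for 3D medical images is expected to be immense, given the large quantities of unlabeled images and the expense and difficulty of obtaining quality labels. However, the applicability of MIM to medical images remains uncertain. In this paper, we demonstrate that masked image modeling approaches can also advance 3D medical image analysis, in addition to natural images. We study how masked image modeling strategies improve performance from the viewpoint of 3D medical image segmentation as a representative downstream task: i) compared to naive contrastive learning, masked image modeling approaches accelerate the convergence of supervised training even further (1.40×) and ultimately produce a higher Dice score; ii) predicting raw voxel values with a high masking ratio and a relatively small patch size is a non-trivial self-supervised pretext task for medical image modeling; iii) a lightweight decoder or projection head design for reconstruction is powerful for masked image modeling on 3D medical images, speeding up training and reducing cost; iv) finally, we also investigate the effectiveness of MIM methods under different practical scenarios, with different image resolutions and labeled data ratios.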
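The core step of masked image modeling is choosing which patches to hide before asking the network to reconstruct their raw voxel values. A minimal sketch of random patch masking with a high masking ratio is given below. The function name, the 75% ratio, and the 4×4×4 patch grid are assumptions chosen for illustration; the abstract does not state the exact settings.

```python
import random

def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Return a list of booleans, True where a patch is hidden from the encoder."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)  # local generator, so results are reproducible
    num_masked = int(round(num_patches * mask_ratio))
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    return [i in masked_ids for i in range(num_patches)]

# Hypothetical 3D volume split into a 4 x 4 x 4 grid of patches.
num_patches = 4 * 4 * 4
mask = random_patch_mask(num_patches, mask_ratio=0.75, seed=0)

# Only the visible patches would be fed to the encoder; the reconstruction
# loss would be computed on the masked ones.
visible_ids = [i for i, hidden in enumerate(mask) if not hidden]
```

Smaller patches give a finer masking grid for the same volume, which is one reason patch size and masking ratio are studied together.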
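Difficulties in both data acquisition and annotation substantially restrict the sample sizes of training datasets for 3D medical imaging applications. As a result, constructing high-performance 3D convolutional neural networks from scratch remains a difficult task in the absence of sufficient pre-trained parameters. Previous efforts on 3D pre-training have frequently relied on self-supervised approaches, which use predictive or contrastive learning on unlabeled data to build invariant 3D representations. However, because large-scale supervision information is unavailable, obtaining semantically invariant and discriminative representations from these learning frameworks remains problematic. In this paper, we revisit an innovative yet simple fully-supervised 3D network pre-training framework that takes advantage of semantic supervision from large-scale 2D natural image datasets. With a redesigned 3D network architecture, reformulated natural images are used to address the problem of data scarcity and to develop powerful 3D representations. Comprehensive experiments on four benchmark datasets demonstrate that the proposed pre-trained models can effectively accelerate convergence while also improving accuracy for a variety of 3D medical imaging tasks such as classification, segmentation, and detection. In addition, compared to training from scratch, it can save up to 60% of annotation effort. On the NIH DeepLesion dataset, it likewise achieves state-of-the-art detection performance, outperforming earlier self-supervised and fully-supervised pre-training approaches, as well as methods trained from scratch. To facilitate further development of 3D medical models, our code and pre-trained model weights are publicly available at https://github.com/urmagicsmine/cspr.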
Self-supervised image denoising techniques emerged as convenient methods that allow training denoising models without requiring ground-truth noise-free data. Existing methods usually optimize loss metrics that are calculated from multiple noisy realizations of similar images, e.g., from neighboring tomographic slices. However, those approaches fail to utilize the multiple contrasts that are routinely acquired in medical imaging modalities like MRI or dual-energy CT. In this work, we propose the new self-supervised training scheme Noise2Contrast that combines information from multiple measured image contrasts to train a denoising model. We stack denoising with domain-transfer operators to utilize the independent noise realizations of different image contrasts to derive a self-supervised loss. The trained denoising operator achieves convincing quantitative and qualitative results, outperforming state-of-the-art self-supervised methods by 4.7-11.0%/4.8-7.3% (PSNR/SSIM) on brain MRI data and by 43.6-50.5%/57.1-77.1% (PSNR/SSIM) on dual-energy CT X-ray microscopy data with respect to the noisy baseline. Our experiments on different real measured data sets indicate that Noise2Contrast training generalizes to other multi-contrast imaging modalities.
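This work presents a novel framework, CISFA (Contrastive Image Synthesis and Self-supervised Feature Adaptation), that builds on image domain translation and unsupervised feature adaptation for cross-modality biomedical image segmentation. Unlike existing works, we use a one-sided generative model and add a weighted patch-wise contrastive loss between sampled patches of the input image and the corresponding synthetic image, which serves as a shape constraint. Moreover, we notice that the generated images and input images share similar structural information but are in different modalities. We therefore enforce contrastive losses on the generated images and the input images to train the encoder of a segmentation model to minimize the discrepancy between paired images in the learned embedding space. Compared with existing works that rely on adversarial learning for feature adaptation, this approach enables the encoder to learn domain-independent features in a more explicit way. We extensively evaluate our method on segmentation tasks containing CT and MRI images of the abdominal cavity and the whole heart. Experimental results show that the proposed framework not only outputs synthetic images with less distortion of organ shapes, but also outperforms state-of-the-art domain adaptation methods by a large margin.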
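Generative adversarial networks (GANs) are a powerful type of deep learning model that has been used successfully in numerous fields. They belong to a broader family called generative methods, which generate new data by learning the sample distribution from real examples. In the clinical context, GANs have shown enhanced capabilities in capturing spatially complex, nonlinear, and potentially subtle disease effects compared to traditional generative methods. This review appraises the existing literature on the applications of GANs in imaging studies of various neurological conditions, including Alzheimer's disease, brain tumors, brain aging, and multiple sclerosis. We provide an intuitive explanation of the various GAN methods for each application and further discuss the main challenges, open questions, and promising future directions of leveraging GANs in neuroimaging. We aim to bridge the gap between advanced deep learning methods and neurology research by highlighting how GANs can be leveraged to support clinical decision making and contribute to a better understanding of the structural and functional patterns of brain diseases.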
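Multiple sclerosis (MS) is a chronic neurological disease characterized by the development of lesions in the white matter of the brain. T2 fluid-attenuated inversion recovery (FLAIR) brain magnetic resonance imaging (MRI) provides superior visualization and characterization of MS lesions relative to other MRI modalities. Follow-up brain FLAIR MRI in MS provides helpful information for clinicians monitoring disease progression. In this study, we propose a novel modification to generative adversarial networks (GANs) to predict future lesion-specific FLAIR MRI for MS at fixed time intervals. We use supervised guided attention and dilated convolutions in the discriminator, which support an informed prediction of whether the generated images are real or not based on attention to the lesion area, which in turn has the potential to help the generator predict the lesion area of future examinations more accurately. We compared our method to several baselines and one state-of-the-art CF-SAGAN model [1]. In conclusion, our results indicate that the proposed method achieves higher accuracy and reduces the standard deviation of the prediction errors in the lesion area compared with other models with similar overall performance.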
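Multiple sclerosis (MS) is a chronic inflammatory and degenerative disease of the central nervous system, characterized by the appearance of focal lesions in the white and gray matter that topographically correlate with an individual patient's neurological symptoms and signs. Magnetic resonance imaging (MRI) provides detailed in-vivo structural information, permitting the quantification and categorization of MS lesions that critically inform disease management. Traditionally, MS lesions have been manually annotated on 2D MRI slices, a process that is inefficient and prone to inter- and intra-observer errors. Recently, automated statistical imaging analysis techniques have been proposed to detect and segment MS lesions based on MRI voxel intensity. However, their effectiveness is limited by the heterogeneity of both MRI data acquisition techniques and the appearance of MS lesions. By learning complex lesion representations directly from images, deep learning techniques have achieved remarkable breakthroughs in the MS lesion segmentation task. Here, we provide a comprehensive review of state-of-the-art automated statistical and deep learning MS segmentation methods and discuss current and future clinical applications. Further, we review technical strategies, such as domain adaptation, to enhance MS lesion segmentation in real-world clinical settings.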
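Medical image synthesis has attracted increasing attention because it can generate missing image data, improve diagnosis, and benefit many downstream tasks. However, the synthesis models developed so far do not adapt to unseen data distributions that exhibit domain shift, which limits their applicability in clinical routine. This work focuses on exploring domain adaptation (DA) of 3D image-to-image synthesis models. First, we highlight the technical differences in DA between classification, segmentation, and synthesis models. Second, we present a novel, efficient adaptation approach based on a 2D variational autoencoder that approximates 3D distributions. Third, we present empirical studies on the effects of the amount of adaptation data and of the key hyper-parameters. Our results show that the proposed approach can significantly improve synthesis accuracy on unseen domains in a 3D setting. The code is publicly available at https://github.com/winstonhutiger/2d_vae_uda_for_3d_sythesis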
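Accurate and robust segmentation of lung cancers from CT, even those located close to the mediastinum, is needed to plan and deliver radiotherapy more accurately and to measure treatment response. Therefore, we developed a new cross-modality educed distillation (CMEDL) approach, using unpaired CT and MRI scans, whereby an informative teacher MRI network guides a student CT network to extract features that signal the difference between foreground and background. Our contribution eliminates two requirements of distillation methods: (i) paired image sets, by using image-to-image (I2I) translation, and (ii) pre-training of the teacher network, by using concurrent training of all networks. Our framework uses end-to-end trained unpaired I2I translation, teacher, and student segmentation networks. The architectural flexibility of our framework is demonstrated using 3 segmentation and 2 I2I networks. Networks were trained with 377 CT and 82 T2w MRI scans from different sets of patients, with independent validation (N = 209 tumors) and testing (N = 609 tumors) datasets. Network design, methods to combine MRI with CT information, distillation learning under informative (MRI to CT), weak (CT to MRI), and equal (MRI to MRI) teachers, and ablation tests were performed. Accuracy was measured using Dice similarity (DSC), surface Dice (sDSC), and Hausdorff distance at the 95th percentile (HD95). The CMEDL approach was significantly (p < 0.001) more accurate than non-CMEDL methods with an informative teacher for CT lung tumor segmentation (DSC of 0.77 vs. 0.73), with a weak teacher for MRI lung tumor segmentation (DSC of 0.84 vs. 0.81), and with an equal teacher for MRI multi-organ segmentation (DSC of 0.90 vs. 0.88). CMEDL also reduced inter-patient variability in lung tumor segmentation.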
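Building accurate and robust artificial intelligence systems for medical image assessment requires not only the research and design of advanced deep learning models but also the creation of large, curated sets of annotated training examples. Constructing such datasets, however, is often very costly, due to the complexity of the annotation tasks and the high level of expertise required to interpret medical images (e.g., expert radiologists). To counter this limitation, we propose a method for self-supervised learning of rich image features based on contrastive learning and online feature clustering. For this purpose, we leverage a large training dataset of over 100,000,000 medical images of various modalities, including radiography, computed tomography (CT), magnetic resonance (MR) imaging, and ultrasonography. We propose to use these features to guide model training in supervised and hybrid self-supervised/supervised regimes on various downstream tasks. We highlight a number of advantages of this strategy on challenging image assessment problems in radiography, CT, and MR: 1) a significant increase in accuracy compared to the state of the art (e.g., an AUC boost of 3-7% for detection of abnormalities in chest radiography scans and hemorrhage detection in brain CT); 2) acceleration of model convergence during training by up to 85% compared to using no pre-training (e.g., 83% when training a model for detection of brain metastases in MR scans); 3) increased robustness to various image augmentations, such as intensity variations, rotations, or scaling, that reflect the data variation seen in the field.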