Medical image segmentation (MIS) is essential for supporting disease diagnosis and treatment effect assessment. Despite considerable advances in artificial intelligence (AI) for MIS, clinicians remain skeptical of its utility and maintain low confidence in such black-box systems, a problem exacerbated by poor generalization to out-of-distribution (OOD) data. To move towards effective clinical utilization, we propose a foundation model named EvidenceCap, which makes the box transparent in a quantifiable way through uncertainty estimation. EvidenceCap not only makes uncertain regions and OOD data visible, but also enhances the reliability, robustness, and computational efficiency of MIS. Uncertainty is modeled explicitly through subjective logic theory, which gathers strong evidence from features. We show the effectiveness of EvidenceCap on three segmentation datasets and apply it in the clinic. Our work sheds light on safe clinical applications and explainable AI, and can contribute to trustworthiness in the medical domain.
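As a minimal sketch of how subjective logic turns raw network outputs into per-class evidence and a per-pixel uncertainty score (the function and variable names are ours, not EvidenceCap's; this illustrates the standard evidential formulation, not the paper's exact implementation):

```python
import numpy as np

def evidential_uncertainty(logits):
    """Per-pixel belief masses and uncertainty via subjective logic.

    logits: array of shape (K, H, W) -- raw per-class network outputs.
    Returns (belief, uncertainty): belief has shape (K, H, W),
    uncertainty has shape (H, W), and belief.sum(0) + uncertainty == 1.
    """
    K = logits.shape[0]
    evidence = np.log1p(np.exp(logits))    # softplus keeps evidence non-negative
    alpha = evidence + 1.0                 # Dirichlet concentration parameters
    S = alpha.sum(axis=0, keepdims=True)   # total Dirichlet strength per pixel
    belief = evidence / S                  # belief mass assigned to each class
    uncertainty = K / S[0]                 # vacuity: high when evidence is scarce
    return belief, uncertainty
```

Pixels with little total evidence get uncertainty close to 1, which is exactly the quantity one would visualize to flag unreliable regions and OOD inputs.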
Joint segmentation of retinal edema lesions from OCT images must cope with complicated pathological features, such as blurred boundaries, severe scale differences between symptoms, and background noise interference, while also producing results that are reliable. In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network, which provides accurate segmentation results together with a reliability assessment. Specifically, to improve the model's ability to learn the complex pathological features of retinal edema lesions in OCT images, we develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network with our newly designed multi-scale transformer module. Meanwhile, to make the segmentation results more reliable, a novel uncertainty segmentation head based on subjective logic evidential theory is introduced to generate the final segmentation results together with a corresponding overall uncertainty evaluation score map. We conduct comprehensive experiments on the public AI-Challenge 2018 database for retinal edema lesion segmentation, and the results show that our proposed method achieves better segmentation accuracy with a high degree of reliability compared to other state-of-the-art segmentation approaches. The code will be released at: https://github.com/LooKing9218/ReliableRESeg.
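A wavelet-enhanced extractor typically builds on a discrete wavelet decomposition of the input; the sketch below shows one level of the Haar transform (a generic illustration under our own naming, not the network's actual extractor), whose high-frequency subbands respond to exactly the boundary detail that blurred lesions make hard to learn:

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2D Haar wavelet transform.

    x: array of shape (H, W) with even H and W.
    Returns the four subbands (LL, LH, HL, HH), each (H//2, W//2):
    LL is a low-frequency approximation; LH, HL, HH capture horizontal,
    vertical, and diagonal detail such as lesion boundaries.
    """
    a = x[0::2, 0::2]
    b = x[0::2, 1::2]
    c = x[1::2, 0::2]
    d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh
```

With this normalization the transform is orthonormal, so the subbands preserve the image energy while separating smooth regions from edges.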
Unlike general visual classification, some classification tasks are more challenging because they require professional categorization of the images; in this paper, we call them expert-level classification. Previous fine-grained visual classification (FGVC) work has made many efforts on some of its specific sub-tasks, but it is difficult to extend to the general case, which relies on a comprehensive analysis of part-global correlations and hierarchical feature interactions. In this paper, we propose Expert Network (ExpNet) to address the unique challenges of expert-level classification through a unified network. In ExpNet, we hierarchically decouple the part and context features and process them individually using a novel attentive mechanism, called Gaze-Shift. In each stage, Gaze-Shift produces a focal-part feature for the subsequent abstraction and memorizes a context-related embedding. We then fuse the final focal embedding with all the memorized context-related embeddings to make the prediction. Such an architecture realizes dual-track processing of partial and global information and hierarchical feature interactions. We conduct experiments on three representative expert-level classification tasks: FGVC, disease classification, and artwork attribute classification. In these experiments, our ExpNet outperforms the state of the art in a wide range of fields, indicating its effectiveness and generalization. The code will be made publicly available.
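The dual-track idea can be sketched as follows. This is a deliberately simplified toy (all names and the attention form are our assumptions, not ExpNet's architecture): each stage attends to a focal part, memorizes the down-weighted remainder as context, and the final prediction fuses the focal embedding with all memorized contexts:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def gaze_shift_stage(tokens, query):
    """One dual-track stage: attend to a focal part, memorize the context.

    tokens: (N, D) part features at the current stage.
    query:  (D,) focal feature carried over from the previous stage.
    Returns (focal, context): the attention-weighted focal-part feature
    and the complementary context embedding to memorize.
    """
    attn = softmax(tokens @ query)                        # where the 'gaze' lands
    focal = attn @ tokens                                 # focal-part feature
    context = (1.0 - attn) @ tokens / (len(tokens) - 1)   # down-weighted remainder
    return focal, context

def expnet_predict(stages, query, w):
    """Run stages hierarchically, then fuse focal + memorized contexts."""
    memory = []
    for tokens in stages:
        query, context = gaze_shift_stage(tokens, query)
        memory.append(context)
    fused = np.concatenate([query] + memory)  # final focal + all contexts
    return fused @ w                          # linear head on the fused embedding
```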
Automated detection of lung infections from computed tomography (CT) data plays an important role in combating COVID-19. However, several challenges remain in developing such AI systems. 1) Most current COVID-19 infection segmentation methods rely mainly on 2D CT images, which lack 3D sequential constraints. 2) Existing 3D CT segmentation methods focus on single-scale representations, which do not provide multiple receptive field sizes on 3D volumes. 3) The emergent outbreak of COVID-19 makes it hard to annotate sufficient CT volumes for training deep models. To address these issues, we first build a multiple dimensional-attention convolutional neural network (MDA-CNN) to aggregate multi-scale information along different dimensions of the input feature maps and impose supervision on multiple predictions from different CNN layers. Second, we use this MDA-CNN as the basic network in a novel dual multi-scale mean teacher network (DM${^2}$T-Net) for semi-supervised COVID-19 lung infection segmentation on CT volumes, leveraging unlabeled data and exploiting multi-scale information. Our DM${^2}$T-Net encourages the multiple predictions at different CNN layers of the student and teacher networks to be consistent, yielding a multi-scale consistency loss on unlabeled data, which is then added to the supervised loss computed on the labeled data from the multiple predictions of MDA-CNN. Third, we collect two COVID-19 segmentation datasets to evaluate our method. The experimental results show that our network consistently outperforms the compared state-of-the-art methods.
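The two mean-teacher ingredients described above can be sketched as follows (a minimal illustration with our own names; the paper's loss weighting and update schedule may differ): a consistency loss averaged over the per-layer prediction pairs, and the teacher's weights tracking an exponential moving average (EMA) of the student's:

```python
import numpy as np

def multiscale_consistency(student_preds, teacher_preds):
    """Mean-squared consistency across predictions from several CNN layers.

    student_preds / teacher_preds: lists of arrays (one per scale); on
    unlabeled data the student is trained to agree with the teacher at
    every scale.
    """
    return sum(np.mean((s - t) ** 2)
               for s, t in zip(student_preds, teacher_preds)) / len(student_preds)

def ema_update(teacher_w, student_w, decay=0.99):
    """Mean-teacher update: teacher weights are an EMA of the student's."""
    return decay * teacher_w + (1.0 - decay) * student_w
```

The consistency term needs no labels, which is what lets the unlabeled CT volumes contribute to training.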
Localizing anatomical landmarks is an important task in medical image analysis. However, the landmarks to be localized often lack prominent visual features; their locations are elusive and easily confused with the background, so precise localization depends heavily on the context formed by their surrounding areas. In addition, the required precision is usually higher than in segmentation and object detection tasks. Localization therefore has unique challenges distinct from segmentation or detection. In this paper, we propose a zoom-in attentive network (ZIAN) for anatomical landmark localization in ocular images. First, a coarse-to-fine, or "zoom-in", strategy is utilized to learn contextualized features at different scales. Then, an attentive fusion module is adopted to aggregate multi-scale features, which consists of 1) a co-attention network with a multiple regions-of-interest (ROIs) scheme that learns complementary features from the multiple ROIs, and 2) an attention-based fusion module that integrates the multi-ROI features and non-ROI features. We evaluated ZIAN on two open challenge tasks, i.e., fovea localization in fundus images and scleral spur localization in AS-OCT images. Experiments show that ZIAN achieves promising performance and outperforms state-of-the-art localization methods. The source code and trained models of ZIAN are available at https://github.com/leixiaofeng-astar/OMIA9-ZIAN.
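The coarse-to-fine loop itself is simple to state in code. Below is a generic sketch under our own naming (not ZIAN's implementation): take the coarse heatmap's peak, crop a window around it, run a fine-stage localizer on the crop, and map the refined coordinate back to the full image:

```python
import numpy as np

def zoom_in_localize(coarse_map, fine_localizer, crop=16):
    """Coarse-to-fine landmark localization.

    coarse_map: (H, W) heatmap from the coarse stage.
    fine_localizer: callable taking the cropped patch and returning the
    landmark (row, col) inside that patch.
    Returns the refined landmark in full-image coordinates.
    """
    r, c = np.unravel_index(np.argmax(coarse_map), coarse_map.shape)
    h, w = coarse_map.shape
    top = min(max(r - crop // 2, 0), h - crop)    # clamp crop inside the image
    left = min(max(c - crop // 2, 0), w - crop)
    patch = coarse_map[top:top + crop, left:left + crop]  # stand-in for image ROI
    fr, fc = fine_localizer(patch)
    return top + fr, left + fc
```

Because the fine stage sees the ROI at full resolution, its sub-window error budget is much smaller than that of a single-pass regressor on the whole image.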
The GAMMA Challenge was organized to encourage AI models to screen for glaucoma from a combination of 2D fundus images and 3D optical coherence tomography volumes, as an ophthalmologist would.
Unsupervised domain adaptation (UDA), which transfers knowledge from a label-rich source domain to a related but unlabeled target domain, has attracted considerable attention. Reducing the inter-domain discrepancy has always been a key factor in improving UDA performance, especially for tasks with a large gap between the source and target domains. To this end, we propose a novel style-aware feature fusion method (SAFF) to bridge the large domain gap and transfer knowledge while alleviating the loss of class-discriminative information. Inspired by the human ability of transitive inference and learning, a novel style-aware self-intermediate domain (SSID) is investigated to link two seemingly unrelated concepts through a series of intermediate auxiliary synthesized concepts. Specifically, we propose a novel SSID learning strategy that selects samples from both the source and target domains as anchors and then randomly fuses the object and style features of these anchors to generate labeled, style-rich intermediate auxiliary features for knowledge transfer. In addition, we design an external memory bank to store and update the designated labeled features to obtain stable class features and class-wise style features. Based on the proposed memory bank, intra- and inter-domain loss functions are designed to improve class discriminability and feature compatibility. Meanwhile, we simulate the rich latent feature space of SSID by infinite sampling and analyze the convergence of the loss functions in theory. Finally, we conduct comprehensive experiments on commonly used domain adaptation benchmarks to evaluate the proposed SAFF, and the experimental results show that SAFF can be easily combined with different backbone networks as a plug-in module and achieves better performance.
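One common way to fuse an anchor's object (content) feature with another anchor's style is to re-normalize the content with interpolated channel statistics, AdaIN-style; the sketch below illustrates that mechanism under our own naming and is an assumption about the form of the fusion, not SAFF's exact operator:

```python
import numpy as np

def style_fuse(content, style, lam):
    """Fuse one anchor's content feature with another anchor's style.

    content, style: (C, N) feature maps flattened over spatial positions.
    lam in [0, 1] interpolates the channel statistics between the two
    anchors, so sampling lam yields a continuum of labeled, style-rich
    intermediate-domain features bridging source and target.
    """
    c_mu, c_std = content.mean(1, keepdims=True), content.std(1, keepdims=True) + 1e-6
    s_mu, s_std = style.mean(1, keepdims=True), style.std(1, keepdims=True) + 1e-6
    mu = (1 - lam) * c_mu + lam * s_mu           # mixed style statistics
    std = (1 - lam) * c_std + lam * s_std
    return (content - c_mu) / c_std * std + mu   # re-style the normalized content
```

The class content (and hence the label) is preserved because only the first- and second-order channel statistics change.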
Glaucoma causes irreversible vision loss through damage to the optic nerve, and it cannot be cured. OCT imaging is an important technique for assessing glaucomatous damage, since it helps quantify fundus structures. To promote research on AI technology for OCT-assisted diagnosis of glaucoma, we held the Glaucoma OCT Analysis and Layer Segmentation (GOALS) Challenge at the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2022, providing data and corresponding annotations for researchers studying layer segmentation from OCT images and glaucoma classification. This paper describes the released 300 circumpapillary OCT images, the baselines of the two sub-tasks, and the evaluation methodology. The GOALS Challenge is accessible at https://aistudio.baidu.com/aistudio/competition/detail/230.
Breast lesion detection in ultrasound is critical for breast cancer diagnosis. Existing methods mainly rely on individual 2D ultrasound images or combine unlabeled videos with labeled 2D images to train models for breast lesion detection. In this paper, we first collect and annotate an ultrasound video dataset (188 videos) for breast lesion detection. Moreover, we propose a clip-level and video-level feature aggregation network (CVA-Net) that addresses breast lesion detection in ultrasound videos by aggregating video-level lesion classification features and clip-level temporal features. The clip-level temporal features encode local temporal information of ordered video frames and global temporal information of shuffled video frames. In our CVA-Net, an inter-video fusion module is devised to fuse local features from the original video frames with global features from the shuffled video frames, and an intra-video fusion module is devised to learn the temporal information among adjacent video frames. Moreover, we learn video-level features to classify the breast lesions of the original video as benign or malignant, further enhancing the final breast lesion detection performance in ultrasound videos. Experimental results on our annotated dataset show that our CVA-Net clearly outperforms state-of-the-art methods. The corresponding code and dataset are publicly available at \url{https://github.com/jhl-det/cva-net}.
Despite recent improvements in the accuracy of brain tumor segmentation, the results still exhibit low confidence and robustness. Uncertainty estimation is an effective way to change this situation, as it provides a measure of confidence in the segmentation results. In this paper, we propose a trusted brain tumor segmentation network that can generate robust segmentation results and reliable uncertainty estimations without excessive computational burden or modifications to the backbone network. In our method, uncertainty is modeled explicitly using subjective logic theory, which treats the predictions of the backbone neural network as subjective opinions by parameterizing the class probabilities of the segmentation as a Dirichlet distribution. Meanwhile, the trusted segmentation framework learns to gather reliable evidence from the features, leading to the final segmentation results. Overall, our unified trusted segmentation framework endows the model with reliability and robustness to out-of-distribution samples. To evaluate the effectiveness of our model in terms of robustness and reliability, qualitative and quantitative experiments are conducted on the BraTS 2019 dataset.