Intelligent Paper Notes

Calibrate the inter-observer segmentation uncertainty via diagnosis-first principle

Junde Wu , Huihui Fang , Haoyi Xiong , Lixin Duan , Mingkui Tan , Weihua Yang , Huiying Liu , Yanwu Xu

Category: Computer Vision

2022-08-05

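In medical images, many tissues and lesions can be ambiguous. This is why medical segmentation is typically annotated by a group of clinical experts to mitigate individual bias. However, this clinical routine also brings new challenges to the application of machine learning algorithms. Without a definite ground truth, it is difficult to train and evaluate deep learning models. When annotations are collected from different graders, a common choice is majority vote. However, such a strategy ignores the differences in expertness among the graders. In this paper, we consider the task of predicting segmentation with calibrated inter-observer uncertainty. We note that in clinical practice, medical image segmentation is usually used to assist disease diagnosis. Inspired by this observation, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion for calibrating inter-observer segmentation uncertainty. Following this idea, we propose a framework named Diagnosis-First segmentation Framework (DiFF) to estimate diagnosis-first segmentation from raw images. Specifically, DiFF first learns to fuse the multi-rater segmentation labels into a single ground truth that maximizes disease diagnosis performance. We call the fused ground truth the Diagnosis-First Ground Truth (DF-GT). We verify the effectiveness of DiFF on three different medical segmentation tasks: OD/OC segmentation on fundus images, thyroid nodule segmentation on ultrasound images, and skin lesion segmentation on dermoscopic images. Experimental results show that the proposed DiFF significantly facilitates the corresponding disease diagnosis and outperforms previous state-of-the-art multi-rater learning methods.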

Translated by Google Translate

Opinions Vary? Diagnosis First!

Junde Wu , Huihui Fang , Dalu Yang , Zhaowei Wang , Wenshuo Zhou , Fangxin Shang , Yehui Yang , Yanwu Xu

Category: Computer Vision | Machine Learning

2022-02-14

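With the advancement of deep learning techniques, an increasing number of methods have been proposed for optic disc and cup (OD/OC) segmentation from fundus images. Clinically, OD/OC segmentation is often annotated by multiple clinical experts to mitigate individual bias. However, it is hard to train automated deep learning models on multiple labels. A common practice for tackling this issue is majority vote, e.g., taking the average of the multiple labels. However, such a strategy ignores the differing expertness of the medical experts. Motivated by the observation that OD/OC segmentation is often used clinically for glaucoma diagnosis, in this paper we propose a novel strategy to fuse the multi-rater OD/OC segmentation labels via glaucoma diagnosis performance. Specifically, we assess the expertness of each rater through an attentive glaucoma diagnosis network. For each rater, the contribution to the diagnosis is reflected as an expertness map. To ensure that the expertness maps generalize across different glaucoma diagnosis models, we further propose an Expertness Generator (ExpG) to eliminate high-frequency components in the optimization process. Based on the obtained expertness maps, the multi-rater labels can be fused into a single ground truth, which we call the Diagnosis First Ground-truth (DiagFirstGT). Experimental results show that, with DiagFirstGT as the ground truth, OD/OC segmentation networks predict masks with superior diagnosis performance.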
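To make the contrast between majority-vote averaging and weighted label fusion concrete, here is a minimal sketch in plain Python. It is not the authors' method: the names `average_fusion` and `weighted_fusion` are hypothetical, and the per-pixel weights are supplied by hand, whereas the abstract describes learning expertness maps through a diagnosis network.

```python
from typing import List

Mask = List[List[float]]  # 2D grid of per-pixel label values in [0, 1]


def average_fusion(masks: List[Mask]) -> Mask:
    """Majority-vote style fusion: the plain per-pixel mean over raters."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) / n for c in range(cols)]
            for r in range(rows)]


def weighted_fusion(masks: List[Mask], weights: List[Mask]) -> Mask:
    """Per-pixel weighted mean; weights are normalised across raters at each pixel.

    `weights` is a hand-supplied stand-in for learned expertness maps
    (an assumption of this sketch).
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    fused: Mask = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = sum(w[r][c] for w in weights)
            if total == 0:
                # No rater is trusted at this pixel: fall back to the plain mean.
                row.append(sum(m[r][c] for m in masks) / len(masks))
            else:
                row.append(
                    sum(m[r][c] * w[r][c] for m, w in zip(masks, weights)) / total
                )
        fused.append(row)
    return fused
```

With uniform weights the two functions agree; putting all weight on one rater reproduces that rater's mask, which is the degree of freedom that plain averaging lacks.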

Multi-rater Prism: Learning self-calibrated medical image segmentation from multiple raters

Junde Wu , Huihui Fang , Yehui Yang , Yuanpei Liu , Jing Gao , Lixin Duan , Weihua Yang , Yanwu Xu

Category: Computer Vision

2022-12-01

In medical image segmentation, it is often necessary to collect opinions from multiple experts to make the final decision. This clinical routine helps to mitigate individual bias. But when data is annotated by multiple raters, standard deep learning models are often not applicable. In this paper, we propose a novel neural network framework, called Multi-Rater Prism (MrPrism), to learn medical image segmentation from multiple labels. Inspired by iterative half-quadratic optimization, the proposed MrPrism combines the multi-rater confidence assignment task and the calibrated segmentation task in a recurrent manner. In this recurrent process, MrPrism learns inter-observer variability while taking into account the semantic properties of the image, and finally converges to a self-calibrated segmentation result reflecting the inter-observer agreement. Specifically, we propose Converging Prism (ConP) and Diverging Prism (DivP) to process the two tasks iteratively. ConP learns calibrated segmentation based on the multi-rater confidence maps estimated by DivP. DivP generates multi-rater confidence maps based on the segmentation masks estimated by ConP. The experimental results show that by recurrently running ConP and DivP, the two tasks can achieve mutual improvement. The final converged segmentation result of MrPrism outperforms state-of-the-art (SOTA) strategies on a wide range of medical image segmentation tasks.
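The alternation between a fusion step and a confidence step can be illustrated with a toy, non-neural analogue. The sketch below is an assumption-laden stand-in, not MrPrism itself: it uses one scalar confidence per rater instead of confidence maps, a weighted mean instead of a segmentation network, and inverse mean squared error as the reweighting rule (a simple iteratively reweighted scheme). The function name `alternate_fusion` is hypothetical.

```python
from typing import List, Tuple


def alternate_fusion(
    masks: List[List[float]], iters: int = 10, eps: float = 1e-6
) -> Tuple[List[float], List[float]]:
    """Alternate between fusing flattened rater masks and re-estimating rater weights."""
    n = len(masks)
    size = len(masks[0])
    weights = [1.0 / n] * n  # start from uniform confidence
    fused = [0.0] * size
    for _ in range(iters):
        # "Converging" step: fuse the masks with the current rater weights.
        fused = [sum(w * m[i] for w, m in zip(weights, masks)) for i in range(size)]
        # "Diverging" step: raters closer to the fused mask gain confidence.
        errors = [
            sum((m[i] - fused[i]) ** 2 for i in range(size)) / size for m in masks
        ]
        raw = [1.0 / (e + eps) for e in errors]
        total = sum(raw)
        weights = [r / total for r in raw]
    return fused, weights
```

On an input where two raters agree and a third disagrees, the weight of the dissenting rater shrinks over the iterations and the fused mask moves toward the consensus, which is the qualitative behaviour the abstract calls self-calibration.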

SeATrans: Learning Segmentation-Assisted diagnosis model via Transformer

Junde Wu , Huihui Fang , Fangxin Shang , Dalu Yang , Zhaowei Wang , Jing Gao , Yehui Yang , Yanwu Xu

Category: Computer Vision

2022-06-12

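Clinically, accurate annotation of lesions and tissues can significantly facilitate disease diagnosis. For example, segmentation of the optic disc/cup (OD/OC) on fundus images facilitates glaucoma diagnosis, and segmentation of skin lesions on dermoscopic images helps melanoma diagnosis. With the advancement of deep learning techniques, a wide range of methods have shown that lesion/tissue segmentation can also facilitate automated disease diagnosis models. However, existing methods are limited in that they can only capture static regional correlations in the images. Inspired by the global and dynamic nature of the Vision Transformer, in this paper we propose the Segmentation-Assisted diagnosis Transformer (SeATrans) to transfer segmentation knowledge to the disease diagnosis network. Specifically, we first propose an asymmetric multi-scale interaction strategy to correlate each single low-level diagnosis feature with multi-scale segmentation features. Then, an effective strategy called SeA-block is adopted to vitalize the diagnosis feature via the correlated segmentation features. To model the segmentation-diagnosis interaction, SeA-block first embeds the diagnosis feature based on the segmentation information via an encoder, and then transfers the embedding back to the diagnosis feature space via a decoder. Experimental results demonstrate that SeATrans surpasses a wide range of state-of-the-art (SOTA) segmentation-assisted diagnosis methods on several disease diagnosis tasks.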

Learning self-calibrated optic disc and cup segmentation from multi-rater annotations

Junde Wu , Huihui Fang , Fangxin Shang , Zhaowei Wang , Dalu Yang , Wenshuo Zhou , Yehui Yang , Yanwu Xu

Category: Computer Vision

2022-06-10

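Segmentation of the optic disc (OD) and optic cup (OC) from fundus images is an important fundamental task for glaucoma diagnosis. In clinical practice, it is often necessary to collect opinions from multiple experts to obtain the final OD/OC annotation. This clinical routine helps to mitigate individual bias. But when data is annotated by multiple raters, standard deep learning models are not applicable. In this paper, we propose a novel neural network framework to learn OD/OC segmentation from multi-rater annotations. The segmentation results are self-calibrated through iterative optimization of multi-rater expertness estimation and calibrated OD/OC segmentation. In this way, the proposed method achieves mutual improvement of the two tasks and finally obtains a refined segmentation result. Specifically, we propose the Diverging Model (DivM) and the Converging Model (ConM) to process the two tasks respectively. ConM segments the raw image based on the multi-rater expertness maps provided by DivM. DivM generates multi-rater expertness maps from the segmentation masks provided by ConM. Experimental results show that by recurrently running ConM and DivM, the results can be self-calibrated so as to outperform a range of state-of-the-art (SOTA) multi-rater segmentation methods.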

DRG-Net: Interactive Joint Learning of Multi-lesion Segmentation and Classification for Diabetic Retinopathy Grading

Hasan Md Tusfiqur , Duy M. H. Nguyen , Mai T. N. Truong , Triet A. Nguyen , Binh T. Nguyen , Michael Barz , Hans-Juergen Profitlich , Ngoc T. T. Than , Ngan Le , Pengtao Xie

Category: Computer Vision

2022-12-30

Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support an appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System to classify DR Grading, localize lesion areas, and provide visual explanations; (ii) DRG-Expert-Interaction to receive feedback from user-expert and improve the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and loss functions constraint between lesion features and classification features, our approach can be robust given a certain level of noise in the feedback of users. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.

DuAT: Dual-Aggregation Transformer Network for Medical Image Segmentation

Feilong Tang , Qiming Huang , Jinfeng Wang , Xianxu Hou , Jionglong Su , Jingxin Liu

Category: Computer Vision

2022-12-21

Transformer-based models have been widely demonstrated to be successful in computer vision tasks by modelling long-range dependencies and capturing global representations. However, they are often dominated by features of large patterns leading to the loss of local details (e.g., boundaries and small objects), which are critical in medical image segmentation. To alleviate this problem, we propose a Dual-Aggregation Transformer Network called DuAT, which is characterized by two innovative designs, namely, the Global-to-Local Spatial Aggregation (GLSA) and Selective Boundary Aggregation (SBA) modules. The GLSA has the ability to aggregate and represent both global and local spatial features, which are beneficial for locating large and small objects, respectively. The SBA module is used to aggregate the boundary characteristic from low-level features and semantic information from high-level features for better preserving boundary details and locating the re-calibration objects. Extensive experiments in six benchmark datasets demonstrate that our proposed model outperforms state-of-the-art methods in the segmentation of skin lesion images, and polyps in colonoscopy images. In addition, our approach is more robust than existing methods in various challenging situations such as small object segmentation and ambiguous object boundaries.

Knowledge-aware Deep Framework for Collaborative Skin Lesion Segmentation and Melanoma Recognition

Xiaohong Wang , Xudong Jiang , Henghui Ding , Yuqian Zhao , Jun Liu

Category: Computer Vision

2021-06-07

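Deep learning techniques have shown superior performance in dermatological clinical inspection. Nevertheless, melanoma diagnosis remains a challenging task because it is difficult to incorporate clinical knowledge into the learning process. In this paper, we propose a novel knowledge-aware deep framework that incorporates some clinical knowledge into the collaborative learning of two important melanoma diagnosis tasks, namely skin lesion segmentation and melanoma recognition. Specifically, to exploit knowledge of the morphological expressions of the lesion region and its periphery for melanoma identification, we design a lesion-based pooling and shape extraction (LPSE) scheme, which transfers the structure information obtained from skin lesion segmentation into melanoma recognition. Meanwhile, to pass skin lesion diagnosis knowledge from melanoma recognition to skin lesion segmentation, we design an effective diagnosis guided feature fusion (DGFF) strategy. Moreover, we propose a recursive mutual learning mechanism that further promotes inter-task cooperation and thus iteratively improves the joint learning capability of the model for both skin lesion segmentation and melanoma recognition. Experimental results on two public skin lesion datasets show the effectiveness of the proposed method for melanoma analysis.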

REFUGE2 Challenge: A Treasure Trove for Multi-Dimension Analysis and Evaluation in Glaucoma Screening

Huihui Fang , Fei Li , Junde Wu , Huazhu Fu , Xu Sun , Jaemin Son , Shuang Yu , Menglu Zhang , Chenglang Yuan , Cheng Bian

Category: Computer Vision

2022-02-18

With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis on an original challenge -- Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalizations are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application.

Stain-Adaptive Self-Supervised Learning for Histopathology Image Analysis

Hai-Li Ye , Da-Han Wang

Category: Computer Vision

2022-08-08

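It is commonly recognized that color variation caused by differences in staining is a critical issue for histopathology image analysis. Existing methods adopt color matching, stain separation, stain transfer, or a combination of them to alleviate the stain variation problem. In this paper, we propose a novel Stain-Adaptive Self-Supervised Learning (SASSL) method for histopathology image analysis. SASSL integrates a domain-adversarial training module into the SSL framework to learn distinctive features that are robust to both various transformations and stain variations. The proposed SASSL is a general method for domain-invariant feature extraction that can be flexibly combined with arbitrary downstream histopathology image analysis modules (e.g., nuclei/tissue segmentation) by fine-tuning the features for specific downstream tasks. We conducted experiments on publicly available pathological image analysis datasets, including the PANDA, BreastPathQ, and CAMELYON16 datasets, and achieved state-of-the-art performance. Experimental results demonstrate that the proposed method robustly improves the feature extraction ability of the model and achieves stable performance improvement in downstream tasks.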

MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model

Junde Wu , Huihui Fang , Yu Zhang , Yehui Yang , Yanwu Xu

Category: Computer Vision

2022-11-01

The diffusion probabilistic model (DPM) has recently become one of the hottest topics in computer vision. Its image generation applications, such as Imagen, Latent Diffusion Models, and Stable Diffusion, have shown impressive generation capabilities and aroused extensive discussion in the community. Many recent studies have also found it useful in other vision tasks, such as image deblurring, super-resolution, and anomaly detection. Inspired by the success of DPM, we propose the first DPM-based model for general medical image segmentation tasks, which we name MedSegDiff. To enhance the step-wise regional attention in DPM for medical image segmentation, we propose dynamic conditional encoding, which establishes state-adaptive conditions for each sampling step. We further propose the Feature Frequency Parser (FF-Parser) to eliminate the negative effect of the high-frequency noise component in this process. We verify MedSegDiff on three medical segmentation tasks with different image modalities: optic cup segmentation on fundus images, brain tumor segmentation on MRI images, and thyroid nodule segmentation on ultrasound images. The experimental results show that MedSegDiff outperforms state-of-the-art (SOTA) methods by a considerable performance gap, indicating the generalization and effectiveness of the proposed model.
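For readers unfamiliar with DPMs, the sketch below shows the forward (noising) step of a standard denoising diffusion model in plain Python. It follows the common DDPM formulation, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise, and is not specific to MedSegDiff: the linear schedule endpoints (1e-4 to 0.02 over 1000 steps) are widely used defaults assumed here, and the names `linear_alpha_bars` and `q_sample` are hypothetical.

```python
import math
import random
from typing import List, Optional


def linear_alpha_bars(
    steps: int = 1000, beta_start: float = 1e-4, beta_end: float = 0.02
) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / max(steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def q_sample(
    x0: List[float],
    alpha_bar: float,
    noise: Optional[List[float]] = None,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Noise a flattened sample x0 to the step whose cumulative product is alpha_bar."""
    if noise is None:
        rng = rng or random.Random()
        noise = [rng.gauss(0.0, 1.0) for _ in x0]  # standard Gaussian noise
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
```

As the step index grows, `alpha_bar` shrinks toward zero, so the sample carries less of the original signal and more noise; a segmentation DPM learns to reverse this process, conditioned on the input image.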

TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model

Reza Azad , Mohammad T. AL-Antary , Moein Heidari , Dorit Merhof

Category: Computer Vision

2022-07-27

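In the past few years, convolutional neural networks (CNNs), and U-Net in particular, have been the prevailing technique in medical image processing. Specifically, the seminal U-Net and its variants have successfully addressed a wide variety of medical image segmentation tasks. However, these architectures are intrinsically imperfect because they fail to capture long-range interactions and spatial dependencies, leading to a severe performance drop in the segmentation of medical images with variable shapes and structures. Transformers, originally proposed for sequence-to-sequence prediction, have emerged as alternative architectures that precisely model global information with the help of the self-attention mechanism. Although feasible in design, using a pure Transformer for image segmentation can result in limited localization capacity, stemming from inadequate low-level features. Thus, a line of research strives to design robust variants of Transformer-based U-Net. In this paper, we propose Trans-Norm, a novel deep segmentation framework that consolidates a Transformer module into both the encoder and the skip connections of the standard U-Net. We argue that an expedient design of the skip connections is crucial for accurate segmentation, as it assists feature fusion between the expanding and contracting paths. In this respect, we derive a spatial normalization mechanism from the Transformer module to adaptively recalibrate the skip connection path. Extensive experiments on three typical medical image segmentation tasks demonstrate the effectiveness of TransNorm. The code and trained models are publicly available at https://github.com/rezazad68/transnorm.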

Semantic decomposition Network with Contrastive and Structural Constraints for Dental Plaque Segmentation

Jian Shi , Baoli Sun , Xinchen Ye , Zhihui Wang , Xiaolong Luo , Jin Liu , Heli Gao , Haojie Li

Category: Computer Vision | Artificial Intelligence

2022-08-12

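Segmenting dental plaque from images stained with a medical reagent provides valuable information for diagnosis and for determining follow-up treatment plans. However, accurate dental plaque segmentation is a challenging task: it requires identifying teeth and dental plaque in the presence of semantic-blur regions (i.e., confusing boundaries in the border regions between teeth and dental plaque) and complex variations in instance shapes, neither of which is fully addressed by existing methods. Therefore, we propose a semantic decomposition network (SDNet) that introduces two single-task branches to separately address the segmentation of teeth and dental plaque, and designs additional constraints to learn category-specific features for each branch, thus facilitating semantic decomposition and improving the performance of dental plaque segmentation. Specifically, SDNet learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch, being assigned a single category, tends to yield accurate segmentation. To help the two branches better focus on category-specific features, two constraint modules are further proposed: 1) a contrastive constraint module that learns discriminative feature representations by maximizing the distance between representations of different categories, so as to reduce the negative impact of semantic-blur regions on feature extraction; 2) a structural constraint module (SCM) that provides complete structural information for dental plaque of various shapes through the supervision of a boundary-aware geometric constraint. In addition, we construct a large-scale open-source Stained Dental Plaque Segmentation dataset (SDPSeg), which provides high-quality annotations for teeth and dental plaque. Experimental results on the SDPSeg dataset show that SDNet achieves state-of-the-art performance.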

Transformers in Medical Image Analysis: A Review

Kelei He , Chen Gan , Zhuoyuan Li , Islem Rekik , Zihao Yin , Wen Ji , Yang Gao , Qian Wang , Junfeng Zhang , Dinggang Shen

Category: Computer Vision

2022-02-24

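Transformers have dominated the field of natural language processing and have recently made an impact on computer vision. In the field of medical image analysis, Transformers have also been successfully applied to full-stack clinical applications, including image synthesis/reconstruction, registration, segmentation, detection, and diagnosis. Our paper aims to promote awareness and application of Transformers in medical image analysis. Specifically, we first give an overview of the core concepts of the attention mechanism built into Transformers and of their other basic components. Second, we review various Transformer architectures tailored for medical image applications and discuss their limitations. Within this review, we investigate key challenges around the use of Transformers in different learning paradigms, improving model efficiency, and coupling Transformers with other techniques. We hope this review gives readers in the field of medical image analysis a comprehensive picture of Transformers.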

GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges

Junde Wu , Huihui Fang , Fei Li , Huazhu Fu , Fengbin Lin , Jiongcheng Li , Lexing Huang , Qinji Yu , Sifan Song , Xinxing Xu

Category: Computer Vision

2022-02-14

Color fundus photography and Optical Coherence Tomography (OCT) are the two most cost-effective tools for glaucoma screening. Both imaging modalities have prominent biomarkers that indicate suspected glaucoma. Clinically, it is often recommended to take both screenings for a more accurate and reliable diagnosis. However, although numerous algorithms based on fundus images or OCT volumes have been proposed for computer-aided diagnosis, there are still few methods that leverage both modalities for glaucoma assessment. Inspired by the success of the Retinal Fundus Glaucoma Challenge (REFUGE) we held previously, we set up the Glaucoma grAding from Multi-Modality imAges (GAMMA) Challenge to encourage the development of fundus & OCT-based glaucoma grading. The primary task of the challenge is to grade glaucoma from both 2D fundus images and 3D OCT scanning volumes. As part of GAMMA, we have publicly released a glaucoma-annotated dataset with both 2D color fundus photographs and 3D OCT volumes, which is the first multi-modality dataset for glaucoma grading. In addition, an evaluation framework was established to evaluate the performance of the submitted methods. During the challenge, 1272 results were submitted, and the top 10 teams were selected for the final stage. We analyze their results and summarize their methods in the paper. Since all these teams submitted their source code in the challenge, a detailed ablation study was also conducted to verify the effectiveness of the particular modules proposed. We find that many of the proposed techniques are practical for the clinical diagnosis of glaucoma. As the first in-depth study of fundus & OCT multi-modality glaucoma grading, we believe the GAMMA Challenge will be an essential starting point for future research.

Medical Image Segmentation Using Deep Learning: A Survey

Risheng Wang , Tao Lei , Ruixia Cui , Bingtao Zhang , Hongying Meng , Asoke K. Nandi

Category: Computer Vision

2020-09-28

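Deep learning has been widely used for medical image segmentation, and a large number of papers have recorded its success in the field. In this paper, we present a comprehensive thematic survey of medical image segmentation using deep learning techniques. This paper makes two original contributions. First, whereas traditional surveys directly divide the deep learning literature on medical image segmentation into many groups and introduce each group in detail, we classify the currently popular literature according to a multi-level structure from coarse to fine. Second, this paper focuses on supervised and weakly supervised learning approaches and does not include unsupervised approaches, since they have been introduced in many earlier surveys and are not popular at present. For supervised learning approaches, we analyze the literature in three aspects: the selection of backbone networks, the design of network blocks, and the improvement of loss functions. For weakly supervised learning approaches, we review the literature according to data augmentation, transfer learning, and interactive segmentation. Compared with existing surveys, this survey classifies the literature quite differently, makes it easier for readers to understand the underlying rationale, and guides them toward appropriate improvements in deep-learning-based medical image segmentation.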

MyoPS-Net: Myocardial Pathology Segmentation with Flexible Combination of Multi-Sequence CMR Images

Junyi Qiu , Lei Li , Sihan Wang , Ke Zhang , Yinyin Chen , Shan Yang , Xiahai Zhuang

Category: Computer Vision

2022-11-06

Myocardial pathology segmentation (MyoPS) can be a prerequisite for the accurate diagnosis and treatment planning of myocardial infarction. However, achieving this segmentation is challenging, mainly due to the inadequate and indistinct information from an image. In this work, we develop an end-to-end deep neural network, referred to as MyoPS-Net, to flexibly combine five-sequence cardiac magnetic resonance (CMR) images for MyoPS. To extract precise and adequate information, we design an effective yet flexible architecture to extract and fuse cross-modal features. This architecture can tackle different numbers of CMR images and complex combinations of modalities, with output branches targeting specific pathologies. To impose anatomical knowledge on the segmentation results, we first propose a module to regularize myocardium consistency and localize the pathologies, and then introduce an inclusiveness loss to utilize relations between myocardial scars and edema. We evaluated the proposed MyoPS-Net on two datasets, i.e., a private one consisting of 50 paired multi-sequence CMR images and a public one from MICCAI2020 MyoPS Challenge. Experimental results showed that MyoPS-Net could achieve state-of-the-art performance in various scenarios. Note that in practical clinics, the subjects may not have full sequences, such as missing LGE CMR or mapping CMR scans. We therefore conducted extensive experiments to investigate the performance of the proposed method in dealing with such complex combinations of different CMR sequences. Results proved the superiority and generalizability of MyoPS-Net, and more importantly, indicated a practical clinical application.

ExpNet: A unified network for Expert-Level Classification

Junde Wu , Huihui Fang , Yehui Yang , Yu Zhang , Haoyi Xiong , Huazhu Fu , Yanwu Xu

Category: Computer Vision

2022-11-29

Unlike general visual classification, some classification tasks are more challenging because they require professional categorization of the images. In this paper, we call them expert-level classification. Previous work on fine-grained vision classification (FGVC) has made many efforts on some of its specific sub-tasks. However, these methods are difficult to extend to the general cases, which rely on comprehensive analysis of part-global correlation and hierarchical feature interaction. In this paper, we propose Expert Network (ExpNet) to address the unique challenges of expert-level classification through a unified network. In ExpNet, we hierarchically decouple the part and context features and process them individually using a novel attentive mechanism called Gaze-Shift. In each stage, Gaze-Shift produces a focal-part feature for the subsequent abstraction and memorizes a context-related embedding. We then fuse the final focal embedding with all memorized context-related embeddings to make the prediction. Such an architecture realizes dual-track processing of partial and global information as well as hierarchical feature interactions. We conduct experiments on three representative expert-level classification tasks: FGVC, disease classification, and artwork attribute classification. In these experiments, ExpNet shows superior performance compared with state-of-the-art methods in a wide range of fields, indicating its effectiveness and generalization. The code will be made publicly available.

Exploring dual-attention mechanism with multi-scale feature extraction scheme for skin lesion segmentation

G Jignesh Chowdary , G V S N Durga Yathisha , Suganya G , Premalatha M

Category: Computer Vision

2021-11-16

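Automatic segmentation of skin lesions is a challenging task due to irregular lesion boundaries, poor contrast between the lesion and the background, and the presence of artifacts. In this work, a new convolutional neural network-based approach is proposed for skin lesion segmentation. A novel multi-scale feature extraction module is proposed to extract more discriminative features for dealing with the challenges of complex skin lesions; this module is embedded in the UNet, replacing the convolutional layers of the standard architecture. In addition, two different attention mechanisms refine the features extracted by the encoder and the post-upsampled features. The method was evaluated on two publicly available datasets, ISBI2017 and ISIC2018. It reports an accuracy, recall, and JSI of 97.5%, 94.29%, and 91.16% on the ISBI2017 dataset, and 95.92%, 95.37%, and 91.52% on the ISIC2018 dataset. It outperforms existing methods and the top-ranked models in the respective competitions.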
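The three reported metrics (accuracy, recall, and the Jaccard similarity index, JSI) have standard definitions for binary masks, sketched below in plain Python. The function name `segmentation_metrics` and the convention of returning 0.0 for an empty denominator are assumptions of this sketch, not details taken from the paper.

```python
from typing import Dict, Sequence


def segmentation_metrics(pred: Sequence[int], target: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, and Jaccard index for flattened binary masks (1 = lesion)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    total = tp + fp + fn + tn
    return {
        # Fraction of pixels labelled correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of true lesion pixels that were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Intersection over union of predicted and true lesion regions.
        "jsi": tp / (tp + fp + fn) if (tp + fp + fn) else 0.0,
    }
```

Accuracy counts background pixels as correct, so on images where the lesion is small it can be high even when the overlap is poor; JSI ignores true negatives, which is why it is usually the lowest of the three.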

RadFormer: Transformers with Global-Local Attention for Interpretable and Accurate Gallbladder Cancer Detection

Soumen Basu , Mayank Gupta , Pratyaksha Rana , Pankaj Gupta , Chetan Arora

Category: Computer Vision

2022-11-09

We propose a novel deep neural network architecture to learn interpretable representations for medical image analysis. Our architecture generates global attention for the region of interest, and then learns bag-of-words-style deep feature embeddings with local attention. The global and local feature maps are combined using a contemporary transformer architecture for highly accurate Gallbladder Cancer (GBC) detection from Ultrasound (USG) images. Our experiments indicate that the detection accuracy of our model exceeds even that of human radiologists, which supports its use as a second reader for GBC diagnosis. The bag-of-words embeddings allow our model to be probed to generate interpretable explanations for GBC detection that are consistent with those reported in the medical literature. We show that the proposed model not only helps in understanding the decisions of neural network models but also aids in the discovery of new visual features relevant to the diagnosis of GBC. Source code and model will be available at https://github.com/sbasu276/RadFormer
