Smart Paper Notes

Accounting for Dependencies in Deep Learning Based Multiple Instance Learning for Whole Slide Imaging

Andriy Myronenko , Ziyue Xu , Dong Yang , Holger Roth , Daguang Xu

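Category: Computer Vision

2021-11-01

Multiple instance learning (MIL) is a key algorithm for whole slide image (WSI) classification. Histology WSIs can have billions of pixels, which creates enormous computational and annotation challenges. Typically, such an image is divided into a set of patches (a bag of instances), and only bag-level class labels are provided. Deep learning based MIL methods compute instance features using a convolutional neural network (CNN). Our proposed method is also deep learning based, with the following two contributions. First, we explicitly account for dependencies between instances during training. For example, a tumor grade may depend on the presence of several specific patterns at different locations in the WSI, which requires accounting for dependencies between patches. Second, we propose an instance-wise loss function based on instance pseudo-labels. We compare the proposed algorithm with multiple baseline methods and evaluate it on the PANDA challenge dataset, the largest publicly available WSI dataset with over 11K images, and demonstrate state-of-the-art results.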
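
As a minimal sketch of the bag-of-instances setup in this abstract, the following Python snippet tiles an image of a given size into non-overlapping patches and attaches a single bag-level label. The function name `tile_coordinates`, the 256-pixel patch size, and the toy image dimensions are illustrative assumptions, not details from the paper.

```python
from typing import List, Optional, Tuple


def tile_coordinates(width: int, height: int, patch_size: int,
                     stride: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) corner of every full patch inside the image."""
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    step = patch_size if stride is None else stride
    if step <= 0:
        raise ValueError("stride must be positive")
    coords = []
    # Only patches that fit entirely inside the image are kept.
    for y in range(0, height - patch_size + 1, step):
        for x in range(0, width - patch_size + 1, step):
            coords.append((x, y))
    return coords


# A bag is the set of patches cut from one slide; only the bag carries a label.
bag = {"label": 1, "instances": tile_coordinates(1024, 768, patch_size=256)}
print(len(bag["instances"]))  # 4 columns x 3 rows = 12 patches
```

Passing a `stride` smaller than `patch_size` yields overlapping patches; real slides would additionally need a tissue/background filter, which is omitted here.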


Towards Label-efficient Automatic Diagnosis and Analysis: A Comprehensive Survey of Advanced Deep Learning-based Weakly-supervised, Semi-supervised and Self-supervised Techniques in Histopathological Image Analysis

Linhao Qu , Siyu Liu , Xiaoyu Liu , Manning Wang , Zhijian Song

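Category: Computer Vision

2022-08-18

Histopathological images contain rich phenotypic information and pathological patterns; they are the gold standard for disease diagnosis and are essential for predicting patient prognosis and treatment outcome. In recent years, computer-automated analysis techniques for histopathological images have been urgently needed in clinical practice, and deep learning methods, represented by convolutional neural networks, have gradually become the mainstream in digital pathology. However, obtaining large amounts of fine-grained annotated data in this field is very expensive and difficult, which hinders the further development of traditional supervised algorithms that rely on large amounts of annotated data. Recent studies have started to move away from the traditional supervised paradigm. The most representative are studies of the weakly supervised learning paradigm based on weak annotation, the semi-supervised learning paradigm based on limited annotation, and the self-supervised learning paradigm based on image representation learning. These new methods have driven a new wave of annotation-efficient automatic pathological image diagnosis and analysis. Based on a survey of 130 papers, we present a comprehensive and systematic review of the latest studies on weakly supervised, semi-supervised, and self-supervised learning in computational pathology from both technical and methodological perspectives. Finally, we present the key challenges and future trends for these techniques.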


Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images

Mohammad Mahdi Behzadi , Mohammad Madani , Hanzhang Wang , Jun Bai , Ankit Bhardwaj , Anna Tarakanova , Harold Yamase , Ga Hie Nam , Sheida Nabavi

Category: Computer Vision

2022-12-25

Prostate cancer is the most common cancer in men worldwide and the second leading cause of cancer death in the United States. One of the prognostic features in prostate cancer is the Gleason grading of histopathology images. The Gleason grade is assigned by pathologists based on tumor architecture in Hematoxylin and Eosin (H&E) stained whole slide images (WSI). This process is time-consuming and has known interobserver variability. In the past few years, deep learning algorithms have been used to analyze histopathology images, delivering promising results for grading prostate cancer. However, most of these algorithms rely on fully annotated datasets, which are expensive to generate. In this work, we propose a novel weakly-supervised algorithm to classify prostate cancer grades. The proposed algorithm consists of three steps: (1) extracting discriminative areas in a histopathology image by employing a Multiple Instance Learning (MIL) algorithm based on Transformers, (2) representing the image by constructing a graph using the discriminative patches, and (3) classifying the image into its Gleason grades by developing a Graph Convolutional Neural Network (GCN) based on the gated attention mechanism. We evaluated our algorithm using publicly available datasets, including the TCGA-PRAD, PANDA, and Gleason 2019 challenge datasets. We also cross-validated the algorithm on an independent dataset. Results show that the proposed model achieved state-of-the-art performance in the Gleason grading task in terms of accuracy, F1 score, and Cohen's kappa. The code is available at https://github.com/NabaviLab/Prostate-Cancer.


DGMIL: Distribution Guided Multiple Instance Learning for Whole Slide Image Classification

Linhao Qu , Xiaoyuan Luo , Shaolei Liu , Manning Wang , Zhijian Song

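Category: Computer Vision

2022-06-17

Multiple instance learning (MIL) is widely used for analyzing histopathological whole slide images (WSIs). However, existing MIL methods do not explicitly model the data distribution; instead, they only learn a bag-level or instance-level decision boundary discriminatively by training a classifier. In this paper, we propose DGMIL: a feature distribution guided deep MIL framework for WSI classification and positive patch localization. Instead of designing complex discriminative network architectures, we reveal that the inherent feature distribution of histopathological image data can serve as a very effective guide for classification. We propose a cluster-conditioned feature distribution modeling method and a pseudo-label-based iterative feature space refinement strategy, so that positive and negative instances can be easily separated in the final feature space. Experiments on the CAMELYON16 dataset and the TCGA lung cancer dataset show that our method achieves new SOTA results for both global classification and positive patch localization.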


Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics

Chunyuan Li , Xinliang Zhu , Jiawen Yao , Junzhou Huang

Category: Computer Vision | Machine Learning

2022-11-29

Learning good representations of gigapixel whole slide pathology images (WSI) for downstream tasks is critical. Previous studies employ multiple instance learning (MIL) to represent WSIs as bags of sampled patches because, in most cases, only slide-level labels are available, and only a tiny region of the WSI is disease-positive. However, WSI representation learning remains an open problem because: (1) patch sampling at a higher resolution may be incapable of depicting microenvironment information such as the relative position between the tumor cells and surrounding tissues, while patches at lower resolution lose fine-grained detail; (2) extracting patches from a giant WSI results in a large bag size, which tremendously increases the computational cost. To solve these problems, this paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes. Specifically, we randomly extract instance-level patch features from WSIs at different magnifications. Then a co-attention mapping between imaging and genomics is learned to uncover the pairwise interaction and reduce the space complexity of imaging features. Such early fusion makes it computationally feasible to use a MIL Transformer for the survival prediction task. Our architecture requires fewer GPU resources than benchmark methods while maintaining better WSI representation ability. We evaluate our approach on five cancer types from the Cancer Genome Atlas database and achieve an average c-index of $0.673$, outperforming the state-of-the-art multimodality methods.
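
The c-index reported above measures how well predicted risks order patients by survival time. The sketch below computes Harrell's concordance index in plain Python. The function name and the toy data are assumptions for illustration; the abstract does not show the paper's own evaluation code.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before time j. It is concordant when the
    model assigned patient i the higher risk; ties in risk count as 0.5.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored (event flag 0).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risks))  # 4 of 5 comparable pairs -> 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the 0.673 average above.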


Deep Weakly-Supervised Learning Methods for Classification and Localization in Histology Images: A Survey

Jérôme Rony , Soufiane Belharbi , Jose Dolz , Ismail Ben Ayed , Luke McCaffrey , Eric Granger

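Category: Computer Vision | Machine Learning

2019-09-08

Using deep learning models to diagnose cancer from histology data presents several challenges. Cancer grading and localization of regions of interest (ROI) in these images normally rely on both image-level and pixel-level labels, the latter requiring a costly annotation process. Deep weakly-supervised object localization (WSOL) methods provide different strategies for low-cost training of deep learning models. Using only image-level annotations, these methods can be trained to classify an image and to yield class activation maps (CAMs) for ROI localization. This paper reviews state-of-the-art DL methods for WSOL. We propose a taxonomy in which these methods are divided into bottom-up and top-down methods according to the information flow in the model. Although the latter have seen limited progress, recent bottom-up methods are currently driving much of the progress in deep WSOL. Early works focused on designing different spatial pooling functions. However, these methods reached limited localization accuracy and revealed a major limitation: the under-activation of CAMs, which leads to high false negative localization. Subsequent works aimed to alleviate this issue and recover the complete object. Representative methods from our taxonomy are evaluated and compared in terms of classification and localization accuracy on two challenging histology datasets. Overall, the results indicate poor localization performance, particularly for generic methods that were initially designed to process natural images. Methods designed to address the challenges of histology data yielded good results. However, all methods suffer from high false positive/negative localization. Four key challenges are identified for applying deep WSOL methods in histology: under-activation and over-activation of CAMs, sensitivity to thresholds, and model selection.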


Transformers in Medical Image Analysis: A Review

Kelei He , Chen Gan , Zhuoyuan Li , Islem Rekik , Zihao Yin , Wen Ji , Yang Gao , Qian Wang , Junfeng Zhang , Dinggang Shen

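Category: Computer Vision

2022-02-24

Transformers have dominated the field of natural language processing and have recently impacted the computer vision area. In the field of medical image analysis, Transformers have also been successfully applied to full-stack clinical applications, including image synthesis/reconstruction, registration, segmentation, detection, and diagnosis. Our paper aims to promote awareness and application of Transformers in medical image analysis. Specifically, we first give an overview of the core concepts of the attention mechanism built into Transformers and of other basic components. Second, we review various Transformer architectures tailored for medical image applications and discuss their limitations. Within this review, we investigate key challenges around the use of Transformers in different learning paradigms, improving model efficiency, and their coupling with other techniques. We hope this review can give readers in the field of medical image analysis a comprehensive picture of Transformers.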


Revisiting Whole-Slide Image Pyramids for Cancer Prognosis via Dual-Stream Networks

Pei Liu , Bo Fu , Feng Ye , Rui Yang , Bin Xu , Luping Ji

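Category: Computer Vision | Machine Learning

2022-06-12

Cancer prognosis on gigapixel whole-slide images (WSIs) has always been a challenging task. Most existing approaches focus solely on single-resolution images. Multi-resolution schemes, which use image pyramids to enhance WSI visual representations, have not yet received enough attention. To explore a multi-resolution solution for improving cancer prognosis accuracy, this paper proposes a dual-stream architecture that models WSIs with an image pyramid strategy. The architecture consists of two sub-streams: one for low-resolution WSIs and the other for high-resolution WSIs. Compared with other approaches, our scheme has three highlights: (i) there is a one-to-one relation between streams and resolutions; (ii) a square pooling layer is added to align the patches of the two resolution streams, which greatly reduces computation cost and enables natural stream feature fusion; (iii) a cross-attention-based method is proposed to pool high-resolution patches spatially under the guidance of low-resolution ones. We validate our scheme on three publicly available datasets with a total of 3,101 WSIs from 1,911 patients. Experimental results verify that (1) the hierarchical dual-stream representation is more effective than single-stream ones for cancer prognosis, with average C-index gains of 5.0% and 1.8% over a single low-resolution and a single high-resolution stream, respectively; (2) our dual-stream scheme can outperform current state-of-the-art schemes, with an average C-index improvement of 5.1%; (3) cancers with observable survival differences may have different preferences for model complexity. Our scheme could serve as an alternative tool for further facilitating WSI prognosis research.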
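
One way to picture the alignment step in highlight (ii) is to average each s-by-s block of high-resolution patch features so that one pooled vector corresponds to one low-resolution patch. The sketch below does this on nested Python lists. It is an assumption-laden illustration: the function name `square_avg_pool`, the use of mean pooling, and the integer scale factor `s` are not taken from the paper, whose actual layer may differ.

```python
from typing import List

Grid = List[List[List[float]]]  # rows x columns x feature dimension


def square_avg_pool(grid: Grid, s: int) -> Grid:
    """Average every s x s block of feature vectors into a single vector."""
    h, w, d = len(grid), len(grid[0]), len(grid[0][0])
    if s <= 0 or h % s or w % s:
        raise ValueError("grid height and width must be positive multiples of s")
    pooled: Grid = []
    for top in range(0, h, s):
        row = []
        for left in range(0, w, s):
            acc = [0.0] * d
            # Sum the features of all high-resolution patches in this block.
            for y in range(top, top + s):
                for x in range(left, left + s):
                    for c in range(d):
                        acc[c] += grid[y][x][c]
            row.append([v / (s * s) for v in acc])
        pooled.append(row)
    return pooled


# Toy 4 x 4 grid of 1-dimensional features, pooled to 2 x 2.
high_res = [[[float(y * 4 + x)] for x in range(4)] for y in range(4)]
low_aligned = square_avg_pool(high_res, 2)
print(low_aligned)  # [[[2.5], [4.5]], [[10.5], [12.5]]]
```

After pooling, the high-resolution stream has the same grid shape as the low-resolution stream, so features at matching positions can be fused directly.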


Local Attention Graph-based Transformer for Multi-target Genetic Alteration Prediction

Daniel Reisenbüchler , Sophia J. Wagner , Melanie Boxberg , Tingying Peng

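Category: Computer Vision | Machine Learning

2022-05-13

Classical multiple instance learning (MIL) methods are often based on the independent and identically distributed (i.i.d.) assumption between instances, and hence neglect the potentially rich contextual information beyond individual entities. On the other hand, Transformers with global self-attention modules have been proposed to model the interdependencies among all instances. However, in this paper we ask: is global relation modeling with self-attention necessary, or can we appropriately restrict self-attention calculations to local regimes in large-scale whole slide images (WSIs)? We propose a general-purpose local attention graph-based Transformer for MIL (LA-MIL), which introduces an inductive bias by explicitly contextualizing instances in adaptive local regimes of arbitrary size. Additionally, an efficiently adapted loss function enables our approach to learn expressive WSI embeddings for the joint analysis of multiple biomarkers. We demonstrate that LA-MIL achieves state-of-the-art prediction results for gastrointestinal cancer, outperforming existing models on important biomarkers such as microsatellite instability for colorectal cancer. Our findings suggest that local self-attention models dependencies sufficiently well, on par with global modules. Our LA-MIL implementation is available at https://github.com/agentdr1/la_mil.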
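
To make the idea of restricting self-attention to local regimes concrete, the sketch below builds a boolean attention mask in which each patch may attend only to itself and its k spatially nearest patches. This is a simplified illustration, not the LA-MIL implementation: the function name `knn_attention_mask`, the Euclidean distance on patch coordinates, and the fixed `k` are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def knn_attention_mask(coords: Sequence[Tuple[float, float]], k: int) -> List[List[bool]]:
    """mask[i][j] is True when patch i is allowed to attend to patch j."""
    n = len(coords)
    mask = [[False] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        mask[i][i] = True  # every patch attends to itself
        # Rank all other patches by distance; ties are broken by index.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.hypot(xi - coords[j][0], yi - coords[j][1]), j),
        )
        for j in others[:k]:
            mask[i][j] = True
    return mask


# Four patches on a line; the last one is far away from the rest.
coords = [(0, 0), (1, 0), (2, 0), (10, 0)]
for row in knn_attention_mask(coords, k=1):
    print(row)
```

With n patches, a global attention matrix has n * n entries, while this mask keeps at most n * (k + 1) of them, which is the source of the savings on large slides. Note that the mask is not necessarily symmetric.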


TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

Zhuchen Shao , Hao Bian , Yang Chen , Yifeng Wang , Jian Zhang , Xiangyang Ji , Yongbing Zhang

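Category: Computer Vision

2021-06-02

Multiple instance learning (MIL) is a powerful tool for solving weakly supervised classification in whole slide image (WSI) based pathology diagnosis. However, current MIL methods are usually based on the independent and identically distributed hypothesis and thus neglect the correlation among different instances. To address this problem, we propose a new framework, called correlated MIL, and provide a proof of convergence. Based on this framework, we devise a Transformer based MIL (TransMIL), which explores both morphological and spatial information. The proposed TransMIL can effectively handle unbalanced/balanced and binary/multiple classification, with good visualization and interpretability. We conducted various experiments on three different computational pathology problems and achieved better performance and faster convergence than state-of-the-art methods. The test AUC for binary tumor classification can be up to 93.09% on the CAMELYON16 dataset. The AUC for cancer subtype classification can be up to 96.03% and 98.82% on the TCGA-NSCLC and TCGA-RCC datasets, respectively. The implementation is available at: https://github.com/szc19990412/transmil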
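
The AUC figures above summarize how well slide-level scores rank positive slides above negative ones. As a minimal sketch, the function below computes the ROC AUC directly from its pairwise definition in plain Python. The function name and toy scores are assumptions for illustration and are unrelated to the paper's evaluation code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores higher than a random negative."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Three positive and two negative slides with predicted tumor probabilities.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(round(roc_auc(labels, scores), 4))  # 5 of 6 pairs ordered correctly -> 0.8333
```

The pairwise loop is quadratic in the number of slides; it is written this way for clarity rather than speed.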


Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images

Kevin Thandiackal , Boqi Chen , Pushpak Pati , Guillaume Jaume , Drew F. K. Williamson , Maria Gabrani , Orcun Goksel

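Category: Computer Vision

2022-04-26

Multiple instance learning (MIL) methods have become increasingly popular for classifying gigapixel-sized whole-slide images (WSIs) in digital pathology. Most MIL methods operate at a single WSI magnification by processing all the tissue patches. Such a formulation induces high computational requirements and constrains the contextualization of the WSI-level representation to a single scale. A few MIL methods extend to multiple scales, but they are computationally more demanding. In this paper, inspired by the pathological diagnostic process, we propose ZoomMIL, a method that learns to perform multi-level zooming in an end-to-end manner. ZoomMIL builds WSI representations by aggregating tissue information from multiple magnifications. The proposed method outperforms state-of-the-art MIL methods in WSI classification on two large datasets, while reducing the computational demands in terms of floating-point operations (FLOPs) and processing time by up to 40x.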


UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation

Xin Yu , Qi Yang , Yinchi Zhou , Leon Y. Cai , Riqiang Gao , Ho Hin Lee , Thomas Li , Shunxing Bao , Zhoubing Xu , Thomas A. Lasko

Category: Computer Vision

2022-09-28

Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis. A Transformer reformats the image into separate patches and realizes global communication via the self-attention mechanism. However, positional information between patches is hard to preserve in such 1D sequences, and its loss can lead to sub-optimal performance when dealing with large amounts of heterogeneous tissues of various sizes in 3D medical image segmentation. Additionally, current methods are not robust and efficient for heavy-duty medical segmentation tasks such as predicting a large number of tissue classes or modeling globally inter-connected tissue structures. Inspired by the nested hierarchical structures in vision transformers, we propose a novel 3D medical image segmentation method (UNesT), employing a simplified and faster-converging transformer encoder design that achieves local communication among spatially adjacent patch sequences by aggregating them hierarchically. We extensively validate our method on multiple challenging datasets, covering 133 structures in the brain, 14 organs in the abdomen, 4 hierarchical components in the kidney, and inter-connected kidney tumors. We show that UNesT consistently achieves state-of-the-art performance and evaluate its generalizability and data efficiency. In particular, the model performs whole-brain segmentation over the complete ROI with 133 tissue classes in a single network and outperforms the prior state-of-the-art method SLANT27, an ensemble of 27 network tiles: it increases the mean DSC score on the publicly available Colin and CANDI datasets from 0.7264 to 0.7444 and from 0.6968 to 0.7025, respectively.
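
The DSC values above are Dice similarity coefficients averaged over tissue classes. The following sketch computes a per-class Dice score and its mean over a set of labels for two flattened label maps. The function names and the toy label maps are illustrative assumptions, and the convention of scoring an absent class as 1.0 is one common choice rather than something stated in the abstract.

```python
from typing import Iterable, Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice = 2 * |P intersect T| / (|P| + |T|) for one class label."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    p = sum(1 for a in pred if a == label)
    t = sum(1 for b in target if b == label)
    inter = sum(1 for a, b in zip(pred, target) if a == label and b == label)
    if p + t == 0:
        return 1.0  # assumption: a class absent from both maps counts as perfect
    return 2.0 * inter / (p + t)


def mean_dice(pred: Sequence[int], target: Sequence[int],
              labels: Iterable[int]) -> float:
    """Unweighted mean of the per-class Dice scores."""
    scores = [dice_score(pred, target, lab) for lab in labels]
    return sum(scores) / len(scores)


# Flattened toy label maps with background 0 and two tissue classes.
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 0]
print(dice_score(pred, target, 1))  # 2 * 1 / (2 + 1)
print(dice_score(pred, target, 2))  # 2 * 2 / (3 + 3)
print(mean_dice(pred, target, [1, 2]))
```

For a 3D volume, the same functions apply after flattening the voxel grid into a single sequence.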


EGFR Mutation Prediction of Lung Biopsy Images using Deep Learning

Ravi Kant Gupta , Shivani Nandgaonkar , Nikhil Cherian Kurian , Swapnil Rane , Amit Sethi

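Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-26

The standard diagnostic procedure for targeted therapies in lung cancer treatment involves histological subtyping and subsequent detection of key driver mutations, such as EGFR. Although molecular profiling can uncover the driver mutation, the process is often expensive and time-consuming. Deep learning based image analysis offers a more economical alternative for discovering driver mutations directly from whole slide images (WSIs). In this work, we use a customized deep learning pipeline with weak supervision to identify the morphological correlates of EGFR mutation in hematoxylin and eosin stained WSIs, in addition to detecting tumor and histologically subtyping it. We demonstrate the effectiveness of our pipeline by conducting rigorous experiments and ablation studies on two lung cancer datasets: TCGA and a private dataset from India. With our pipeline, we evaluate the average area under the curve (AUC) for tumor detection, and achieve an average AUC of 0.942 for histological subtyping between adenocarcinoma and squamous cell carcinoma on the TCGA dataset. For EGFR detection, we achieved an average AUC of 0.864 on the TCGA dataset and 0.783 on the dataset from India. Our key learning points include the following. First, there is no particular advantage in using feature extractor layers trained on histology if the feature extractor is going to be fine-tuned on the target dataset. Second, selecting patches with high cellularity, presumably capturing tumor regions, is not always helpful, because the signs of a disease class may be present in the tumor-adjacent stroma.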


Multiplex-detection Based Multiple Instance Learning Network for Whole Slide Image Classification

Zhikang Wang , Yue Bi , Tong Pan , Chris Bain , Richard Bassed , Seiya Imoto , Jianhua Yao , Jiangning Song

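Category: Computer Vision

2022-08-06

Multiple instance learning (MIL) is a powerful approach for classifying whole slide images (WSIs) in diagnostic pathology. A fundamental challenge of MIL for WSI classification is to discover the critical instances that trigger the bag label. However, previous methods are primarily designed under the independent and identically distributed (i.i.d.) hypothesis, ignoring the correlations between tumor instances and their heterogeneity. In this paper, we propose a novel multiplex-detection-based multiple instance learning (MDMIL) method to tackle these issues. Specifically, MDMIL is built from an internal query generation module (IQGM) and a multiplex detection module (MDM), assisted by a memory-based contrastive loss during training. First, IQGM gives the probability of instances and generates the internal query (IQ) for the subsequent MDM by aggregating highly reliable features after distribution analysis. Second, in the MDM, multiplex-detection cross-attention (MDCA) and multi-head self-attention (MHSA) cooperate to generate the final representation of the WSI. In this process, the IQ and a trainable variational query (VQ) build connections between instances and significantly improve the model's robustness to heterogeneous tumors. Finally, to further enforce constraints in the feature space and stabilize the training process, we adopt a memory-based contrastive loss, which is practicable for WSI classification even with a single sample as input in each iteration. We conduct experiments on three computational pathology datasets: CAMELYON16, TCGA-NSCLC, and TCGA-RCC. The higher accuracy and AUC demonstrate the superiority of our proposed MDMIL over other state-of-the-art methods.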


Feature Re-calibration based MIL for Whole Slide Image Classification

Philip Chikontwe , Soo Jeong Nam , Heounjeong Go , Meejeong Kim , Hyun Jung Sung , Sang Hyun Park

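Category: Computer Vision

2022-06-22

Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases; however, the curation of accurate labels is time-consuming and limits the application of fully supervised methods. To address this, multiple instance learning (MIL) is a popular approach that poses classification as a weakly supervised learning task with slide-level labels only. While current MIL methods apply variants of the attention mechanism to re-weight instance features with stronger models, scant attention is paid to the properties of the data distribution. In this work, we propose to re-calibrate the distribution of a WSI bag (instances) by using the statistics of the max-instance (critical) feature. We assume that in binary MIL, positive bags have larger feature magnitudes than negative ones, so we can push the model to maximize the discrepancy between bags with a metric feature loss that models positive bags as out-of-distribution. To achieve this, unlike existing MIL methods that use single-batch training modes, we propose balanced-batch sampling to use the feature loss effectively, i.e., with positive and negative bags simultaneously. Further, we employ a position encoding module (PEM) to model spatial/morphological information, and perform pooling by multi-head self-attention (PSMA) with a Transformer encoder. Experimental results on existing benchmark datasets show that our approach is effective and improves over state-of-the-art MIL methods.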
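
The balanced-batch sampling mentioned above can be pictured as always drawing positive and negative bags together. The sketch below yields batches of bag indices with equal numbers of each class. It is only an illustration under stated assumptions: the function name `balanced_batches`, the half-and-half split, and dropping leftover bags are choices made here, not details given in the abstract.

```python
import random
from typing import Iterator, List, Sequence


def balanced_batches(labels: Sequence[int], batch_size: int,
                     seed: int = 0) -> Iterator[List[int]]:
    """Yield index batches containing equal numbers of positive and negative bags."""
    if batch_size <= 0 or batch_size % 2:
        raise ValueError("batch_size must be a positive even number")
    rng = random.Random(seed)  # local generator keeps the sampling reproducible
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    half = batch_size // 2
    # The scarcer class limits the number of batches; leftover bags are dropped.
    n_batches = min(len(pos), len(neg)) // half
    for b in range(n_batches):
        yield pos[b * half:(b + 1) * half] + neg[b * half:(b + 1) * half]


# Eight bags, three of them positive.
bag_labels = [1, 0, 0, 1, 0, 1, 0, 0]
for batch in balanced_batches(bag_labels, batch_size=2):
    print(batch, [bag_labels[i] for i in batch])
```

Each batch then provides both sides of a loss that compares positive and negative bags, which a single-bag batch cannot do.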


HEROHE Challenge: assessing HER2 status in breast cancer without immunohistochemistry or in situ hybridization

Eduardo Conde-Sousa , João Vale , Ming Feng , Kele Xu , Yin Wang , Vincenzo Della Mea , David La Barbera , Ehsan Montahaei , Mahdieh Soleymani Baghshah , Andreas Turzynski

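Category: Computer Vision

2021-11-08

Breast cancer is the most common malignancy in women and is responsible for more than half a million deaths every year. Early and accurate diagnosis is therefore of paramount importance. Human expertise is required to diagnose and correctly classify breast cancer and to define appropriate therapy, which depends on evaluating the expression of different biomarkers such as the transmembrane protein receptor HER2. This evaluation requires several steps, including special techniques such as immunohistochemistry or in situ hybridization to assess HER2 status. With the goal of reducing the number of steps and the human bias in diagnosis, the HEROHE Challenge was organized as a parallel event of the 16th European Congress on Digital Pathology, aiming to automate the assessment of HER2 status based only on hematoxylin and eosin stained tissue samples of invasive breast cancer. Methods to assess HER2 status were presented by 21 teams worldwide, and the results achieved by some of the proposed methods open potential perspectives for advancing the state of the art.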


Embracing Annotation Efficient Learning (AEL) for Digital Pathology and Natural Images

Eu Wern Teh

Category: Computer Vision

2022-12-01

Jitendra Malik once said, "Supervision is the opium of the AI researcher". Most deep learning techniques heavily rely on extreme amounts of human labels to work effectively. In today's world, the rate of data creation greatly surpasses the rate of data annotation. Full reliance on human annotations is just a temporary means to solve current closed problems in AI. In reality, only a tiny fraction of data is annotated. Annotation Efficient Learning (AEL) is a study of algorithms to train models effectively with fewer annotations. To thrive in AEL environments, we need deep learning techniques that rely less on manual annotations (e.g., image, bounding-box, and per-pixel labels), but learn useful information from unlabeled data. In this thesis, we explore five different techniques for handling AEL.


A patch-based architecture for multi-label classification from single label annotations

Warren Jouanneau , Aurélie Bugeau , Marc Palyart , Nicolas Papadakis , Laurent Vézard

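Category: Computer Vision

2022-09-14

In this paper, we propose a patch-based architecture for multi-label classification problems where only a single positive label is observed for each image in the dataset. Our contributions are twofold. First, we introduce a lightweight patch architecture based on the attention mechanism. Next, leveraging patch embedding self-similarities, we provide a novel strategy for estimating negative examples and dealing with positive and unlabeled learning problems. Experiments demonstrate that our architecture can be trained from scratch, whereas related methods from the literature require pre-training on similar databases.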


Multi-Scale Relational Graph Convolutional Network for Multiple Instance Learning in Histopathology Images

Roozbeh Bazargani , Ladan Fazli , Larry Goldenberg , Martin Gleave , Ali Bashashati , Septimiu Salcudean

Category: Computer Vision

2022-12-17

Graph convolutional neural networks have shown significant potential in natural and histopathology images. However, their use has only been studied in a single magnification or multi-magnification with late fusion. In order to leverage the multi-magnification information and early fusion with graph convolutional networks, we handle different embedding spaces at each magnification by introducing the Multi-Scale Relational Graph Convolutional Network (MS-RGCN) as a multiple instance learning method. We model histopathology image patches and their relation with neighboring patches and patches at other scales (i.e., magnifications) as a graph. To pass the information between different magnification embedding spaces, we define separate message-passing neural networks based on the node and edge type. We experiment on prostate cancer histopathology images to predict the grade groups based on the extracted features from patches. We also compare our MS-RGCN with multiple state-of-the-art methods with evaluations on both source and held-out datasets. Our method outperforms the state-of-the-art on both datasets and especially on the classification of grade groups 2 and 3, which are significant for clinical decisions for patient management. Through an ablation study, we test and show the value of the pertinent design features of the MS-RGCN.


An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy , Lucas Beyer , Alexander Kolesnikov , Dirk Weissenborn , Xiaohua Zhai , Thomas Unterthiner , Mostafa Dehghani , Matthias Minderer , Georg Heigold , Sylvain Gelly

Category:

2020-10-22

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
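
To illustrate what "sequences of image patches" means in practice, the sketch below cuts an image into non-overlapping square patches and flattens each one into a vector, which is the form a Transformer would then embed. The function name, the nested-list image representation, and the 224 x 224 x 3 input size are assumptions chosen for illustration; only the 16 x 16 patch size comes from the title.

```python
from typing import List

Image = List[List[List[float]]]  # height x width x channels


def image_to_patch_sequence(image: Image, patch: int) -> List[List[float]]:
    """Split an image into non-overlapping patches and flatten each patch."""
    h, w = len(image), len(image[0])
    if patch <= 0 or h % patch or w % patch:
        raise ValueError("image height and width must be positive multiples of patch")
    sequence = []
    # Patches are visited row by row, left to right.
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            flat: List[float] = []
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    flat.extend(image[y][x])  # append all channel values
            sequence.append(flat)
    return sequence


# A blank 224 x 224 RGB image (an assumed, commonly used input size).
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
tokens = image_to_patch_sequence(image, patch=16)
print(len(tokens))     # (224 / 16) ** 2 = 196 patches
print(len(tokens[0]))  # 16 * 16 * 3 = 768 values per patch
```

In a full model, each flattened patch would be passed through a learned linear projection and combined with a position embedding before entering the Transformer; those steps are omitted here.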
