Smart Paper Notes

Detecting Severity of Diabetic Retinopathy from Fundus Images using Ensembled Transformers

Chandranath Adak, Tejas Karkera, Soumi Chattopadhyay, Muhammad Saqib

Categories: Computer Vision | Artificial Intelligence

2023-01-03

Diabetic Retinopathy (DR) is considered one of the primary concerns due to its effect on vision loss among most people with diabetes globally. The severity of DR is mostly comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images to understand DR severity better. We work with ensembling image transformers, where we adopt four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.
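The ensembling step described in the abstract can be illustrated with a minimal sketch. It assumes plain unweighted averaging of per-model class probabilities, which may differ from the combination rule the authors actually use. The logits are hypothetical placeholders, not results from the paper; in practice they would come from fine-tuned ViT, BEiT, CaiT, and DeiT classifiers. The five severity labels follow the APTOS-2019 grading scale (0 to 4).

```python
import math

# APTOS-2019 grades DR severity on a five-point scale (0 to 4).
SEVERITY_LABELS = ["No DR", "Mild", "Moderate", "Severe", "Proliferative DR"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(logits_per_model):
    """Average per-model class probabilities and pick the top class.

    Assumption: unweighted averaging; the paper may combine models differently.
    """
    probs = [softmax(logits) for logits in logits_per_model.values()]
    n_models = len(probs)
    n_classes = len(SEVERITY_LABELS)
    mean_probs = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean_probs[c])
    return best, mean_probs


# Hypothetical logits for a single fundus image (illustrative values only).
example_logits = {
    "ViT":  [0.1, 0.3, 2.0, 0.5, 0.2],
    "BEiT": [0.2, 0.1, 1.5, 1.8, 0.3],
    "CaiT": [0.0, 0.4, 2.2, 0.9, 0.1],
    "DeiT": [0.3, 0.2, 1.9, 0.7, 0.4],
}

grade, mean_probs = ensemble_predict(example_logits)
print(SEVERITY_LABELS[grade], [round(p, 3) for p in mean_probs])
```

In this toy input, one model (BEiT) slightly prefers a different grade than the other three, and averaging lets the majority view prevail. Weighted averaging or majority voting would be equally plausible alternatives.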

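Deep learning-based visual inspection has achieved considerable success with supervised learning methods. In real industrial scenarios, however, the scarcity of defect samples, the cost of annotation, and the lack of prior knowledge about defects can render supervised methods ineffective. In recent years, unsupervised anomaly localization algorithms have been widely used in industrial inspection tasks. This paper aims to help researchers in the field by surveying recent achievements in deep learning-based unsupervised anomaly localization in industrial images. The survey reviews more than 120 significant publications covering various aspects of anomaly localization, chiefly the concepts, challenges, taxonomies, benchmark datasets, and quantitative performance comparisons of the methods reviewed. In reviewing the achievements to date, the paper provides detailed predictions and analysis of several future research directions. The review offers detailed technical information for researchers who are interested in industrial anomaly localization and who wish to apply it to anomaly localization in other fields.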

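With the proliferation of the e-commerce industry, analyzing customer feedback has become indispensable to service providers. Recently, customers have begun uploading images of purchased products along with their review scores. In this paper, we undertake the task of analyzing such visual reviews, which is a very new problem. Past research focused on analyzing language feedback, but here we take no assistance from linguistic reviews, which may be absent: a recent trend shows that customers prefer to quickly upload visual feedback rather than type language feedback. We propose a hierarchical architecture in which the higher-level model performs product categorization and the lower-level model predicts the review score from a customer-provided product image. We generated a database by procuring real visual product reviews, which was quite challenging. Our architecture obtained some promising results in extensive experiments on this database. The proposed hierarchical architecture attained a 57.48% performance improvement over the best comparable single-level architecture.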
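The two-level routing described above can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the category names, the stub models, and the 1-to-5 score range are all assumptions, and an image is represented as a plain list of floats so that the example stays self-contained.

```python
from typing import Callable, Dict, List, Tuple

Image = List[float]  # placeholder: a real system would pass pixel arrays


def make_hierarchical_predictor(
    categorize: Callable[[Image], str],
    scorers: Dict[str, Callable[[Image], int]],
) -> Callable[[Image], Tuple[str, int]]:
    """Build a predictor that routes an image through two levels.

    Level 1 picks the product category; level 2 applies the score
    predictor dedicated to that category.
    """
    def predict(image: Image) -> Tuple[str, int]:
        category = categorize(image)      # higher-level model
        score = scorers[category](image)  # lower-level, per-category model
        return category, score

    return predict


# Stub models standing in for trained networks (assumptions for illustration).
def stub_categorizer(image: Image) -> str:
    return "footwear" if sum(image) / len(image) > 0.5 else "apparel"


def stub_footwear_scorer(image: Image) -> int:
    return 4  # assumed review score on a 1-to-5 scale


def stub_apparel_scorer(image: Image) -> int:
    return 2  # assumed review score on a 1-to-5 scale


predict = make_hierarchical_predictor(
    stub_categorizer,
    {"footwear": stub_footwear_scorer, "apparel": stub_apparel_scorer},
)

print(predict([0.9, 0.8, 0.7]))
print(predict([0.1, 0.2, 0.3]))
```

Keeping one score predictor per category lets each lower-level model specialize in the visual cues of its own product type, which is one plausible reading of why a hierarchy could outperform a single-level model.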

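A timeline offers one of the most effective ways to visualize the important historical facts that occurred over a period of time, presenting insights that may not be apparent from reading the equivalent information in textual form. By leveraging generative adversarial learning for important-sentence classification, and by incorporating knowledge-based tags to improve the performance of event coreference resolution, we introduce a two-stage system for generating event timelines from multiple (historical) text documents. We demonstrate our results on two manually annotated historical text documents. Our results can be very helpful to historians, both in advancing historical research and in understanding the socio-political landscape of a country.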
