Semantic Change Detection (SCD) extends the Multi-Class Change Detection (MCD) task by providing not only the locations of changes but also the detailed land cover/land use (LCLU) categories before and after the observation interval. This fine-grained semantic change information is useful in many applications. Recent studies indicate that SCD can be modeled through a triple-branch Convolutional Neural Network (CNN), which contains two temporal branches and a change branch. However, in this architecture the communication between the temporal branches and the change branch is insufficient. To overcome the limitations of existing methods, we propose a novel CNN architecture for SCD, where the semantic temporal features are merged in a deep CD unit. Furthermore, we elaborate on this architecture to reason about the bi-temporal semantic correlations. The resulting Bi-temporal Semantic Reasoning Network (Bi-SRNet) contains two types of semantic reasoning blocks to reason over single-temporal and cross-temporal semantic correlations, as well as a novel loss function that improves the semantic consistency of the change detection results. Experimental results on a benchmark dataset show that the proposed architecture obtains significant accuracy improvements over existing approaches, while the added designs in Bi-SRNet further improve the segmentation of both semantic categories and changed areas. The code of this paper is available at: github.com/gnsding/bi-srnet.
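As a rough illustration of the kind of semantic-consistency objective mentioned above (the actual loss in Bi-SRNet may be formulated differently), the following PyTorch sketch penalizes disagreement between bi-temporal semantic features in unchanged pixels and similarity in changed pixels; the function name, tensor shapes, and cosine-similarity formulation are assumptions for illustration only.

import torch
import torch.nn.functional as F

def semantic_consistency_loss(feat_t1, feat_t2, change_mask):
    # Hypothetical consistency term: bi-temporal features of unchanged pixels
    # should be similar, while features of changed pixels should differ.
    # feat_t1, feat_t2: (B, C, H, W) semantic features of the two dates.
    # change_mask:      (B, 1, H, W) binary ground-truth change map.
    cos = F.cosine_similarity(feat_t1, feat_t2, dim=1, eps=1e-6)   # (B, H, W)
    change = change_mask.squeeze(1).float()
    unchanged = 1.0 - change
    loss_unchanged = ((1.0 - cos) * unchanged).sum() / (unchanged.sum() + 1e-6)
    loss_changed = (cos.clamp(min=0.0) * change).sum() / (change.sum() + 1e-6)
    return loss_unchanged + loss_changed

# toy usage
f1, f2 = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
mask = (torch.rand(2, 1, 32, 32) > 0.8).float()
print(semantic_consistency_loss(f1, f2, mask))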
Semantic Change Detection (SCD) refers to the task of simultaneously extracting the changed areas and the semantic categories (before and after the changes) in Remote Sensing Images (RSIs). This is more meaningful than Binary Change Detection (BCD) since it enables detailed change analysis in the observed areas. Previous works established triple-branch Convolutional Neural Network (CNN) architectures as the paradigm for SCD. However, it remains challenging to exploit semantic information with a limited amount of change samples. In this work, we investigate how to jointly consider the spatio-temporal dependencies to improve the accuracy of SCD. First, we propose the SCanFormer (Semantic Change Transformer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIs. Then, we introduce a semantic learning scheme to leverage the spatio-temporal constraints, which are coherent with the SCD task, to guide the learning of semantic changes. The resulting network (ScanNet) significantly outperforms the baseline method in terms of both the detection of critical semantic changes and the semantic consistency of the obtained bi-temporal results. It achieves SOTA accuracy on two benchmark datasets for SCD.
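The 'from-to' semantic transitions described above can be made concrete by enumerating (class at t1, class at t2) pairs. The snippet below is a minimal, assumed illustration of how bi-temporal semantic maps yield transition labels and a derived binary change mask; it is not the actual SCanFormer implementation.

import torch

def transition_labels(sem_t1, sem_t2, num_classes):
    # Map each pixel's (from, to) class pair to a single transition index
    # in [0, num_classes**2). sem_t1, sem_t2: (B, H, W) integer class maps.
    return sem_t1 * num_classes + sem_t2

def binary_change_from_semantics(sem_t1, sem_t2):
    # A pixel counts as changed when its semantic label differs between dates.
    return (sem_t1 != sem_t2).long()

# toy usage with 6 hypothetical LCLU classes
t1 = torch.randint(0, 6, (1, 4, 4))
t2 = torch.randint(0, 6, (1, 4, 4))
print(transition_labels(t1, t2, 6))
print(binary_change_from_semantics(t1, t2))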
Building extraction in VHR RSIs remains a challenging task due to occlusion and boundary ambiguity problems. Although conventional Convolutional Neural Network (CNN) based methods are able to exploit local texture and context information, they fail to capture the shape patterns of buildings, which is a necessary constraint in human recognition. To address this issue, we propose an Adversarial Shape Learning Network (ASLNet) to model the building shape patterns and thus improve the accuracy of building segmentation. In the proposed ASLNet, we introduce an adversarial learning strategy to explicitly model the shape constraints, as well as a CNN shape regularizer to strengthen the embedding of shape features. To assess the geometric accuracy of the building segmentation results, we introduce several object-based quality assessment metrics. Experiments on two open benchmark datasets show that the proposed ASLNet improves both the pixel-based accuracy and the object-based quality measures by a large margin. The code is available at: https://github.com/gnsding/aslnet
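To make the adversarial shape-learning idea tangible, here is a minimal sketch in which an assumed PatchGAN-style discriminator judges whether a building mask looks like a real footprint; the discriminator layout and losses are illustrative assumptions and differ from the actual ASLNet design.

import torch
import torch.nn as nn

class MaskDiscriminator(nn.Module):
    # Tiny, assumed discriminator that scores building-mask patches as real/fake.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 3, padding=1),  # patch-wise real/fake logits
        )
    def forward(self, mask):
        return self.net(mask)

disc = MaskDiscriminator()
bce = nn.BCEWithLogitsLoss()
pred_mask = torch.rand(2, 1, 64, 64)                  # segmentation probabilities
gt_mask = (torch.rand(2, 1, 64, 64) > 0.5).float()    # ground-truth footprints

# discriminator step: real ground-truth shapes vs. predicted shapes
d_real, d_fake = disc(gt_mask), disc(pred_mask.detach())
d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))

# segmentation-network (generator) step: try to fool the shape discriminator
g_adv = bce(disc(pred_mask), torch.ones_like(d_fake))
print(d_loss.item(), g_adv.item())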
This paper presents DAHiTrA, a novel deep learning model with hierarchical transformers to classify building damage in the aftermath of hurricanes based on satellite images. Automated building damage assessment provides critical information for decision making and resource allocation in rapid emergency response. Satellite imagery provides real-time, high-coverage information and offers an opportunity to inform large-scale post-disaster building damage assessment. In addition, deep learning methods have shown promise in classifying building damage. In this work, a novel transformer-based network is proposed for assessing building damage. The network leverages hierarchical spatial features at multiple resolutions and captures the temporal differences in the feature domain after applying a transformer encoder to the spatial features. The network achieves state-of-the-art performance when tested on a large-scale disaster damage dataset (xBD) for building localization and damage classification, as well as on the LEVIR-CD dataset for the change detection task. In addition, we introduce a new high-resolution satellite imagery dataset, Ida-BD (related to Hurricane Ida in Louisiana in 2021), for domain adaptation, to further evaluate the capability of the model to be applied to newly damaged areas. The domain adaptation results indicate that the proposed model can be adapted to a new event with only limited fine-tuning. Hence, the proposed model advances the current state of the art through better performance and domain adaptation. Furthermore, Ida-BD provides a high-resolution annotated dataset for future studies in this field.
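The core mechanism of differencing encoded features at several resolutions can be sketched as below; the placeholder convolutional encoder stands in for the hierarchical transformer encoder, and the decoding of the differences into damage classes is omitted, so this is an assumption-laden illustration rather than the DAHiTrA code.

import torch
import torch.nn as nn

class ToyHierarchicalEncoder(nn.Module):
    # Placeholder for a hierarchical encoder returning multi-resolution features.
    def __init__(self):
        super().__init__()
        self.s1 = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.s2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.s3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
    def forward(self, x):
        f1 = self.s1(x)
        f2 = self.s2(f1)
        f3 = self.s3(f2)
        return [f1, f2, f3]

enc = ToyHierarchicalEncoder()                  # shared (Siamese) weights
pre = torch.randn(1, 3, 128, 128)
post = torch.randn(1, 3, 128, 128)

# temporal differences in the feature domain, one per pyramid level
diffs = [torch.abs(a - b) for a, b in zip(enc(pre), enc(post))]
print([d.shape for d in diffs])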
The aim of Change Detection (CD) is to detect changes by comparing two images acquired at different times. The challenging part of CD is to keep track of the changes the user wants to highlight, such as new buildings, and to ignore changes due to external factors such as the environment, illumination conditions, fog, or seasonal changes. Recent developments in the field of deep learning have enabled researchers to achieve outstanding performance in this area. In particular, different mechanisms of space-time attention make it possible to exploit the spatial features extracted by the models and to correlate them in a temporal fashion by exploiting both available images. The downside is that these models have become increasingly complex and large, often infeasible for edge applications. These are limitations when the models must be applied in industrial settings or in applications requiring real-time performance. In this work we propose a novel model, named TinyCD, that proves to be both lightweight and effective, able to achieve state-of-the-art performance with 13-150x fewer parameters. In our approach we exploit the importance of low-level feature comparison between the images. To do this, we use only a few backbone blocks. This strategy allows us to keep the number of network parameters low. To compose the features extracted from the two images, we introduce a novel mixing block, economical in terms of parameters, capable of cross-correlating features in both the space and time domains. Finally, to fully exploit the information contained in the computed features, we define a PW-MLP block able to perform pixel-wise classification. Source code, models, and results are available here: https://github.com/andreacodegoni/tiny_model_4_cd
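To make the "mixing block plus pixel-wise MLP" idea concrete, here is a heavily simplified sketch: bi-temporal feature maps are interleaved along the channel axis, mixed with a grouped convolution so each output channel sees the same channel from both dates, and then classified per pixel with 1x1 convolutions. The real TinyCD blocks (backbone choice, skip connections, upsampling) are more elaborate than this assumed toy version.

import torch
import torch.nn as nn

class ToyMixingBlock(nn.Module):
    # Interleave bi-temporal channels, then mix with a grouped convolution.
    def __init__(self, channels):
        super().__init__()
        self.mix = nn.Conv2d(2 * channels, channels, 3, padding=1, groups=channels)
    def forward(self, f1, f2):
        b, c, h, w = f1.shape
        interleaved = torch.stack([f1, f2], dim=2).reshape(b, 2 * c, h, w)
        return self.mix(interleaved)

class PixelwiseMLP(nn.Module):
    # Per-pixel classifier implemented with 1x1 convolutions.
    def __init__(self, channels):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels, 1), nn.ReLU(),
            nn.Conv2d(channels, 1, 1),
        )
    def forward(self, x):
        return self.mlp(x)   # (B, 1, H, W) change logits

f1, f2 = torch.randn(2, 16, 64, 64), torch.randn(2, 16, 64, 64)
logits = PixelwiseMLP(16)(ToyMixingBlock(16)(f1, f2))
print(logits.shape)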
Semantic segmentation is a computer vision task that assigns each pixel of an image to a class. The task of semantic segmentation should be performed with both accuracy and efficiency. Most existing deep FCNs involve heavy computation and are very power hungry, which makes them unsuitable for real-time applications on portable devices. This project analyzes current semantic segmentation models to explore the feasibility of applying them for emergency response during catastrophic events. We compare the performance of real-time semantic segmentation models with their non-real-time counterparts, constrained by aerial images under oppositional settings. Furthermore, we train several models on the Flood-Net dataset, containing UAV images captured after Hurricane Harvey, and benchmark their execution on special classes such as flooded buildings vs. non-flooded buildings or flooded roads vs. non-flooded roads. In this project, we developed a real-time UNet-based model and deployed that network on the Jetson AGX Xavier module.
Deep learning based algorithms have become widely popular in different areas of remote sensing image analysis over the past decade. Recently, transformer-based architectures, originally introduced in natural language processing, have pervaded the computer vision field, where the self-attention mechanism has been utilized as a replacement for the popular convolution operator to capture long-range dependencies. Inspired by recent advances in computer vision, the remote sensing community has also witnessed an exploration of vision transformers for a diverse set of tasks. Although a number of surveys focus on transformers in computer vision, to the best of our knowledge we are the first to present a systematic review of recent advances based on transformers in remote sensing. Our survey covers more than 60 transformer-based methods for different remote sensing problems in the sub-areas of remote sensing: very high-resolution (VHR), hyperspectral (HSI), and synthetic aperture radar (SAR) imagery. We conclude the survey by discussing different challenges and open issues of transformers in remote sensing. Additionally, we intend to frequently update and maintain the latest transformer papers in remote sensing, with their respective code, at: https://github.com/virobo-15/transformer-in-in-remote-sensing
Deep learning based change detection methods have received wide attention, thanks to their strong capability in obtaining rich features from images. However, existing AI-based CD methods largely rely on three functionality-enhancing modules, i.e., semantic enhancement, attention mechanisms, and correspondence enhancement. The stacking of these modules leads to great model complexity. To unify these three modules into a simple pipeline, we introduce the Relational Change Detection Transformer (RCDT), a novel and simple framework for remote sensing change detection tasks. The proposed RCDT consists of three major components: a weight-sharing Siamese backbone to obtain bi-temporal features, a Relational Cross Attention Module (RCAM) that implements offset cross attention to obtain bi-temporal relation-aware features, and a Features Constrain Module (FCM) to achieve the final refined predictions with high-resolution constraints. Extensive experiments on four different publicly available datasets suggest that our proposed RCDT exhibits superior change detection performance compared with other competing methods. The theoretical, methodological, and experimental knowledge of this study is expected to benefit future change detection efforts that involve the cross attention mechanism.
Fusing satellite imagery acquired with different sensors has been a long-standing challenge of Earth observation, particularly across different modalities such as optical and Synthetic Aperture Radar (SAR) images. Here, we explore the joint analysis of imagery from different sensors in the light of representation learning: we propose to learn a joint embedding of multiple satellite sensors within a deep neural network. Our application problem is the monitoring of lake ice on Alpine lakes. To reach the temporal resolution requirement of the Swiss Global Climate Observing System (GCOS) office, we combine three image sources: Sentinel-1 SAR (S1-SAR), Terra MODIS, and Suomi-NPP VIIRS. The large gaps between the optical and SAR domains and between the sensor resolutions make this a challenging instance of the sensor fusion problem. Our approach can be classified as a late fusion that is learned in a data-driven manner. The proposed network architecture has separate encoding branches for each image sensor, which feed into a single latent embedding, i.e., a common feature representation shared by all inputs, such that subsequent processing steps deliver comparable output irrespective of which sort of input image was used. By fusing satellite data, we map lake ice at a temporal resolution of < 1.5 days. The network produces spatially explicit lake ice maps with pixel-wise accuracies > 91% (respectively, mIoU scores > 60%) and generalises well across different lakes and winters. Moreover, it sets a new state-of-the-art for determining the important ice-on and ice-off dates for the target lakes, in many cases meeting the GCOS requirement.
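A minimal sketch of the described late-fusion layout, with one encoder branch per sensor feeding a shared latent space and a shared pixel-wise head; the channel counts, branch depths, and two-class head are assumptions for illustration and do not reproduce the published architecture.

import torch
import torch.nn as nn

class SensorBranch(nn.Module):
    # Encoder branch for one sensor, mapping it into a shared 64-d latent space.
    def __init__(self, in_channels):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.enc(x)

class LateFusionLakeIce(nn.Module):
    def __init__(self):
        super().__init__()
        # assumed per-sensor band counts, chosen only for the example
        self.branches = nn.ModuleDict({
            "s1_sar": SensorBranch(2),
            "modis": SensorBranch(7),
            "viirs": SensorBranch(5),
        })
        self.head = nn.Conv2d(64, 2, 1)    # shared head: ice vs. open water per pixel
    def forward(self, x, sensor):
        return self.head(self.branches[sensor](x))

model = LateFusionLakeIce()
print(model(torch.randn(1, 7, 96, 96), "modis").shape)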
Change detection (CD) of remote sensing images aims to detect changed areas by analyzing the differences between two bi-temporal images. It is widely used in land resource planning, natural hazard monitoring, and other fields. In our study, we propose a novel Siamese neural network for the change detection task, namely Dual-UNet. In contrast to previous approaches that encode the bi-temporal images separately, we design an encoder differential attention module to focus on the spatial difference relationships of pixels. To improve the generalization of the network, it computes the attention weights between any pixels of the bi-temporal images and uses them to produce more discriminative features. To improve feature fusion and avoid gradient vanishing, a multi-scale weighted variance map fusion strategy is proposed in the decoding stage. Experiments demonstrate that the proposed method consistently outperforms the state-of-the-art methods on the popular seasonal change detection dataset.
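One plausible (assumed) reading of the encoder differential attention module is sketched below: a spatial attention map computed from the absolute bi-temporal feature difference re-weights both feature maps. The actual Dual-UNet module may be formulated quite differently.

import torch
import torch.nn as nn

class DifferentialAttention(nn.Module):
    # Assumed sketch: attention from |f_t1 - f_t2| highlights pixels where
    # the bi-temporal features disagree.
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels // 2, 1, 1), nn.Sigmoid(),
        )
    def forward(self, f_t1, f_t2):
        weights = self.attn(torch.abs(f_t1 - f_t2))    # (B, 1, H, W)
        return f_t1 * weights, f_t2 * weights, weights

f1, f2 = torch.randn(2, 32, 64, 64), torch.randn(2, 32, 64, 64)
a1, a2, w = DifferentialAttention(32)(f1, f2)
print(a1.shape, w.shape)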
Human civilization has an increasingly powerful influence on the earth system. Affected by climate change and land-use change, natural disasters such as flooding have been increasing in recent years. Earth observations are an invaluable source for assessing and mitigating negative impacts. Detecting changes from Earth observation data is one way to monitor the possible impact. Effective and reliable Change Detection (CD) methods can help in identifying the risk of disaster events at an early stage. In this work, we propose a novel unsupervised CD method on time series Synthetic Aperture Radar (SAR) data. Our proposed method is a probabilistic model trained with unsupervised learning techniques, reconstruction, and contrastive learning. The change map is generated with the help of the distribution difference between pre-incident and post-incident data. Our proposed CD model is evaluated on flood detection data. We verified the efficacy of our model on 8 different flood sites, including three recent flood events from Copernicus Emergency Management Services and six from the Sen1Floods11 dataset. Our proposed model achieved an average of 64.53% Intersection over Union (IoU) value and 75.43% F1 score. Our achieved IoU score is approximately 6-27% and F1 score is approximately 7-22% better than the compared unsupervised and supervised existing CD methods. The results and extensive discussion presented in the study show the effectiveness of the proposed unsupervised CD method.
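A toy illustration of turning a pre/post "distribution difference" into a change map: if a probabilistic encoder outputs per-pixel Gaussian means and variances for each date, a symmetric KL divergence between the two Gaussians can be thresholded. The variable names, the diagonal-Gaussian assumption, and the thresholding rule are all assumptions, not the paper's exact procedure.

import torch

def gaussian_sym_kl(mu1, var1, mu2, var2):
    # Symmetric KL divergence between diagonal Gaussians, averaged over channels.
    # All inputs: (B, C, H, W); variances assumed strictly positive.
    kl12 = 0.5 * ((var1 + (mu1 - mu2) ** 2) / var2 - 1.0 + torch.log(var2 / var1))
    kl21 = 0.5 * ((var2 + (mu2 - mu1) ** 2) / var1 - 1.0 + torch.log(var1 / var2))
    return (kl12 + kl21).mean(dim=1)    # (B, H, W)

mu_pre, mu_post = torch.randn(1, 8, 64, 64), torch.randn(1, 8, 64, 64)
var_pre = torch.rand(1, 8, 64, 64) + 0.1
var_post = torch.rand(1, 8, 64, 64) + 0.1

divergence = gaussian_sym_kl(mu_pre, var_pre, mu_post, var_post)
change_map = (divergence > divergence.mean() + divergence.std()).long()
print(change_map.float().mean())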
Building detection and change detection using remote sensing images can help urban and rescue planning. Moreover, they can be used for building damage assessment after natural disasters. Currently, most existing models for building detection use only one image (the pre-disaster image) to detect buildings. This is based on the idea that post-disaster images reduce the model's performance because of the presence of destroyed buildings. In this paper, we propose a Siamese model, called SiamixFormer, which uses pre- and post-disaster images as input. Our model has two encoders and has a hierarchical transformer architecture. The output of each stage in both encoders is given to a temporal transformer for feature fusion, in such a way that the query is generated from the pre-disaster image and the (key, value) pair is generated from the post-disaster image. To this end, temporal features are also considered in the feature fusion. Another advantage of using temporal transformers in feature fusion is that, compared with CNNs, they can better maintain the large receptive fields generated by the transformer encoders. Finally, at each stage, the output of the temporal transformer is fed into a simple MLP decoder. The SiamixFormer model is evaluated on the xBD and WHU datasets for building detection, and on the LEVIR-CD and CDD datasets for change detection, and can outperform the state of the art.
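The query-from-pre, key/value-from-post fusion described above can be sketched with standard multi-head attention; flattening the spatial maps into token sequences, the head count, and the residual/normalization placement are illustrative assumptions rather than the exact SiamixFormer block.

import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    # Assumed sketch of a temporal transformer block: queries come from the
    # pre-disaster features, keys and values from the post-disaster features.
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
    def forward(self, f_pre, f_post):
        b, c, h, w = f_pre.shape
        q = f_pre.flatten(2).transpose(1, 2)     # (B, H*W, C) tokens from t1
        kv = f_post.flatten(2).transpose(1, 2)   # (B, H*W, C) tokens from t2
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)             # residual keeps pre-image content
        return fused.transpose(1, 2).reshape(b, c, h, w)

f_pre, f_post = torch.randn(1, 64, 16, 16), torch.randn(1, 64, 16, 16)
print(TemporalFusion(64)(f_pre, f_post).shape)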
Land cover classification is a multi-class segmentation task that classifies each pixel into certain natural or man-made categories of the Earth's surface, such as water, soil, natural vegetation, crops, and human infrastructure. Limited by hardware computational resources and memory capacity, most existing studies preprocess the original remote sensing images by downsampling them or cropping them into small patches smaller than 512*512 pixels before sending them to a deep neural network. However, downsampling images incurs the loss of spatial detail, makes small segments hard to discriminate, and reverses the spatial-resolution progress obtained over decades of effort. Cropping images into small patches causes the loss of long-range context information, and restoring the predicted results to their original size brings extra latency. In response to the above weaknesses, we present an efficient, lightweight semantic segmentation network termed MKANet. Targeting the characteristics of top-view high-resolution remote sensing imagery, MKANet utilizes sharing kernels to simultaneously and equally handle ground segments of inconsistent scales, and also employs a parallel and shallow architecture to boost inference speed and friendly support image patches more than 10x larger. To enhance boundary and small-segment discrimination, we also propose a method that captures category-impurity areas, exploits boundary information, and exerts an extra penalty on boundary and small-segment misjudgments. Both the visual interpretations and the quantitative metrics of extensive experiments demonstrate that MKANet obtains state-of-the-art accuracy on two land cover classification datasets and is 2x faster than other competitive lightweight networks. All these merits highlight the potential of MKANet in practical applications.
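The boundary and small-segment penalty can be approximated as follows: Sobel filtering of the label map marks category-impurity (boundary) pixels, which then receive a larger weight in the cross-entropy loss. The kernel choice and the weighting factor are assumptions and do not reproduce MKANet's exact loss.

import torch
import torch.nn.functional as F

def boundary_weighted_ce(logits, labels, boundary_weight=3.0):
    # logits: (B, K, H, W) class scores; labels: (B, H, W) integer class map.
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    lab = labels.unsqueeze(1).float()
    gx = F.conv2d(lab, sobel_x, padding=1)
    gy = F.conv2d(lab, sobel_y, padding=1)
    boundary = ((gx.abs() + gy.abs()) > 0).squeeze(1).float()    # (B, H, W)

    per_pixel = F.cross_entropy(logits, labels, reduction="none")
    weights = 1.0 + (boundary_weight - 1.0) * boundary           # extra penalty on boundaries
    return (per_pixel * weights).mean()

logits = torch.randn(2, 6, 64, 64)
labels = torch.randint(0, 6, (2, 64, 64))
print(boundary_weighted_ce(logits, labels))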
Change detection (CD) is an essential earth observation technique. It captures the dynamic information of land objects. With the rise of deep learning, convolutional neural networks (CNN) have shown great potential in CD. However, current CNN models introduce backbone architectures that lose detailed information during learning. Moreover, current CNN models are heavy in parameters, which prevents their deployment on edge devices such as UAVs. In this work, we tackle this issue by proposing RDP-Net: a region detail preserving network for CD. We propose an efficient training strategy that constructs the training tasks during the warmup period of CNN training and lets the CNN learn from easy to hard. The training strategy enables CNN to learn more powerful features with fewer FLOPs and achieve better performance. Next, we propose an effective edge loss that increases the penalty for errors on details and improves the network's attention to details such as boundary regions and small areas. Furthermore, we provide a CNN model with a brand new backbone that achieves the state-of-the-art empirical performance in CD with only 1.70M parameters. We hope our RDP-Net would benefit the practical CD applications on compact devices and could inspire more people to bring change detection to a new level with the efficient training strategy. The code and models are publicly available at https://github.com/Chnja/RDPNet.
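An assumed sketch of the easy-to-hard warmup idea: samples are ordered by a simple difficulty proxy (here, the fraction of changed pixels, chosen only for illustration) so the network sees easier pairs first during warmup. RDP-Net's actual task-construction strategy and its edge loss are more involved than this.

import torch

def curriculum_order(change_masks):
    # Assumed difficulty proxy: pairs with fewer changed pixels count as easier
    # and are presented first during the warmup period.
    # change_masks: (N, H, W) binary ground-truth change maps.
    changed_fraction = change_masks.float().mean(dim=(1, 2))   # (N,)
    return torch.argsort(changed_fraction)                     # easy-to-hard indices

masks = (torch.rand(8, 64, 64) > 0.9).long()
print(curriculum_order(masks))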
Existing deep learning-based change detection methods try to elaborately design complicated neural networks with powerful feature representations, but ignore the universal domain shift induced by time-varying land cover changes, including luminance fluctuations and seasonal changes between pre-event and post-event images, thereby producing sub-optimal results. In this paper, we propose an end-to-end supervised domain adaptation framework for cross-domain change detection, namely SDACD, to effectively alleviate the domain shift between bi-temporal images for better change prediction. Specifically, our SDACD presents collaborative adaptations from both the image and feature perspectives with supervised learning. Image adaptation exploits generative adversarial learning with cycle-consistency constraints to perform cross-domain style transformation, effectively narrowing the domain gap in a two-sided fashion. As for feature adaptation, we extract domain-invariant features to align the different feature distributions in the feature space, which can further reduce the domain gap of cross-domain images. To further improve the performance, we combine three types of bi-temporal images for the final change prediction, including the initial input bi-temporal images and two generated bi-temporal images from the pre-event and post-event domains. Extensive experiments and analyses on two benchmarks demonstrate the effectiveness and universality of our proposed framework. Notably, our framework pushes several representative baseline models up to new state-of-the-art records, achieving 97.34% and 92.36% on the CDD and WHU building datasets, respectively. The source code and models are publicly available at https://github.com/perfect-you/sdacd.
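To illustrate the image-adaptation side, the snippet below shows a generic cycle-consistency constraint between two hypothetical generators that translate image style between the pre-event and post-event domains; it captures only that constraint, not SDACD's full collaborative adaptation pipeline or its change detector.

import torch
import torch.nn as nn

# hypothetical style-translation generators between the two temporal domains
g_pre2post = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
g_post2pre = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
l1 = nn.L1Loss()

x_pre, x_post = torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128)

# cycle-consistency: translating to the other domain and back should recover the input
cycle_loss = l1(g_post2pre(g_pre2post(x_pre)), x_pre) + l1(g_pre2post(g_post2pre(x_post)), x_post)
print(cycle_loss.item())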
Change detection (CD) aims to find the difference between two images taken at different times and outputs a change map to represent whether each region has changed or not. To achieve a better result in generating the change map, many State-of-The-Art (SoTA) methods design a deep learning model that has a powerful discriminative ability. However, these methods still get lower performance because they ignore spatial information and scaling changes between objects, giving rise to blurry or wrong boundaries. In addition, they also neglect the interactive information between the two different images. To alleviate these problems, we propose our network, the Scale and Relation-Aware Siamese Network (SARAS-Net), to deal with this issue. In this paper, three modules are proposed, including relation-aware, scale-aware, and cross-transformer modules, to tackle the problem of scene change detection more effectively. To verify our model, we tested it on three public datasets, including LEVIR-CD, WHU-CD, and DSFIN, and obtained SoTA accuracy. Our code is available at https://github.com/f64051041/SARAS-Net.
Deep learning has been widely used for medical image segmentation, and a large number of papers have recorded the success of deep learning in this field. In this paper, we present a comprehensive thematic survey of medical image segmentation using deep learning techniques. This paper makes two original contributions. First, compared with traditional surveys that directly divide the deep learning literature on medical image segmentation into groups and introduce the literature of each group in detail, we classify the currently popular literature according to a multi-level structure from coarse to fine. Second, this paper focuses on supervised and weakly supervised learning approaches, without including unsupervised approaches, since they have been introduced in many older surveys and are currently not popular. For supervised learning approaches, we analyze the literature in three aspects: the selection of backbone networks, the design of network blocks, and the improvement of loss functions. For weakly supervised learning approaches, we investigate the literature according to data augmentation, transfer learning, and interactive segmentation. Compared with existing surveys, this survey classifies the literature along a different dimension, which is more convenient for readers to understand the relevant rationale and will guide them in thinking of appropriate improvements for medical image segmentation based on deep learning approaches.
Building change detection is important for many applications, especially in the military and crisis-management domains. Recent methods for change detection have turned towards deep learning, which depends on the quality of its training data. The assembly of large-scale annotated satellite image datasets is therefore essential for global building change surveillance. Existing datasets almost exclusively offer near-nadir viewing angles. This limits the range of changes that can be detected. By providing a wider observation range, the rolling imaging mode of optical satellites presents an opportunity to overcome this limitation. This paper therefore introduces S2Looking, a building change detection dataset that contains large-scale side-looking satellite images captured at various off-nadir angles. The dataset consists of 5000 bi-temporal image pairs of rural areas throughout the world and more than 65,920 annotated change instances. The dataset can be used to train deep learning-based change detection algorithms. It expands on existing datasets by providing (1) larger viewing angles; (2) large illumination variances; and (3) the added complexity of rural imagery. To facilitate the use of the dataset, a benchmark task has been established, and preliminary tests suggest that deep learning algorithms find the dataset significantly more challenging than the closest near-nadir dataset, LEVIR-CD+. S2Looking may therefore promote important advances in existing building change detection algorithms. The dataset is available at https://github.com/s2looking/.
Change detection (CD) aims to detect change regions within an image pair captured at different times, playing a significant role in diverse real-world applications. Nevertheless, most of the existing works focus on designing advanced network architectures to map the feature difference to the final change map while ignoring the influence of the quality of the feature difference. In this paper, we study the CD from a different perspective, i.e., how to optimize the feature difference to highlight changes and suppress unchanged regions, and propose a novel module denoted as iterative difference-enhanced transformers (IDET). IDET contains three transformers: two transformers for extracting the long-range information of the two images and one transformer for enhancing the feature difference. In contrast to the previous transformers, the third transformer takes the outputs of the first two transformers to guide the enhancement of the feature difference iteratively. To achieve more effective refinement, we further propose the multi-scale IDET-based change detection that uses multi-scale representations of the images for multiple feature difference refinements and proposes a coarse-to-fine fusion strategy to combine all refinements. Our final CD method outperforms seven state-of-the-art methods on six large-scale datasets under diverse application scenarios, which demonstrates the importance of feature difference enhancements and the effectiveness of IDET.
Image segmentation is a key topic in image processing and computer vision with applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among many others. Various algorithms for image segmentation have been developed in the literature. Recently, due to the success of deep learning models in a wide range of vision applications, there has been a substantial amount of works aimed at developing image segmentation approaches using deep learning models. In this survey, we provide a comprehensive review of the literature at the time of this writing, covering a broad spectrum of pioneering works for semantic and instance-level segmentation, including fully convolutional pixel-labeling networks, encoder-decoder architectures, multi-scale and pyramid based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the similarity, strengths and challenges of these deep learning models, examine the most widely used datasets, report performances, and discuss promising future research directions in this area.