超级分辨率(SR)是低级视觉区域的基本和代表任务。通常认为,从SR网络中提取的特征没有特定的语义信息,并且网络只能从输入到输出中学习复杂的非线性映射。我们可以在SR网络中找到任何“语义”吗?在本文中,我们为此问题提供了肯定的答案。通过分析具有维度降低和可视化的特征表示,我们成功地发现了SR网络中的深度语义表示,\ Texit {i.},深度劣化表示(DDR),其与图像劣化类型和度数相关。我们还揭示了分类和SR网络之间的表示语义的差异。通过广泛的实验和分析,我们得出一系列观测和结论,对未来的工作具有重要意义,例如解释低级CNN网络的内在机制以及开发盲人SR的新评估方法。
translated by 谷歌翻译
Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistic priors and degradation models, which are difficult to meet the requirements of real-world applications in practice. In recent years, face restoration has witnessed great progress after stepping into the deep learning era. However, there are few works to study deep learning-based face restoration methods systematically. Thus, this paper comprehensively surveys recent advances in deep learning techniques for face restoration. Specifically, we first summarize different problem formulations and analyze the characteristic of the face image. Second, we discuss the challenges of face restoration. Concerning these challenges, we present a comprehensive review of existing FR methods, including prior based methods and deep learning-based methods. Then, we explore developed techniques in the task of FR covering network architectures, loss functions, and benchmark datasets. We also conduct a systematic benchmark evaluation on representative methods. Finally, we discuss future directions, including network designs, metrics, benchmark datasets, applications,etc. We also provide an open-source repository for all the discussed methods, which is available at https://github.com/TaoWangzj/Awesome-Face-Restoration.
translated by 谷歌翻译
面部超分辨率(FSR),也称为面部幻觉,其旨在增强低分辨率(LR)面部图像以产生高分辨率(HR)面部图像的分辨率,是特定于域的图像超分辨率问题。最近,FSR获得了相当大的关注,并目睹了深度学习技术的发展炫目。迄今为止,有很少有基于深入学习的FSR的研究摘要。在本次调查中,我们以系统的方式对基于深度学习的FSR方法进行了全面审查。首先,我们总结了FSR的问题制定,并引入了流行的评估度量和损失功能。其次,我们详细说明了FSR中使用的面部特征和流行数据集。第三,我们根据面部特征的利用大致分类了现有方法。在每个类别中,我们从设计原则的一般描述开始,然后概述代表方法,然后讨论其中的利弊。第四,我们评估了一些最先进的方法的表现。第五,联合FSR和其他任务以及与FSR相关的申请大致介绍。最后,我们设想了这一领域进一步的技术进步的前景。在\ URL {https://github.com/junjun-jiang/face-hallucination-benchmark}上有一个策划的文件和资源的策划文件和资源清单
translated by 谷歌翻译
图像超分辨率(SR)是重要的图像处理方法之一,可改善计算机视野领域的图像分辨率。在过去的二十年中,在超级分辨率领域取得了重大进展,尤其是通过使用深度学习方法。这项调查是为了在深度学习的角度进行详细的调查,对单像超分辨率的最新进展进行详细的调查,同时还将告知图像超分辨率的初始经典方法。该调查将图像SR方法分类为四个类别,即经典方法,基于学习的方法,无监督学习的方法和特定领域的SR方法。我们还介绍了SR的问题,以提供有关图像质量指标,可用参考数据集和SR挑战的直觉。使用参考数据集评估基于深度学习的方法。一些审查的最先进的图像SR方法包括增强的深SR网络(EDSR),周期循环gan(Cincgan),多尺度残留网络(MSRN),Meta残留密度网络(META-RDN) ,反复反射网络(RBPN),二阶注意网络(SAN),SR反馈网络(SRFBN)和基于小波的残留注意网络(WRAN)。最后,这项调查以研究人员将解决SR的未来方向和趋势和开放问题的未来方向和趋势。
translated by 谷歌翻译
The Super-Resolution Generative Adversarial Network (SR-GAN) [1] is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied with unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGANnetwork architecture, adversarial loss and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN [2] to let the discriminator predict relative realness instead of the absolute value. Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN and won the first place in the PIRM2018-SR Challenge 1 [3]. The code is available at https://github.com/xinntao/ESRGAN.
translated by 谷歌翻译
Deep Convolutional Neural Networks (DCNNs) have exhibited impressive performance on image super-resolution tasks. However, these deep learning-based super-resolution methods perform poorly in real-world super-resolution tasks, where the paired high-resolution and low-resolution images are unavailable and the low-resolution images are degraded by complicated and unknown kernels. To break these limitations, we propose the Unsupervised Bi-directional Cycle Domain Transfer Learning-based Generative Adversarial Network (UBCDTL-GAN), which consists of an Unsupervised Bi-directional Cycle Domain Transfer Network (UBCDTN) and the Semantic Encoder guided Super Resolution Network (SESRN). First, the UBCDTN is able to produce an approximated real-like LR image through transferring the LR image from an artificially degraded domain to the real-world LR image domain. Second, the SESRN has the ability to super-resolve the approximated real-like LR image to a photo-realistic HR image. Extensive experiments on unpaired real-world image benchmark datasets demonstrate that the proposed method achieves superior performance compared to state-of-the-art methods.
translated by 谷歌翻译
Convolutional Neural Network (CNN)-based image super-resolution (SR) has exhibited impressive success on known degraded low-resolution (LR) images. However, this type of approach is hard to hold its performance in practical scenarios when the degradation process is unknown. Despite existing blind SR methods proposed to solve this problem using blur kernel estimation, the perceptual quality and reconstruction accuracy are still unsatisfactory. In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model. We propose a components decomposition and co-optimization network (CDCN) for blind SR. Firstly, CDCN decomposes the input LR image into structure and detail components in feature space. Then, the mutual collaboration block (MCB) is presented to exploit the relationship between both two components. In this way, the detail component can provide informative features to enrich the structural context and the structure component can carry structural context for better detail revealing via a mutual complementary manner. After that, we present a degradation-driven learning strategy to jointly supervise the HR image detail and structure restoration process. Finally, a multi-scale fusion module followed by an upsampling layer is designed to fuse the structure and detail features and perform SR reconstruction. Empowered by such degradation-based components decomposition, collaboration, and mutual optimization, we can bridge the correlation between component learning and degradation modelling for blind SR, thereby producing SR results with more accurate textures. Extensive experiments on both synthetic SR datasets and real-world images show that the proposed method achieves the state-of-the-art performance compared to existing methods.
translated by 谷歌翻译
Existing convolutional neural networks (CNN) based image super-resolution (SR) methods have achieved impressive performance on bicubic kernel, which is not valid to handle unknown degradations in real-world applications. Recent blind SR methods suggest to reconstruct SR images relying on blur kernel estimation. However, their results still remain visible artifacts and detail distortion due to the estimation errors. To alleviate these problems, in this paper, we propose an effective and kernel-free network, namely DSSR, which enables recurrent detail-structure alternative optimization without blur kernel prior incorporation for blind SR. Specifically, in our DSSR, a detail-structure modulation module (DSMM) is built to exploit the interaction and collaboration of image details and structures. The DSMM consists of two components: a detail restoration unit (DRU) and a structure modulation unit (SMU). The former aims at regressing the intermediate HR detail reconstruction from LR structural contexts, and the latter performs structural contexts modulation conditioned on the learned detail maps at both HR and LR spaces. Besides, we use the output of DSMM as the hidden state and design our DSSR architecture from a recurrent convolutional neural network (RCNN) view. In this way, the network can alternatively optimize the image details and structural contexts, achieving co-optimization across time. Moreover, equipped with the recurrent connection, our DSSR allows low- and high-level feature representations complementary by observing previous HR details and contexts at every unrolling time. Extensive experiments on synthetic datasets and real-world images demonstrate that our method achieves the state-of-the-art against existing methods. The source code can be found at https://github.com/Arcananana/DSSR.
translated by 谷歌翻译
Face super-resolution is a domain-specific image super-resolution, which aims to generate High-Resolution (HR) face images from their Low-Resolution (LR) counterparts. In this paper, we propose a novel face super-resolution method, namely Semantic Encoder guided Generative Adversarial Face Ultra-Resolution Network (SEGA-FURN) to ultra-resolve an unaligned tiny LR face image to its HR counterpart with multiple ultra-upscaling factors (e.g., 4x and 8x). The proposed network is composed of a novel semantic encoder that has the ability to capture the embedded semantics to guide adversarial learning and a novel generator that uses a hierarchical architecture named Residual in Internal Dense Block (RIDB). Moreover, we propose a joint discriminator which discriminates both image data and embedded semantics. The joint discriminator learns the joint probability distribution of the image space and latent space. We also use a Relativistic average Least Squares loss (RaLS) as the adversarial loss to alleviate the gradient vanishing problem and enhance the stability of the training procedure. Extensive experiments on large face datasets have proved that the proposed method can achieve superior super-resolution results and significantly outperform other state-of-the-art methods in both qualitative and quantitative comparisons.
translated by 谷歌翻译
随着深度学习(DL)的出现,超分辨率(SR)也已成为一个蓬勃发展的研究领域。然而,尽管结果有希望,但该领域仍然面临需要进一步研究的挑战,例如,允许灵活地采样,更有效的损失功能和更好的评估指标。我们根据最近的进步来回顾SR的域,并检查最新模型,例如扩散(DDPM)和基于变压器的SR模型。我们对SR中使用的当代策略进行了批判性讨论,并确定了有前途但未开发的研究方向。我们通过纳入该领域的最新发展,例如不确定性驱动的损失,小波网络,神经体系结构搜索,新颖的归一化方法和最新评估技术来补充先前的调查。我们还为整章中的模型和方法提供了几种可视化,以促进对该领域趋势的全球理解。最终,这篇综述旨在帮助研究人员推动DL应用于SR的界限。
translated by 谷歌翻译
To date, the best-performing blind super-resolution (SR) techniques follow one of two paradigms: A) generate and train a standard SR network on synthetic low-resolution - high-resolution (LR - HR) pairs or B) attempt to predict the degradations an LR image has suffered and use these to inform a customised SR network. Despite significant progress, subscribers to the former miss out on useful degradation information that could be used to improve the SR process. On the other hand, followers of the latter rely on weaker SR networks, which are significantly outperformed by the latest architectural advancements. In this work, we present a framework for combining any blind SR prediction mechanism with any deep SR network, using a metadata insertion block to insert prediction vectors into SR network feature maps. Through comprehensive testing, we prove that state-of-the-art contrastive and iterative prediction schemes can be successfully combined with high-performance SR networks such as RCAN and HAN within our framework. We show that our hybrid models consistently achieve stronger SR performance than both their non-blind and blind counterparts. Furthermore, we demonstrate our framework's robustness by predicting degradations and super-resolving images from a complex pipeline of blurring, noise and compression.
translated by 谷歌翻译
现实世界图像超分辨率(SR)的关键挑战是在低分辨率(LR)图像中恢复具有复杂未知降解(例如,下采样,噪声和压缩)的缺失细节。大多数以前的作品还原图像空间中的此类缺失细节。为了应对自然图像的高度多样性,他们要么依靠难以训练和容易训练和伪影的不稳定的甘体,要么诉诸于通常不可用的高分辨率(HR)图像中的明确参考。在这项工作中,我们提出了匹配SR(FEMASR)的功能,该功能在更紧凑的特征空间中恢复了现实的HR图像。与图像空间方法不同,我们的FEMASR通过将扭曲的LR图像{\ IT特征}与我们预读的HR先验中的无失真性HR对应物匹配来恢复HR图像,并解码匹配的功能以获得现实的HR图像。具体而言,我们的人力资源先验包含一个离散的特征代码簿及其相关的解码器,它们在使用量化的生成对抗网络(VQGAN)的HR图像上预估计。值得注意的是,我们在VQGAN中结合了一种新型的语义正则化,以提高重建图像的质量。对于功能匹配,我们首先提取由LR编码器组成的LR编码器的LR功能,然后遵循简单的最近邻居策略,将其与预读的代码簿匹配。特别是,我们为LR编码器配备了与解码器的残留快捷方式连接,这对于优化功能匹配损耗至关重要,还有助于补充可能的功能匹配错误。实验结果表明,我们的方法比以前的方法产生更现实的HR图像。代码以\ url {https://github.com/chaofengc/femasr}发布。
translated by 谷歌翻译
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image superresolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4× upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
translated by 谷歌翻译
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-ofthe-art single image super-resolution approaches.
translated by 谷歌翻译
Image Super-Resolution (SR) is essential for a wide range of computer vision and image processing tasks. Investigating infrared (IR) image (or thermal images) super-resolution is a continuing concern within the development of deep learning. This survey aims to provide a comprehensive perspective of IR image super-resolution, including its applications, hardware imaging system dilemmas, and taxonomy of image processing methodologies. In addition, the datasets and evaluation metrics in IR image super-resolution tasks are also discussed. Furthermore, the deficiencies in current technologies and possible promising directions for the community to explore are highlighted. To cope with the rapid development in this field, we intend to regularly update the relevant excellent work at \url{https://github.com/yongsongH/Infrared_Image_SR_Survey
translated by 谷歌翻译
Recent years have witnessed the unprecedented success of deep convolutional neural networks (CNNs) in single image super-resolution (SISR). However, existing CNN-based SISR methods mostly assume that a low-resolution (LR) image is bicubicly downsampled from a high-resolution (HR) image, thus inevitably giving rise to poor performance when the true degradation does not follow this assumption. Moreover, they lack scalability in learning a single model to nonblindly deal with multiple degradations. To address these issues, we propose a general framework with dimensionality stretching strategy that enables a single convolutional super-resolution network to take two key factors of the SISR degradation process, i.e., blur kernel and noise level, as input. Consequently, the super-resolver can handle multiple and even spatially variant degradations, which significantly improves the practicability. Extensive experimental results on synthetic and real LR images show that the proposed convolutional super-resolution network not only can produce favorable results on multiple degradations but also is computationally efficient, providing a highly effective and scalable solution to practical SISR applications.
translated by 谷歌翻译
近年来,压缩图像超分辨率已引起了极大的关注,其中图像被压缩伪像和低分辨率伪影降解。由于复杂的杂化扭曲变形,因此很难通过简单的超分辨率和压缩伪像消除掉的简单合作来恢复扭曲的图像。在本文中,我们向前迈出了一步,提出了层次的SWIN变压器(HST)网络,以恢复低分辨率压缩图像,该图像共同捕获分层特征表示并分别用SWIN Transformer增强每个尺度表示。此外,我们发现具有超分辨率(SR)任务的预处理对于压缩图像超分辨率至关重要。为了探索不同的SR预审查的影响,我们将常用的SR任务(例如,比科比奇和不同的实际超分辨率仿真)作为我们的预处理任务,并揭示了SR在压缩的图像超分辨率中起不可替代的作用。随着HST和预训练的合作,我们的HST在AIM 2022挑战中获得了低质量压缩图像超分辨率轨道的第五名,PSNR为23.51db。广泛的实验和消融研究已经验证了我们提出的方法的有效性。
translated by 谷歌翻译
近年来,面部语义指导(包括面部地标,面部热图和面部解析图)和面部生成对抗网络(GAN)近年来已广泛用于盲面修复(BFR)。尽管现有的BFR方法在普通案例中取得了良好的性能,但这些解决方案在面对严重降解和姿势变化的图像时具有有限的弹性(例如,在现实世界情景中看起来右,左看,笑等)。在这项工作中,我们提出了一个精心设计的盲人面部修复网络,具有生成性面部先验。所提出的网络主要由非对称编解码器和stylegan2先验网络组成。在非对称编解码器中,我们采用混合的多路残留块(MMRB)来逐渐提取输入图像的弱纹理特征,从而可以更好地保留原始面部特征并避免过多的幻想。 MMRB也可以在其他网络中插入插件。此外,多亏了StyleGAN2模型的富裕和多样化的面部先验,我们采用了微调的方法来灵活地恢复自然和现实的面部细节。此外,一种新颖的自我监督训练策略是专门设计用于面部修复任务的,以使分配更接近目标并保持训练稳定性。关于合成和现实世界数据集的广泛实验表明,我们的模型在面部恢复和面部超分辨率任务方面取得了卓越的表现。
translated by 谷歌翻译
尽管基准数据集的成功,但大多数先进的面部超分辨率模型在真实情况下表现不佳,因为真实图像与合成训练对之间的显着域间隙。为了解决这个问题,我们提出了一种用于野外面部超分辨率的新型域 - 自适应降级网络。该降级网络预测流场以及中间低分辨率图像。然后,通过翘曲中间图像来生成降级的对应物。利用捕获运动模糊的偏好,这种模型在保护原始图像和劣化之间保持身份一致性更好地执行。我们进一步提出了超分辨率网络的自我调节块。该块将输入图像作为条件术语,以有效地利用面部结构信息,从而消除了对显式前沿的依赖性,例如,面部地标或边界。我们的模型在Celeba和真实世界的面部数据集上实现了最先进的性能。前者展示了我们所提出的建筑的强大生成能力,而后者展示了现实世界中的良好的身份一致性和感知品质。
translated by 谷歌翻译
本文提出了图像恢复的新变异推理框架和一个卷积神经网络(CNN)结构,该结构可以解决所提出的框架所描述的恢复问题。较早的基于CNN的图像恢复方法主要集中在网络体系结构设计或培训策略上,具有非盲方案,其中已知或假定降解模型。为了更接近现实世界的应用程序,CNN还接受了整个数据集的盲目培训,包括各种降解。然而,给定有多样化的图像的高质量图像的条件分布太复杂了,无法通过单个CNN学习。因此,也有一些方法可以提供其他先验信息来培训CNN。与以前的方法不同,我们更多地专注于基于贝叶斯观点以及如何重新重新重构目标的恢复目标。具体而言,我们的方法放松了原始的后推理问题,以更好地管理子问题,因此表现得像分裂和互动方案。结果,与以前的框架相比,提出的框架提高了几个恢复问题的性能。具体而言,我们的方法在高斯denoising,现实世界中的降噪,盲图超级分辨率和JPEG压缩伪像减少方面提供了最先进的性能。
translated by 谷歌翻译