Face super-resolution is a domain-specific image super-resolution, which aims to generate High-Resolution (HR) face images from their Low-Resolution (LR) counterparts. In this paper, we propose a novel face super-resolution method, namely Semantic Encoder guided Generative Adversarial Face Ultra-Resolution Network (SEGA-FURN) to ultra-resolve an unaligned tiny LR face image to its HR counterpart with multiple ultra-upscaling factors (e.g., 4x and 8x). The proposed network is composed of a novel semantic encoder that has the ability to capture the embedded semantics to guide adversarial learning and a novel generator that uses a hierarchical architecture named Residual in Internal Dense Block (RIDB). Moreover, we propose a joint discriminator which discriminates both image data and embedded semantics. The joint discriminator learns the joint probability distribution of the image space and latent space. We also use a Relativistic average Least Squares loss (RaLS) as the adversarial loss to alleviate the gradient vanishing problem and enhance the stability of the training procedure. Extensive experiments on large face datasets have proved that the proposed method can achieve superior super-resolution results and significantly outperform other state-of-the-art methods in both qualitative and quantitative comparisons.
translated by 谷歌翻译
Deep Convolutional Neural Networks (DCNNs) have exhibited impressive performance on image super-resolution tasks. However, these deep learning-based super-resolution methods perform poorly in real-world super-resolution tasks, where the paired high-resolution and low-resolution images are unavailable and the low-resolution images are degraded by complicated and unknown kernels. To break these limitations, we propose the Unsupervised Bi-directional Cycle Domain Transfer Learning-based Generative Adversarial Network (UBCDTL-GAN), which consists of an Unsupervised Bi-directional Cycle Domain Transfer Network (UBCDTN) and the Semantic Encoder guided Super Resolution Network (SESRN). First, the UBCDTN is able to produce an approximated real-like LR image through transferring the LR image from an artificially degraded domain to the real-world LR image domain. Second, the SESRN has the ability to super-resolve the approximated real-like LR image to a photo-realistic HR image. Extensive experiments on unpaired real-world image benchmark datasets demonstrate that the proposed method achieves superior performance compared to state-of-the-art methods.
translated by 谷歌翻译
面部超分辨率(FSR),也称为面部幻觉,其旨在增强低分辨率(LR)面部图像以产生高分辨率(HR)面部图像的分辨率,是特定于域的图像超分辨率问题。最近,FSR获得了相当大的关注,并目睹了深度学习技术的发展炫目。迄今为止,有很少有基于深入学习的FSR的研究摘要。在本次调查中,我们以系统的方式对基于深度学习的FSR方法进行了全面审查。首先,我们总结了FSR的问题制定,并引入了流行的评估度量和损失功能。其次,我们详细说明了FSR中使用的面部特征和流行数据集。第三,我们根据面部特征的利用大致分类了现有方法。在每个类别中,我们从设计原则的一般描述开始,然后概述代表方法,然后讨论其中的利弊。第四,我们评估了一些最先进的方法的表现。第五,联合FSR和其他任务以及与FSR相关的申请大致介绍。最后,我们设想了这一领域进一步的技术进步的前景。在\ URL {https://github.com/junjun-jiang/face-hallucination-benchmark}上有一个策划的文件和资源的策划文件和资源清单
translated by 谷歌翻译
Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistic priors and degradation models, which are difficult to meet the requirements of real-world applications in practice. In recent years, face restoration has witnessed great progress after stepping into the deep learning era. However, there are few works to study deep learning-based face restoration methods systematically. Thus, this paper comprehensively surveys recent advances in deep learning techniques for face restoration. Specifically, we first summarize different problem formulations and analyze the characteristic of the face image. Second, we discuss the challenges of face restoration. Concerning these challenges, we present a comprehensive review of existing FR methods, including prior based methods and deep learning-based methods. Then, we explore developed techniques in the task of FR covering network architectures, loss functions, and benchmark datasets. We also conduct a systematic benchmark evaluation on representative methods. Finally, we discuss future directions, including network designs, metrics, benchmark datasets, applications,etc. We also provide an open-source repository for all the discussed methods, which is available at https://github.com/TaoWangzj/Awesome-Face-Restoration.
translated by 谷歌翻译
面部超分辨率方法的性能依赖于它们有效地收回面部结构和突出特征的能力。尽管卷积神经网络和基于生成的对抗网络的方法在面对幻觉任务中提供令人印象深刻的性能,但使用与低分辨率图像相关的属性来提高性能的能力是不令人满意的。在本文中,我们提出了一种属性引导的注意力发生抗体网络,该受体对抗网络采用新的属性引导的注意力(AGA)模块来识别和聚焦图像中各种面部特征的生成过程。堆叠多个AGA模块可以恢复高电平的高级面部结构。我们设计鉴别者来学习利用高分辨率图像与其相应的面部属性注释之间关系的鉴别特征。然后,我们探索基于U-Net的架构来改进现有预测并综合进一步的面部细节。跨越几个指标的广泛实验表明,我们的AGA-GaN和Aga-GaN + U-Net框架优于其他几种最先进的幻觉的方法。我们还演示了我们的方法的可行性,当每个属性描述符未知并因此建立其在真实情景中的应用程序时。
translated by 谷歌翻译
现实的高光谱图像(HSI)超分辨率(SR)技术旨在从其低分辨率(LR)对应物中产生具有更高光谱和空间忠诚的高分辨率(HR)HSI。生成的对抗网络(GAN)已被证明是图像超分辨率的有效深入学习框架。然而,现有GaN的模型的优化过程经常存在模式崩溃问题,导致光谱间不变重建容量有限。这可能导致所生成的HSI上的光谱空间失真,尤其是具有大的升级因子。为了缓解模式崩溃的问题,这项工作提出了一种与潜在编码器(Le-GaN)耦合的新型GaN模型,其可以将产生的光谱空间特征从图像空间映射到潜在空间并产生耦合组件正规化生成的样本。基本上,我们将HSI视为嵌入在潜在空间中的高维歧管。因此,GaN模型的优化被转换为学习潜在空间中的高分辨率HSI样本的分布的问题,使得产生的超分辨率HSI的分布更接近其原始高分辨率对应物的那些。我们对超级分辨率的模型性能进行了实验评估及其在缓解模式崩溃中的能力。基于具有不同传感器(即Aviris和UHD-185)的两种实际HSI数据集进行了测试和验证,用于各种升高因素并增加噪声水平,并与最先进的超分辨率模型相比(即Hyconet,LTTR,Bagan,SR-GaN,Wgan)。
translated by 谷歌翻译
The Super-Resolution Generative Adversarial Network (SR-GAN) [1] is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied with unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGANnetwork architecture, adversarial loss and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN [2] to let the discriminator predict relative realness instead of the absolute value. Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN and won the first place in the PIRM2018-SR Challenge 1 [3]. The code is available at https://github.com/xinntao/ESRGAN.
translated by 谷歌翻译
Convolutional Neural Network (CNN)-based image super-resolution (SR) has exhibited impressive success on known degraded low-resolution (LR) images. However, this type of approach is hard to hold its performance in practical scenarios when the degradation process is unknown. Despite existing blind SR methods proposed to solve this problem using blur kernel estimation, the perceptual quality and reconstruction accuracy are still unsatisfactory. In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model. We propose a components decomposition and co-optimization network (CDCN) for blind SR. Firstly, CDCN decomposes the input LR image into structure and detail components in feature space. Then, the mutual collaboration block (MCB) is presented to exploit the relationship between both two components. In this way, the detail component can provide informative features to enrich the structural context and the structure component can carry structural context for better detail revealing via a mutual complementary manner. After that, we present a degradation-driven learning strategy to jointly supervise the HR image detail and structure restoration process. Finally, a multi-scale fusion module followed by an upsampling layer is designed to fuse the structure and detail features and perform SR reconstruction. Empowered by such degradation-based components decomposition, collaboration, and mutual optimization, we can bridge the correlation between component learning and degradation modelling for blind SR, thereby producing SR results with more accurate textures. Extensive experiments on both synthetic SR datasets and real-world images show that the proposed method achieves the state-of-the-art performance compared to existing methods.
translated by 谷歌翻译
盲人恢复通常会遇到各种规模的面孔输入,尤其是在现实世界中。但是,当前的大多数作品都支持特定的规模面,这限制了其在现实情况下的应用能力。在这项工作中,我们提出了一个新颖的尺度感知盲人面部修复框架,名为FaceFormer,该框架将面部特征恢复作为比例感知转换。所提出的面部特征上采样(FFUP)模块基于原始的比例比例动态生成UPSMPLING滤波器,这有助于我们的网络适应任意面部尺度。此外,我们进一步提出了面部特征嵌入(FFE)模块,该模块利用变压器来层次提取面部潜在的多样性和鲁棒性。因此,我们的脸部形式实现了富裕性和稳健性,恢复了面部的面孔,对面部成分具有现实和对称的细节。广泛的实验表明,我们提出的使用合成数据集训练的方法比当前的最新图像更好地推广到天然低质量的图像。
translated by 谷歌翻译
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image superresolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4× upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
translated by 谷歌翻译
图像超分辨率(SR)是重要的图像处理方法之一,可改善计算机视野领域的图像分辨率。在过去的二十年中,在超级分辨率领域取得了重大进展,尤其是通过使用深度学习方法。这项调查是为了在深度学习的角度进行详细的调查,对单像超分辨率的最新进展进行详细的调查,同时还将告知图像超分辨率的初始经典方法。该调查将图像SR方法分类为四个类别,即经典方法,基于学习的方法,无监督学习的方法和特定领域的SR方法。我们还介绍了SR的问题,以提供有关图像质量指标,可用参考数据集和SR挑战的直觉。使用参考数据集评估基于深度学习的方法。一些审查的最先进的图像SR方法包括增强的深SR网络(EDSR),周期循环gan(Cincgan),多尺度残留网络(MSRN),Meta残留密度网络(META-RDN) ,反复反射网络(RBPN),二阶注意网络(SAN),SR反馈网络(SRFBN)和基于小波的残留注意网络(WRAN)。最后,这项调查以研究人员将解决SR的未来方向和趋势和开放问题的未来方向和趋势。
translated by 谷歌翻译
深度神经网络极大地促进了单图超分辨率(SISR)的性能。传统方法仍然仅基于图像模态的输入来恢复单个高分辨率(HR)解决方案。但是,图像级信息不足以预测大型展望因素面临的足够细节和光真逼真的视觉质量(x8,x16)。在本文中,我们提出了一种新的视角,将SISR视为语义图像详细信息增强问题,以产生忠于地面真理的语义合理的HR图像。为了提高重建图像的语义精度和视觉质量,我们通过提出文本指导的超分辨率(TGSR)框架来探索SISR中的多模式融合学习,该框架可以从文本和图像模态中有效地利用信息。与现有方法不同,提出的TGSR可以生成通过粗到精细过程匹配文本描述的HR图像详细信息。广泛的实验和消融研究证明了TGSR的效果,该效果利用文本参考来恢复逼真的图像。
translated by 谷歌翻译
成功地应用生成的对抗性网络(GaN)以研究感知单个图像超级度(SISR)。然而,GaN经常倾向于产生具有高频率细节的图像与真实的细节不一致。灵感来自传统细节增强算法,我们提出了一种新的先前知识,先前的细节,帮助GaN减轻这个问题并恢复更现实的细节。所提出的方法名为DSRAN,包括良好设计的详细提取算法,用于捕获图像中最重要的高频信息。然后,两种鉴别器分别用于在图像域和细节域修复上进行监督。 DSRGAN通过细节增强方式将恢复的细节合并到最终输出中。 DSRGAN的特殊设计从基于模型的常规算法和数据驱动的深度学习网络中获得了优势。实验结果表明,DSRGAN在感知度量上表现出最先进的SISR方法,并同时达到保真度量的可比结果。在DSRGAN之后,将其他传统的图像处理算法结合到深度学习网络中,以形成基于模型的深SISR。
translated by 谷歌翻译
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-ofthe-art single image super-resolution approaches.
translated by 谷歌翻译
近年来,面部语义指导(包括面部地标,面部热图和面部解析图)和面部生成对抗网络(GAN)近年来已广泛用于盲面修复(BFR)。尽管现有的BFR方法在普通案例中取得了良好的性能,但这些解决方案在面对严重降解和姿势变化的图像时具有有限的弹性(例如,在现实世界情景中看起来右,左看,笑等)。在这项工作中,我们提出了一个精心设计的盲人面部修复网络,具有生成性面部先验。所提出的网络主要由非对称编解码器和stylegan2先验网络组成。在非对称编解码器中,我们采用混合的多路残留块(MMRB)来逐渐提取输入图像的弱纹理特征,从而可以更好地保留原始面部特征并避免过多的幻想。 MMRB也可以在其他网络中插入插件。此外,多亏了StyleGAN2模型的富裕和多样化的面部先验,我们采用了微调的方法来灵活地恢复自然和现实的面部细节。此外,一种新颖的自我监督训练策略是专门设计用于面部修复任务的,以使分配更接近目标并保持训练稳定性。关于合成和现实世界数据集的广泛实验表明,我们的模型在面部恢复和面部超分辨率任务方面取得了卓越的表现。
translated by 谷歌翻译
尽管基准数据集的成功,但大多数先进的面部超分辨率模型在真实情况下表现不佳,因为真实图像与合成训练对之间的显着域间隙。为了解决这个问题,我们提出了一种用于野外面部超分辨率的新型域 - 自适应降级网络。该降级网络预测流场以及中间低分辨率图像。然后,通过翘曲中间图像来生成降级的对应物。利用捕获运动模糊的偏好,这种模型在保护原始图像和劣化之间保持身份一致性更好地执行。我们进一步提出了超分辨率网络的自我调节块。该块将输入图像作为条件术语,以有效地利用面部结构信息,从而消除了对显式前沿的依赖性,例如,面部地标或边界。我们的模型在Celeba和真实世界的面部数据集上实现了最先进的性能。前者展示了我们所提出的建筑的强大生成能力,而后者展示了现实世界中的良好的身份一致性和感知品质。
translated by 谷歌翻译
具有高分辨率的视网膜光学相干断层扫描术(八八)对于视网膜脉管系统的定量和分析很重要。然而,八颗图像的分辨率与相同采样频率的视野成反比,这不利于临床医生分析较大的血管区域。在本文中,我们提出了一个新型的基于稀疏的域适应超分辨率网络(SASR),以重建现实的6x6 mm2/低分辨率/低分辨率(LR)八八粒图像,以重建高分辨率(HR)表示。更具体地说,我们首先对3x3 mm2/高分辨率(HR)图像进行简单降解,以获得合成的LR图像。然后,采用一种有效的注册方法在6x6 mm2图像中以其相应的3x3 mm2图像区域注册合成LR,以获得裁切的逼真的LR图像。然后,我们提出了一个多级超分辨率模型,用于对合成数据进行全面监督的重建,从而通过生成的对流策略指导现实的LR图像重建现实的LR图像,该策略允许合成和现实的LR图像可以在特征中统一。领域。最后,新型的稀疏边缘感知损失旨在动态优化容器边缘结构。在两个八八集中进行的广泛实验表明,我们的方法的性能优于最先进的超分辨率重建方法。此外,我们还研究了重建结果对视网膜结构分割的性能,这进一步验证了我们方法的有效性。
translated by 谷歌翻译
超级分辨率(SR)是低级视觉区域的基本和代表任务。通常认为,从SR网络中提取的特征没有特定的语义信息,并且网络只能从输入到输出中学习复杂的非线性映射。我们可以在SR网络中找到任何“语义”吗?在本文中,我们为此问题提供了肯定的答案。通过分析具有维度降低和可视化的特征表示,我们成功地发现了SR网络中的深度语义表示,\ Texit {i.},深度劣化表示(DDR),其与图像劣化类型和度数相关。我们还揭示了分类和SR网络之间的表示语义的差异。通过广泛的实验和分析,我们得出一系列观测和结论,对未来的工作具有重要意义,例如解释低级CNN网络的内在机制以及开发盲人SR的新评估方法。
translated by 谷歌翻译
现实世界图像超分辨率(SR)的关键挑战是在低分辨率(LR)图像中恢复具有复杂未知降解(例如,下采样,噪声和压缩)的缺失细节。大多数以前的作品还原图像空间中的此类缺失细节。为了应对自然图像的高度多样性,他们要么依靠难以训练和容易训练和伪影的不稳定的甘体,要么诉诸于通常不可用的高分辨率(HR)图像中的明确参考。在这项工作中,我们提出了匹配SR(FEMASR)的功能,该功能在更紧凑的特征空间中恢复了现实的HR图像。与图像空间方法不同,我们的FEMASR通过将扭曲的LR图像{\ IT特征}与我们预读的HR先验中的无失真性HR对应物匹配来恢复HR图像,并解码匹配的功能以获得现实的HR图像。具体而言,我们的人力资源先验包含一个离散的特征代码簿及其相关的解码器,它们在使用量化的生成对抗网络(VQGAN)的HR图像上预估计。值得注意的是,我们在VQGAN中结合了一种新型的语义正则化,以提高重建图像的质量。对于功能匹配,我们首先提取由LR编码器组成的LR编码器的LR功能,然后遵循简单的最近邻居策略,将其与预读的代码簿匹配。特别是,我们为LR编码器配备了与解码器的残留快捷方式连接,这对于优化功能匹配损耗至关重要,还有助于补充可能的功能匹配错误。实验结果表明,我们的方法比以前的方法产生更现实的HR图像。代码以\ url {https://github.com/chaofengc/femasr}发布。
translated by 谷歌翻译
盲目图像超分辨率(SR)是CV的长期任务,旨在恢复患有未知和复杂扭曲的低分辨率图像。最近的工作主要集中在采用更复杂的退化模型来模拟真实世界的降级。由此产生的模型在感知损失和产量感知令人信服的结果取得了突破性。然而,电流生成的对抗性网络结构所带来的限制仍然是显着的:处理像素同样地导致图像的结构特征的无知,并且导致性能缺点,例如扭曲线和背景过度锐化或模糊。在本文中,我们提出了A-ESRAN,用于盲人SR任务的GAN模型,其特色是基于U-NET的U-NET的多尺度鉴别器,可以与其他发电机无缝集成。据我们所知,这是第一项介绍U-Net结构作为GaN解决盲人问题的鉴别者的工作。本文还给出了对模型的多规模注意力突破的机制的解释。通过对现有作品的比较实验,我们的模型在非参考自然图像质量评估员度量上提出了最先进的水平性能。我们的消融研究表明,利用我们的鉴别器,基于RRDB的发电机可以利用多种尺度中图像的结构特征,因此与先前作品相比,更加感知地产生了感知的高分辨率图像。
translated by 谷歌翻译