Although convolutional neural networks (CNNs) have recently demonstrated high-quality reconstruction for single-image super-resolution (SR), recovering natural and realistic texture remains a challenging problem. In this paper, we show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate the features of a few intermediate layers in a single network, conditioned on semantic segmentation probability maps. This is made possible through a novel Spatial Feature Transform (SFT) layer that generates affine transformation parameters for spatial-wise feature modulation. SFT layers can be trained end-to-end together with the SR network using the same loss function. During testing, the network accepts an input image of arbitrary size and generates a high-resolution image with just a single forward pass conditioned on the categorical priors. Our final results show that an SR network equipped with SFT can generate more realistic and visually pleasing textures than the state-of-the-art SRGAN [27] and EnhanceNet [38].
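To make the idea of spatial-wise affine modulation concrete, the following is a minimal sketch in plain Python rather than the authors' implementation. It assumes the modulation takes the usual scale-and-shift form `gamma * F + beta`, applied per pixel. The helper `condition_to_affine` is a hypothetical stand-in for the condition network: here it is only a per-pixel linear map from class probabilities to `gamma` and `beta`, and the names `scale_per_class` and `shift_per_class` are invented for illustration.

```python
# Minimal sketch of spatial feature modulation (pure Python, no framework).
# Assumption: the affine transform is gamma * F + beta, applied element-wise.


def sft_modulate(features, gamma, beta):
    """Apply the element-wise affine transform gamma * F + beta.

    All three arguments are 2-D lists (height x width) for a single channel.
    """
    return [
        [g * f + b for f, g, b in zip(f_row, g_row, b_row)]
        for f_row, g_row, b_row in zip(features, gamma, beta)
    ]


def condition_to_affine(prob_map, scale_per_class, shift_per_class):
    """Hypothetical condition network: a per-pixel linear map from class
    probabilities to (gamma, beta). prob_map[y][x] is the list of class
    probabilities for that pixel."""
    gamma = [[sum(p * s for p, s in zip(px, scale_per_class)) for px in row]
             for row in prob_map]
    beta = [[sum(p * s for p, s in zip(px, shift_per_class)) for px in row]
            for row in prob_map]
    return gamma, beta


features = [[1.0, 2.0], [3.0, 4.0]]
# Two classes; the left column belongs to class 0, the right column to class 1.
prob_map = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 0.0], [0.0, 1.0]]]
gamma, beta = condition_to_affine(prob_map,
                                  scale_per_class=[1.0, 2.0],
                                  shift_per_class=[0.0, 0.5])
modulated = sft_modulate(features, gamma, beta)
print(modulated)
```

Because `gamma` and `beta` have the same spatial size as the feature map, pixels belonging to different semantic classes are modulated differently within a single forward pass.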
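Recent studies have significantly improved the performance of single-image super-resolution (SR) using convolutional neural networks (CNNs). While there can be many high-resolution (HR) solutions for a given input, most existing CNN-based methods do not explore alternative solutions during inference. A typical approach to obtaining alternative SR results is to train multiple SR models with different loss weightings and exploit a combination of these models. By leveraging multi-task learning, we propose a more efficient method that trains a single adjustable SR model. Specifically, we optimize an SR model with a conditional objective during training, where the objective is a weighted sum of multiple perceptual losses at different feature levels. The weights vary according to the given conditions, and the set of weights is defined as a style controller. In addition, we present an architecture suited to this training scheme: the Residual-in-Residual Dense Block equipped with spatial feature transformation layers. At the inference stage, our trained model can generate locally different outputs conditioned on the style control map. Extensive experiments show that the proposed SR model produces various desirable reconstructions without artifacts and achieves quantitative performance comparable to state-of-the-art SR methods.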
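The conditional objective in this abstract, a weighted sum of perceptual losses whose weights act as a style controller, can be sketched as follows. This is an illustration only: the loss values are placeholder numbers rather than outputs of a real network, and the function name `conditional_objective` is invented here.

```python
def conditional_objective(perceptual_losses, style_weights):
    """Weighted sum of perceptual losses measured at different feature levels.

    perceptual_losses: one scalar loss per feature level.
    style_weights: the "style controller", one weight per feature level.
    """
    if len(perceptual_losses) != len(style_weights):
        raise ValueError("need exactly one weight per perceptual loss")
    return sum(w * l for w, l in zip(style_weights, perceptual_losses))


# Placeholder loss values for three feature levels (shallow to deep).
losses = [4.0, 2.0, 1.0]

# Two different conditions give two different training objectives
# for the same model.
shallow_style = conditional_objective(losses, [1.0, 0.0, 0.0])
mixed_style = conditional_objective(losses, [0.5, 0.25, 0.25])
print(shallow_style, mixed_style)
```

During training the weights would be sampled per condition, so one model sees many objectives; at inference the same weights can be supplied as a spatial map to vary the output locally.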
The Super-Resolution Generative Adversarial Network (SRGAN) [1] is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied by unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGAN: network architecture, adversarial loss, and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN [2] to let the discriminator predict relative realness instead of the absolute value. Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN and won first place in the PIRM2018-SR Challenge 1 [3]. The code is available at https://github.com/xinntao/ESRGAN.
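The idea of predicting relative rather than absolute realness can be illustrated with a short sketch. It assumes the relativistic average formulation, in which each real logit is compared against the mean fake logit and vice versa; the abstract does not state the exact form, so this choice is an assumption. The logits below are made-up numbers rather than discriminator outputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relativistic_average(logits_a, logits_b):
    """Probability that each sample in A is more realistic than the
    average sample in B (assumed relativistic average form)."""
    mean_b = sum(logits_b) / len(logits_b)
    return [sigmoid(a - mean_b) for a in logits_a]


def discriminator_loss(real_logits, fake_logits):
    """Binary cross-entropy on relative realness: real images should look
    more realistic than the average fake, and fakes less than the average real."""
    d_real = relativistic_average(real_logits, fake_logits)
    d_fake = relativistic_average(fake_logits, real_logits)
    loss_real = -sum(math.log(p) for p in d_real) / len(d_real)
    loss_fake = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return loss_real + loss_fake


# A discriminator that cannot tell real from fake.
undecided = discriminator_loss([0.0, 0.0], [0.0, 0.0])
# A discriminator that ranks real images above fake ones.
confident = discriminator_loss([2.0, 2.0], [-2.0, -2.0])
print(undecided, confident)
```

Only the difference between real and fake logits matters here, so adding the same constant to every logit leaves the loss unchanged, which is the sense in which the realness is relative.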
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4× upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR), which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack high-frequency textures and do not look natural despite yielding high PSNR values. We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.
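For reference, PSNR is a direct function of the mean squared error, which is why minimizing pixel-wise error maximizes it. The sketch below uses the standard definition, 10 · log10(MAX² / MSE), on flat lists of pixel values; the 8-bit peak value of 255 is an assumption about the image format.

```python
import math


def mean_squared_error(reference, estimate):
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for flat lists of pixel values.

    max_value is the largest possible pixel value (255 assumes 8-bit images).
    """
    err = mean_squared_error(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


reference = [100.0, 100.0, 100.0, 100.0]
estimate = [110.0, 90.0, 110.0, 90.0]
print(psnr(reference, estimate))
```

Since the score depends only on per-pixel differences, an over-smoothed estimate that stays close to the reference on average can score well while still lacking texture.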
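A key challenge of real-world image super-resolution (SR) is to recover the missing details in low-resolution (LR) images with complex unknown degradations (e.g., downsampling, noise, and compression). Most previous works restore such missing details in the image space. To cope with the high diversity of natural images, they either rely on unstable GANs that are difficult to train and prone to artifacts, or resort to explicit references from high-resolution (HR) images that are usually unavailable. In this work, we propose Feature Matching SR (FeMaSR), which restores realistic HR images in a much more compact feature space. Unlike image-space methods, our FeMaSR restores HR images by matching distorted LR image features to their distortion-free HR counterparts in our pretrained HR priors, and decoding the matched features to obtain realistic HR images. Specifically, our HR priors contain a discrete feature codebook and its associated decoder, which are pretrained on HR images with a Vector Quantized Generative Adversarial Network (VQGAN). Notably, we incorporate a novel semantic regularization in VQGAN to improve the quality of reconstructed images. For the feature matching, we first extract LR features with an LR encoder, and then follow a simple nearest-neighbour strategy to match them with the pretrained codebook. In particular, we equip the LR encoder with residual shortcut connections to the decoder, which is critical to the optimization of the feature matching loss and also helps to compensate for possible feature matching errors. Experimental results show that our approach produces more realistic HR images than previous methods. Code is released at https://github.com/chaofengc/femasr.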
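The nearest-neighbour matching step described in this abstract can be sketched in a few lines. This is a toy illustration rather than the released code: the two-dimensional feature vectors and the three-entry codebook are made up, and squared Euclidean distance is assumed as the matching criterion.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def match_to_codebook(lr_features, codebook):
    """For each LR feature vector, return the index of the nearest codebook
    entry and the entry itself (the feature that would be passed to the decoder)."""
    indices = []
    matched = []
    for feat in lr_features:
        best = min(range(len(codebook)),
                   key=lambda k: squared_distance(feat, codebook[k]))
        indices.append(best)
        matched.append(codebook[best])
    return indices, matched


# Toy codebook standing in for the pretrained HR prior.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
# Toy "distorted" LR features, each lying near one codebook entry.
lr_features = [[0.9, 1.2], [3.5, 0.1], [0.1, -0.2]]

indices, matched = match_to_codebook(lr_features, codebook)
print(indices)
```

Because the output of matching is always a codebook entry, the decoder only ever sees features of the kind it was pretrained on, which is the motivation for restoring in feature space.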
Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals, and uses transposed convolutions for upsampling to the finer level. Our method does not require bicubic interpolation as a pre-processing step and thus dramatically reduces the computational complexity. We train the proposed LapSRN with deep supervision using a robust Charbonnier loss function and achieve high-quality reconstruction. Furthermore, our network generates multi-scale predictions in one feed-forward pass through the progressive reconstruction, thereby facilitating resource-aware applications. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of speed and accuracy.
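The Charbonnier loss mentioned above is a smooth variant of the L1 loss, commonly written as sqrt(d² + ε²) for a pixel difference d. The sketch below implements that form on flat lists of pixel values; the default ε = 1e-3 is an assumption chosen for illustration, not a value taken from the abstract.

```python
import math


def charbonnier_loss(prediction, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over all pixels.

    Behaves like |d| for large differences and stays differentiable at d = 0.
    eps is an assumed default for this sketch.
    """
    if len(prediction) != len(target):
        raise ValueError("prediction and target must have the same length")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2)
                for p, t in zip(prediction, target))
    return total / len(prediction)


def l2_loss(prediction, target):
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(prediction)


target = [0.0, 0.0, 0.0, 0.0]
with_outlier = [0.1, 0.1, 0.1, 10.0]

# The outlier dominates the squared error far more than the Charbonnier penalty.
print(charbonnier_loss(with_outlier, target), l2_loss(with_outlier, target))
```

The roughly linear growth for large errors is what makes the penalty robust to outliers compared with the squared error.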
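Joint super-resolution and inverse tone-mapping (SR-ITM) aims to improve the visual quality of videos that are deficient in both resolution and dynamic range. This problem arises when 4K high dynamic range (HDR) TVs are used to watch low-resolution standard dynamic range (LR SDR) videos. Previous methods that rely on learning local information typically do not do well at preserving color conformity and long-range structural similarity, resulting in unnatural color transitions and texture artifacts. To tackle these challenges, we propose a global priors guided modulation network (GPGMNet) for joint SR-ITM. In particular, we design a global priors extraction module (GPEM) to extract a color conformity prior and a structural similarity prior that are beneficial for the ITM and SR tasks, respectively. To further exploit the global priors and preserve spatial information, we devise multiple global priors guided spatial-wise modulation blocks (GSMBs) with only a few parameters for intermediate feature modulation, in which the modulation parameters are generated from the shared global priors and the spatial feature maps from the spatial pyramid convolution block (SPCB). With these elaborate designs, GPGMNet can achieve higher visual quality with lower computational complexity. Extensive experiments demonstrate that our proposed GPGMNet is superior to the state-of-the-art methods. Specifically, our proposed model exceeds the state of the art by 0.64 dB in PSNR, with 69% fewer parameters and a 3.1× speedup. The code will be released soon.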
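Image super-resolution (SR) is one of the important image processing methods for improving image resolution in the field of computer vision. Over the last two decades, significant progress has been made in super-resolution, especially through the use of deep learning methods. This survey provides a detailed review of recent progress in single-image super-resolution from the perspective of deep learning, while also covering the initial classical methods for image super-resolution. The survey classifies image SR methods into four categories: classical methods, learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the SR problem to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep learning-based SR approaches are evaluated using a reference dataset. The reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multi-scale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN), and the wavelet-based residual attention network (WRAN). Finally, the survey concludes with future directions, trends, and open problems in SR for researchers to address.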
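Generative adversarial networks (GANs) have been successfully applied to the study of perceptual single image super-resolution (SISR). However, GANs often tend to generate images with high-frequency details that are inconsistent with the real ones. Inspired by conventional detail enhancement algorithms, we propose a novel form of prior knowledge, the detail prior, to help the GAN alleviate this problem and restore more realistic details. The proposed method, named DSRGAN, includes a well-designed detail extraction algorithm to capture the most important high-frequency information in images. Two discriminators are then used to supervise restoration in the image domain and the detail domain, respectively. DSRGAN merges the restored detail into the final output in a detail-enhancement manner. The special design of DSRGAN draws on the advantages of both model-based conventional algorithms and data-driven deep learning networks. Experimental results demonstrate that DSRGAN outperforms state-of-the-art SISR methods on perceptual metrics while achieving comparable results on fidelity metrics. Following DSRGAN, it is feasible to incorporate other conventional image processing algorithms into deep learning networks to form model-based deep SISR.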
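With the advent of deep learning (DL), super-resolution (SR) has also become a thriving research area. However, despite promising results, the field still faces challenges that require further research, e.g., allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We present a critical discussion of contemporary strategies used in SR and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field, such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations of the models and methods throughout each chapter to facilitate a global understanding of the trends in the field. Ultimately, this review aims to help researchers push the boundaries of DL applied to SR.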
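Super-resolution is an ill-posed problem, in which a ground-truth high-resolution image represents only one possibility in the space of plausible solutions. Yet the dominant paradigm is to employ pixel-wise losses, such as L_1, which drive the prediction towards a blurry average. When combined with adversarial losses, this leads to fundamentally conflicting objectives, which degrades the final quality. We address this issue by revisiting the L_1 loss and showing that it corresponds to a one-layer conditional flow. Inspired by this relation, we explore general flows as a fidelity-based alternative to the L_1 objective. We demonstrate that the flexibility of deeper flows leads to better visual quality and consistency when combined with adversarial losses. We conduct extensive user studies for three datasets and scale factors, in which our approach is shown to outperform state-of-the-art methods for photo-realistic super-resolution. Code and trained models are available at: git.io/adflow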
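The correspondence between the L_1 loss and a one-layer conditional flow can be made explicit with a short derivation. This is a sketch under stated assumptions rather than the paper's own derivation: the flow is taken to be a single affine layer with a fixed scale b > 0 and a standard Laplace base distribution, and the symbols y (HR image with N pixels), x (LR input), and f_θ (the SR network) are notation introduced here.

```latex
% One-layer conditional flow (assumed form): shift the base sample by the network output.
y = f_\theta(x) + b\,z, \qquad z_i \sim \mathrm{Laplace}(0, 1), \qquad p(z) = \prod_{i=1}^{N} \tfrac{1}{2}\, e^{-|z_i|}

% Change of variables: z = (y - f_\theta(x)) / b, with Jacobian determinant b^{-N}.
p(y \mid x) = b^{-N} \prod_{i=1}^{N} \tfrac{1}{2} \exp\!\left( -\frac{|y_i - f_\theta(x)_i|}{b} \right)

% Negative log-likelihood: an L_1 term plus a constant that does not depend on \theta.
-\log p(y \mid x) = \frac{1}{b}\, \lVert y - f_\theta(x) \rVert_1 + N \log(2b)
```

For fixed b, minimizing this negative log-likelihood over θ is therefore the same as minimizing the L_1 loss, which is the sense in which the L_1 objective is a one-layer flow; a deeper flow replaces the single affine layer with a more expressive invertible map.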
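We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated from a low-resolution (LR) input. Recently, generative adversarial networks (GANs) have become popular for hallucinating details. Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the SISR task. In addition, GAN-generated fake details may often undermine the realism of the whole image. We address these issues by proposing best-buddy GANs (Beby-GAN) for rich-detail SISR. Relaxing the immutable one-to-one constraint, we allow the estimated patches to dynamically seek the best supervision during training, which is beneficial for producing more reasonable details. Besides, we propose a region-aware adversarial learning strategy that directs our model to adaptively focus on generating details for textured areas. Extensive experiments demonstrate the effectiveness of our method. An ultra-high-resolution 4K dataset is also constructed to facilitate future super-resolution research.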
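Despite the many successful super-resolution reconstruction (SRR) models applied to natural images, their application to remote sensing imagery tends to produce poor results. Remote sensing imagery is often more complicated than natural images and has its own peculiarities: it is of lower resolution, contains noise, and often depicts large textured surfaces. As a result, applying non-specialized SRR models to remote sensing imagery results in artifacts and poor reconstructions. To address these problems, this paper proposes an architecture inspired by previous research work, introducing a novel approach for forcing an SRR model to output realistic remote sensing images: instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized Digital Surface Model (nDSM) of the image. This strategy allows better-informed updates to be applied during training, sourced from a task (elevation map inference) that is closely related to remote sensing. Nonetheless, the nDSM auxiliary information is not required during production, so the model infers a super-resolution image without any additional data besides its low-resolution input. We assess our model on two remote sensing datasets of different spatial resolutions that also contain the DSM pairs of the images: the DFC2018 dataset and a dataset containing the national Lidar fly-by of Luxembourg. Based on visual inspection, the inferred super-resolution images exhibit particularly superior quality. In particular, the results for the high-resolution DFC2018 dataset are realistic and almost indistinguishable from the ground truth images.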
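We show that pre-trained Generative Adversarial Networks (GANs) such as StyleGAN and BigGAN can be used as a latent bank to improve the performance of image super-resolution. While most existing perceptual-oriented approaches attempt to generate realistic outputs through learning with an adversarial loss, our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging the rich and diverse priors encapsulated in a pre-trained GAN. Unlike prevalent GAN inversion methods, which require expensive image-specific optimization at runtime, our approach needs only a single forward pass for restoration. GLEAN can be easily incorporated into a simple encoder-bank-decoder architecture with multi-resolution skip connections. Employing priors from different generative models allows GLEAN to be applied to diverse categories (e.g., human faces, cats, buildings, and cars). We further present a lightweight version of GLEAN, named LightGLEAN, which retains only the critical components of GLEAN. Notably, LightGLEAN consists of only 21% of the parameters and 35% of the FLOPs while achieving comparable image quality. We extend our method to different tasks, including image colorization and blind image restoration, and extensive experiments show that our proposed models perform favorably in comparison to existing methods. Code and models are available at https://github.com/open-mmlab/mmediting.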
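Face super-resolution (FSR), also known as face hallucination, aims to enhance the resolution of low-resolution (LR) face images to generate high-resolution (HR) face images, and is a domain-specific image super-resolution problem. Recently, FSR has received considerable attention and witnessed dazzling advances with the development of deep learning techniques. To date, few summaries of the studies on deep learning-based FSR are available. In this survey, we present a comprehensive review of deep learning-based FSR methods in a systematic manner. First, we summarize the problem formulation of FSR and introduce popular assessment metrics and loss functions. Second, we elaborate on the facial characteristics and popular datasets used in FSR. Third, we roughly categorize existing methods according to their utilization of facial characteristics. In each category, we start with a general description of design principles, then present an overview of representative approaches, and then discuss their pros and cons. Fourth, we evaluate the performance of some state-of-the-art methods. Fifth, joint FSR and other tasks, as well as FSR-related applications, are briefly introduced. Finally, we envision the prospects for further technological advancement in this field. A curated list of papers and resources on face super-resolution is available at https://github.com/junjun-jiang/face-hallucination-benchmark.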
Single-image super-resolution (SISR) networks trained with perceptual and adversarial losses provide high-contrast outputs compared to those of networks trained with distortion-oriented losses, such as L1 or L2. However, it has been shown that using a single perceptual loss is insufficient for accurately restoring locally varying diverse shapes in images, often generating undesirable artifacts or unnatural details. For this reason, combinations of various losses, such as perceptual, adversarial, and distortion losses, have been attempted, yet it remains challenging to find optimal combinations. Hence, in this paper, we propose a new SISR framework that applies optimal objectives for each region to generate plausible results in overall areas of high-resolution outputs. Specifically, the framework comprises two models: a predictive model that infers an optimal objective map for a given low-resolution (LR) input and a generative model that applies a target objective map to produce the corresponding SR output. The generative model is trained over our proposed objective trajectory representing a set of essential objectives, which enables the single network to learn various SR results corresponding to combined losses on the trajectory. The predictive model is trained using pairs of LR images and corresponding optimal objective maps searched from the objective trajectory. Experimental results on five benchmarks show that the proposed method outperforms state-of-the-art perception-driven SR methods in LPIPS, DISTS, PSNR, and SSIM metrics. The visual results also demonstrate the superiority of our method in perception-oriented reconstruction. The code and models are available at https://github.com/seungho-snu/SROOE.
We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
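The difference between a per-pixel loss and a feature-based perceptual loss can be shown on a one-dimensional toy signal. In the sketch below, `toy_features` is a hypothetical stand-in for a pretrained network: it simply takes differences between neighbouring samples, so it responds to texture and ignores uniform brightness. A real perceptual loss would use activations of a pretrained CNN instead.

```python
def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_features(signal):
    """Hypothetical feature extractor: differences between neighbouring
    samples (a stand-in for features of a pretrained network)."""
    return [nxt - cur for cur, nxt in zip(signal, signal[1:])]


def per_pixel_loss(output, target):
    return mean_squared_error(output, target)


def perceptual_loss(output, target, extractor=toy_features):
    """Compare the two signals in feature space rather than sample space."""
    return mean_squared_error(extractor(output), extractor(target))


target = [0.0, 1.0, 0.0, 1.0]      # textured signal
blurry = [0.5, 0.5, 0.5, 0.5]      # over-smoothed average
shifted = [0.5, 1.5, 0.5, 1.5]     # same texture, slightly brighter

for name, out in (("blurry", blurry), ("shifted", shifted)):
    print(name, per_pixel_loss(out, target), perceptual_loss(out, target))
```

Both candidates have the same per-pixel loss, but only the blurry one is penalized in feature space, which illustrates why a perceptual loss favours outputs that keep texture.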
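Although deep learning has enabled a huge leap forward in image inpainting, current methods are often unable to synthesize realistic high-frequency details. In this paper, we propose applying super-resolution to coarsely reconstructed outputs, refining them at high resolution, and then downscaling the output to the original resolution. By introducing high-resolution images to the refinement network, our framework is able to reconstruct finer details that are usually smoothed out due to spectral bias, the tendency of neural networks to reconstruct low frequencies better than high frequencies. To assist in training the refinement network on large upscaled holes, we propose a progressive learning technique in which the size of the missing regions increases as training progresses. Our zoom-in, refine, and zoom-out strategy, combined with high-resolution supervision and progressive learning, constitutes a framework-agnostic approach for enhancing high-frequency details that can be applied to any CNN-based inpainting method. We provide qualitative and quantitative evaluations along with an ablation analysis to show the effectiveness of our approach. This seemingly simple yet powerful approach outperforms state-of-the-art inpainting methods. Our code is available at https://github.com/google/zoom-to-inpaint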
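Blind image super-resolution (SR) is a long-standing task in computer vision that aims to restore low-resolution images suffering from unknown and complex distortions. Recent work has largely focused on adopting more complicated degradation models to emulate real-world degradations. The resulting models have made breakthroughs in perceptual loss and yield perceptually convincing results. However, the limitations of current generative adversarial network structures remain significant: treating all pixels equally ignores the structural features of an image and leads to performance drawbacks such as twisted lines and over-sharpened or blurred backgrounds. In this paper, we present A-ESRGAN, a GAN model for blind SR tasks featuring a U-Net-based multi-scale discriminator that can be seamlessly integrated with other generators. To our knowledge, this is the first work to introduce the U-Net structure as the discriminator of a GAN for solving blind SR problems. The paper also gives an interpretation of the mechanism behind the model's multi-scale attention. Through comparison experiments with prior works, our model achieves state-of-the-art performance on the non-reference natural image quality evaluator metric. Our ablation studies show that, with our discriminator, the RRDB-based generator can leverage the structural features of an image at multiple scales and consequently yields more perceptually realistic high-resolution images than prior works.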