Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods.
Translated by Google Translate
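The rearrangement step of the sub-pixel convolution layer described in the first abstract can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes the preceding convolution has already produced `C * r * r` low-resolution feature maps, and it assumes a particular channel ordering (the one used by common deep learning frameworks), which the abstract does not specify. The function name `pixel_shuffle` and the nested-list layout are choices made for this example.

```python
def pixel_shuffle(features, r):
    """Rearrange (C*r*r, H, W) low-resolution maps into (C, H*r, W*r).

    `features` is a nested list indexed as [channel][row][col].
    The channel ordering used here is an assumption for illustration.
    """
    channels_in = len(features)
    if r < 1 or channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # The sub-pixel offset (y % r, x % r) selects which
                # low-resolution channel supplies this output pixel.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = features[src][y // r][x // r]
    return out


# Four 1x1 feature maps become one 2x2 output for an upscaling factor of 2.
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))
```

Because every convolution before this step runs on the small LR grid, the only operation that touches HR-sized data is this cheap index rearrangement, which is consistent with the reduced computational cost the abstract reports.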
Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image super-resolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals, and uses transposed convolutions for upsampling to the finer level. Our method does not require bicubic interpolation as a pre-processing step and thus dramatically reduces the computational complexity. We train the proposed LapSRN with deep supervision using a robust Charbonnier loss function and achieve high-quality reconstruction. Furthermore, our network generates multi-scale predictions in one feed-forward pass through the progressive reconstruction, thereby facilitating resource-aware applications. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of speed and accuracy.
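The Charbonnier loss mentioned above is a smooth variant of the L1 penalty, sqrt(d^2 + eps^2), and deep supervision applies it at every pyramid level. The sketch below is an illustration under stated assumptions rather than the paper's code: images are flat lists of pixel values, the default `eps=1e-3` is an assumed value, and the per-level losses are simply summed with equal weight.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def deep_supervision_loss(levels, eps=1e-3):
    """Sum the loss over pyramid levels.

    `levels` is a list of (prediction, target) pairs, one per level.
    Equal weighting of levels is an assumption for this sketch.
    """
    return sum(charbonnier_loss(p, t, eps) for p, t in levels)


# Two hypothetical pyramid levels with tiny one-pixel "images".
print(deep_supervision_loss([([3.0], [0.0]), ([0.0], [3.0])], eps=4.0))
```

For large errors the penalty grows linearly like L1, which makes it less sensitive to outliers than a squared error, while the `eps` term keeps it differentiable at zero.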
We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) [15] that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage.
Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR), which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack high-frequency textures and do not look natural despite yielding high PSNR values. We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.
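For reference, PSNR is computed from the mean squared error as 10 * log10(MAX^2 / MSE). The following sketch assumes images are flat lists of pixel values and that the peak value `max_value` is 255 unless stated otherwise; both are conventions chosen for this example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives roughly 20 dB.
print(psnr([0.0, 0.5], [0.1, 0.6], max_value=1.0))
```

Since the score depends only on the average squared pixel difference, a blurry estimate that hedges between plausible textures can score higher than a sharp one with slightly misplaced detail, which is the mismatch with perception that the abstract points out.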
Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4× upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
We present a highly accurate single-image super-resolution (SR) method. Our method uses a very deep convolutional network inspired by VGG-net used for ImageNet classification [19]. We find that increasing our network depth yields a significant improvement in accuracy. Our final model uses 20 weight layers. By cascading small filters many times in a deep network structure, contextual information over large image regions is exploited in an efficient way. With very deep networks, however, convergence speed becomes a critical issue during training. We propose a simple yet effective training procedure. We learn residuals only and use extremely high learning rates (10^4 times higher than SRCNN [6]) enabled by adjustable gradient clipping. Our proposed method performs better than existing methods in accuracy, and visual improvements in our results are easily noticeable.
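One way to read "adjustable gradient clipping" is that the clipping bound is tied to the current learning rate, so that the effective step size stays bounded even when the learning rate is very high. The abstract does not give the formula; the sketch below assumes a bound of `theta / learning_rate`, and the names `clip_gradients` and `theta` are illustrative.

```python
def clip_gradients(grads, theta, learning_rate):
    """Clip each gradient to [-theta / lr, theta / lr].

    Tying the bound to the learning rate is an assumption made for this
    sketch; it keeps |lr * g| <= theta for every clipped gradient g.
    """
    if learning_rate <= 0 or theta <= 0:
        raise ValueError("theta and learning_rate must be positive")
    bound = theta / learning_rate
    return [max(-bound, min(bound, g)) for g in grads]


# With a high learning rate the bound is tight; as the rate decays it loosens.
print(clip_gradients([5.0, -5.0, 0.5], theta=0.1, learning_rate=0.1))
print(clip_gradients([5.0, -5.0, 0.5], theta=0.1, learning_rate=0.01))
```

Under this reading, the product of learning rate and gradient never exceeds `theta`, which is what lets training start with a large learning rate without diverging.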
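Since the first success of Dong et al., deep-learning-based approaches have become dominant in the field of single-image super-resolution. They replace all the handcrafted image processing steps of traditional sparse-coding-based methods with a deep neural network. In contrast to sparse-coding-based methods, which explicitly create high/low-resolution dictionaries, the dictionaries in deep-learning-based methods are acquired implicitly as a nonlinear combination of multiple convolutions. One disadvantage of deep-learning-based methods is that their performance degrades for images that differ from the training dataset (out-of-domain images). We propose an end-to-end super-resolution network with a deep dictionary (SRDD), in which a high-resolution dictionary is explicitly learned without sacrificing the advantages of deep learning. Extensive experiments show that explicit learning of the high-resolution dictionary makes the network more robust on out-of-domain test images while maintaining performance on in-domain test images.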
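Image super-resolution (SR) is one of the important image processing methods for improving image resolution in the field of computer vision. Over the last two decades, significant progress has been made in super-resolution, especially through the use of deep learning methods. This survey provides a detailed review of recent progress in single-image super-resolution from the perspective of deep learning, while also covering the early classical methods for image super-resolution. The survey classifies image SR methods into four categories: classical methods, learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the SR problem to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep-learning-based approaches are evaluated using a reference dataset. The reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multi-scale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN), and wavelet-based residual attention network (WRAN). Finally, the survey concludes with future directions and trends in SR and open problems to be addressed by researchers.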
The feed-forward architectures of recently proposed deep super-resolution networks learn representations of low-resolution inputs, and the non-linear mapping from those to high-resolution output. However, this approach does not fully address the mutual dependencies of low- and high-resolution images. We propose Deep Back-Projection Networks (DBPN), which exploit iterative up- and down-sampling layers, providing an error feedback mechanism for projection errors at each stage. We construct mutually connected up- and down-sampling stages, each of which represents different types of image degradation and high-resolution components. We show that extending this idea to allow concatenation of features across up- and down-sampling stages (Dense DBPN) allows us to further improve super-resolution, yielding superior results and in particular establishing new state-of-the-art results for large scaling factors such as 8× across multiple data sets.
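Single image super-resolution (SISR) is an ill-posed problem that aims to obtain a high-resolution (HR) output from a low-resolution (LR) input, during which extra high-frequency information should be added to improve perceptual quality. Existing SISR work mainly operates in the spatial domain by minimizing the mean squared reconstruction error. Despite high peak signal-to-noise ratio (PSNR) results, it is difficult to determine whether a model correctly adds the desired high-frequency details. Some residual-based structures have been proposed to guide the model to focus implicitly on high-frequency features. However, because the interpretation offered by spatial-domain metrics is limited, how to verify the fidelity of these artificial details remains a problem. In this paper, we propose an intuitive pipeline from the frequency-domain perspective to solve this problem. Inspired by existing frequency-domain work, we convert images into discrete cosine transform (DCT) blocks, then reform them to obtain DCT feature maps, which serve as the input and target of our model. With this specialized pipeline in place, we further propose a frequency loss function that fits the nature of our frequency-domain task. Our frequency-domain SISR method can learn high-frequency information explicitly, providing both fidelity and good perceptual quality for the SR images. We further observe that our model can be merged with other spatial super-resolution models to enhance the quality of their original SR output.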
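The first step of that pipeline, converting an image into DCT blocks, can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses an orthonormal 2D DCT-II, a default block size of 8 (the abstract does not state one), and it silently drops any rows or columns that do not fill a whole block. The subsequent rearrangement into DCT feature maps is not shown, since the abstract does not describe it.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a nested list."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def image_to_dct_blocks(image, n=8):
    """Split a grayscale image into n x n tiles and transform each tile.

    Returns a grid indexed as [block_row][block_col][u][v]. The block
    size and the cropping of partial tiles are assumptions of this sketch.
    """
    h, w = len(image), len(image[0])
    grid = []
    for top in range(0, h - h % n, n):
        row = []
        for left in range(0, w - w % n, n):
            tile = [r[left:left + n] for r in image[top:top + n]]
            row.append(dct_2d(tile))
        grid.append(row)
    return grid


# A flat 2x4 image split into 2x2 tiles: only the DC coefficient is non-zero.
flat = [[1.0] * 4, [1.0] * 4]
print(image_to_dct_blocks(flat, n=2)[0][0])
```

In this representation, coefficients with larger `u` and `v` correspond to higher spatial frequencies, so a loss defined on them can target high-frequency content directly.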
We propose an image super-resolution (SR) method using a deeply-recursive convolutional network (DRCN). Our network has a very deep recursive layer (up to 16 recursions). Increasing recursion depth can improve performance without introducing new parameters for additional convolutions. Despite these advantages, learning a DRCN is very hard with a standard gradient descent method due to exploding/vanishing gradients. To ease the difficulty of training, we propose two extensions: recursive-supervision and skip-connection. Our method outperforms previous methods by a large margin.
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-of-the-art single image super-resolution approaches.
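With the popularity of mobile devices such as smartphones and wearables, lighter and faster models are crucial for applying video super-resolution. However, most previous lightweight models tend to concentrate on reducing the latency of model inference on desktop GPUs, which may not be energy efficient on current mobile devices. In this paper, we propose the Extreme Low-Power Super Resolution (ELSR) network, which consumes only a small amount of energy on mobile devices. Pre-training and fine-tuning methods are applied to boost the performance of the extremely small model. Extensive experiments show that our method achieves a good balance between restoration quality and power consumption. Finally, we achieve a competition score of 90.9 with a PSNR of 27.34 dB and a power of 0.09 W/30 FPS on the target MediaTek Dimensity 9000 platform, ranking first in the Mobile AI & AIM 2022 Real-Time Video Super-Resolution Challenge.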
Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding that of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the NTIRE2017 Super-Resolution Challenge [26].
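CNNs with strong learning abilities are widely chosen to solve the super-resolution problem. However, CNNs rely on deeper network architectures to improve image super-resolution performance, which may increase computational cost. In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture that fully fuses deep and wide channel features to extract more accurate low-frequency information based on the correlations among different channels in single image super-resolution (SISR). In addition, a signal enhancement operation in ESRGCNN helps inherit more long-distance contextual information to resolve long-term dependency. An adaptive up-sampling operation is integrated into the CNN to obtain an image super-resolution model for low-resolution images of different sizes. Extensive experiments show that our ESRGCNN surpasses the state of the art in terms of SISR performance, complexity, execution speed, image quality evaluation, and visual effect. Code is available at https://github.com/hellloxiaotian/esrgcnn.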
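Video super-resolution (VSR) methods based on recurrent convolutional networks have strong temporal modeling capability for video sequences. However, the input information received by different recurrent units in a unidirectional recurrent convolutional network is unbalanced. Early reconstructed frames receive less temporal information, resulting in blurring or artifacts. Although a bidirectional recurrent convolutional network can alleviate this problem, it greatly increases reconstruction time and computational complexity. It is also unsuitable for many application scenarios, such as online super-resolution. To solve these problems, we propose an end-to-end information prebuilt recurrent reconstruction network (IPRRN), consisting of an information prebuilt network (IPNet) and a recurrent reconstruction network (RRNET). By integrating sufficient information from the front of the video to build the hidden state needed by the initial recurrent unit and thereby help restore the earlier frames, the information prebuilt network balances the difference in input information between earlier and later frames without backward propagation. In addition, we present a compact recurrent reconstruction network that significantly improves restoration quality and time efficiency. Extensive experiments verify the effectiveness of the proposed network, and compared with existing state-of-the-art methods, our method achieves higher quantitative and qualitative evaluation performance.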
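This paper proposes a decoder-side Cross Resolution Synthesis (CRS) module to pursue better compression efficiency beyond the latest Versatile Video Coding (VVC). We encode intra frames at the original high resolution (HR), compress inter frames at a lower resolution (LR), and then super-resolve the decoded LR inter frames with the help of the preceding HR intra frame and neighboring LR inter frames. For an LR inter frame, a motion alignment and aggregation network (MAN) is designed to produce a temporally aggregated motion representation that best guarantees temporal smoothness; a texture compensation network (TCN) is used to generate a texture representation from the decoded HR intra frame to better enhance spatial details; finally, a similarity-driven fusion engine synthesizes the motion and texture representations to upscale LR inter frames and remove compression and resolution re-sampling noise. We enhance VVC with the proposed CRS, showing average Bjøntegaard Delta rate (BD-rate) gains of 8.76% and 11.93% over the latest VVC anchor in Random Access (RA) and Low-Delay P (LDP) settings, respectively. In addition, experimental comparisons with state-of-the-art super-resolution (SR) based VVC enhancement methods, together with ablation studies, further demonstrate the superior efficiency and generalization of the proposed algorithm. All materials will be made public at https://njuvision.github.io/crs for reproducible research.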
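This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 targets the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art in super-resolution on compressed images and video. The proposed LDV 3.0 dataset is available at https://github.com/renyang-home/ldv_dataset. The homepage of this challenge is at https://github.com/renyang-home/aim22_compresssr.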
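Convolutional neural networks (CNNs) have obtained remarkable performance through deep architectures. However, these CNNs often achieve poor robustness for image super-resolution (SR) in complex scenes. In this paper, we present a heterogeneous group SR CNN (HGSRCNN) that leverages structural information of different types to obtain a high-quality image. Specifically, each heterogeneous group block (HGB) of HGSRCNN uses a heterogeneous architecture containing a symmetric group convolutional block and a complementary convolutional block, arranged in parallel, to enhance the internal and external relations of different channels and obtain richer types of information. To prevent the appearance of redundant features, a refinement block with signal enhancements, arranged in series, is designed to filter out useless information. To prevent the loss of original information, a multi-level enhancement mechanism guides the CNN toward a symmetric architecture that promotes the expressive ability of HGSRCNN. In addition, a parallel up-sampling mechanism is developed to train a blind SR model. Extensive experiments show that the proposed HGSRCNN obtains excellent SR performance in both quantitative and qualitative analysis. Code can be accessed at https://github.com/hellloxiaotian/hgsrcnn.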
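Hyperspectral image (HSI) super-resolution without additional auxiliary images remains a constant challenge due to the high-dimensional spectral patterns involved, where learning an effective spatial and spectral representation is a fundamental issue. Recently, implicit neural representations (INRs) have been making strides as a novel and effective representation, especially in reconstruction tasks. Therefore, in this work, we propose a novel INR-based HSI reconstruction model that represents an HSI by a continuous function mapping a spatial coordinate to its corresponding spectral radiance values. In particular, as a specific implementation of INR, the parameters of the parametric model are predicted by a hypernetwork that operates on features extracted by a convolutional network. This allows the continuous function to map spatial coordinates to pixel values in a content-aware manner. Moreover, periodic spatial encoding is deeply integrated with the reconstruction procedure, which enables our model to recover more high-frequency details. To verify the efficacy of our model, we conduct experiments on three HSI datasets (CAVE, NUS, and NTIRE2018). Experimental results show that the proposed model achieves competitive reconstruction performance compared with state-of-the-art methods. In addition, we provide an ablation study on the effect of the individual components of our model. We hope this paper can serve as a useful reference for future research.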