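Normalization techniques such as batch normalization (BN) are a milestone in deep learning: they normalize the distributions of intermediate layers, enabling faster training and better generalization accuracy. In fidelity-oriented image super-resolution (SR), however, normalization layers are believed to remove range flexibility by normalizing the features, and they have been dropped from modern SR networks. In this paper, we study this phenomenon quantitatively and qualitatively. We find that the standard deviation of the residual features shrinks after normalization layers, which degrades the performance of SR networks. The standard deviation reflects the amount of variation in pixel values; when that variation becomes smaller, edges become less discriminative for the network to resolve. To address this problem, we propose an Adaptive Deviation Modulator (AdaDM), in which a modulation factor is adaptively predicted to amplify the pixel deviation. For better generalization performance, we apply BN in state-of-the-art SR networks with the proposed AdaDM. Meanwhile, the deviation amplification strategy in AdaDM makes the edge information in the features more distinguishable. As a result, SR networks with BN and our AdaDM achieve substantial performance improvements on benchmark datasets. Extensive experiments demonstrate the effectiveness of our method.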
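The shrinking-deviation effect described in this abstract can be illustrated with a few lines of plain Python. This is only a sketch under stated assumptions: `normalize` is a generic zero-mean, unit-variance normalization, and `modulate` rescales the deviation by a hand-picked factor (here the pre-normalization standard deviation). The abstract says the factor is predicted adaptively and does not give its form, so the fixed factor below is a hypothetical stand-in, not the AdaDM module itself.

```python
import math

def std(values):
    """Population standard deviation of a list of floats."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def normalize(values, eps=1e-5):
    """Zero-mean, unit-variance normalization (BN-style, without affine terms)."""
    mean = sum(values) / len(values)
    scale = math.sqrt(std(values) ** 2 + eps)
    return [(v - mean) / scale for v in values]

def modulate(values, factor):
    """Amplify the deviation around the mean by `factor` (hypothetical fixed factor)."""
    mean = sum(values) / len(values)
    return [mean + factor * (v - mean) for v in values]

# Toy residual feature with a wide range of values.
residual = [-12.0, -4.0, 0.0, 4.0, 12.0]
normalized = normalize(residual)                           # deviation collapses to about 1
restored = modulate(normalized, factor=std(residual))      # deviation amplified again
```

After normalization the five values differ far less from one another than before, which is the loss of variation the abstract links to weaker edges; the modulation step widens them again.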
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-of-the-art single image super-resolution approaches.
Convolutional neural network (CNN) depth is of crucial importance for image super-resolution (SR). However, we observe that deeper networks for image SR are more difficult to train. The low-resolution inputs and features contain abundant low-frequency information, which is treated equally across channels, hence hindering the representational ability of CNNs. To solve these problems, we propose the very deep residual channel attention networks (RCAN). Specifically, we propose a residual in residual (RIR) structure to form a very deep network, which consists of several residual groups with long skip connections. Each residual group contains some residual blocks with short skip connections. Meanwhile, RIR allows abundant low-frequency information to be bypassed through multiple skip connections, making the main network focus on learning high-frequency information. Furthermore, we propose a channel attention mechanism to adaptively rescale channel-wise features by considering interdependencies among channels. Extensive experiments show that our RCAN achieves better accuracy and visual improvements against state-of-the-art methods.
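A channel attention step of the kind this abstract describes can be sketched in plain Python as a squeeze (global average pooling per channel), a small bottleneck with ReLU, a sigmoid gate, and a per-channel rescale. The function name, the weight matrices `w_down` and `w_up`, and the omission of bias terms are assumptions made for illustration; the abstract does not specify the layer sizes or weights.

```python
import math

def channel_attention(features, w_down, w_up):
    """Rescale each channel by a learned gate in (0, 1).

    features: list of C channels, each an H x W list of lists.
    w_down:   R x C weight matrix (channel reduction), hypothetical values.
    w_up:     C x R weight matrix (channel expansion), hypothetical values.
    Returns (rescaled_features, per_channel_scales).
    """
    # Squeeze: one descriptor per channel via global average pooling.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Bottleneck with ReLU models interdependencies among channels.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w_down]
    # Sigmoid gate gives one scale per channel.
    scales = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
              for row in w_up]
    # Rescale every pixel of each channel by that channel's gate.
    rescaled = [[[v * s for v in row] for row in ch] for ch, s in zip(features, scales)]
    return rescaled, scales

# Two 2x2 channels with different mean activations.
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[3.0, 3.0], [3.0, 3.0]]]
w_down = [[0.5, 0.5]]        # reduce 2 channels to 1 hidden unit
w_up = [[1.0], [-1.0]]       # expand back to 2 channel gates
rescaled, scales = channel_attention(features, w_down, w_up)
```

With these toy weights the first channel is emphasized and the second suppressed; in a trained network the gates would depend on the learned weights and the input statistics.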
A very deep convolutional neural network (CNN) has recently achieved great success for image super-resolution (SR) and offered hierarchical features as well. However, most deep CNN based SR models do not make full use of the hierarchical features from the original low-resolution (LR) images, thereby achieving relatively low performance. In this paper, we propose a novel residual dense network (RDN) to address this problem in image SR. We fully exploit the hierarchical features from all the convolutional layers. Specifically, we propose a residual dense block (RDB) to extract abundant local features via densely connected convolutional layers. RDB further allows direct connections from the state of the preceding RDB to all the layers of the current RDB, leading to a contiguous memory (CM) mechanism. Local feature fusion in RDB is then used to adaptively learn more effective features from preceding and current local features and to stabilize the training of wider networks. After fully obtaining dense local features, we use global feature fusion to jointly and adaptively learn global hierarchical features in a holistic way. Experiments on benchmark datasets with different degradation models show that our RDN achieves favorable performance against state-of-the-art methods.
Recently, Convolutional Neural Network (CNN) based models have achieved great success in Single Image Super-Resolution (SISR). Owing to the strength of deep networks, these CNN models learn an effective nonlinear mapping from the low-resolution input image to the high-resolution target image, at the cost of requiring enormous parameters. This paper proposes a very deep CNN model (up to 52 convolutional layers) named Deep Recursive Residual Network (DRRN) that strives for deep yet concise networks. Specifically, residual learning is adopted, both in global and local manners, to mitigate the difficulty of training very deep networks; recursive learning is used to control the model parameters while increasing the depth. Extensive benchmark evaluation shows that DRRN significantly outperforms state of the art in SISR, while utilizing far fewer parameters. Code is available at https://github.com/tyshiwo/DRRN_CVPR17.
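The idea that recursion increases depth without adding parameters can be sketched as follows. This is a toy illustration, not the DRRN architecture: a single scalar `weight` stands in for the shared convolution weights of one residual unit, and the skip connection from the block input is an assumption about how the local residual path is wired.

```python
def recursive_block(x, weight, num_units):
    """Apply ONE shared residual unit `num_units` times.

    x:         list of floats (a toy feature vector).
    weight:    scalar stand-in for the shared unit's weights (hypothetical).
    num_units: number of recursions; more recursions means a deeper effective
               network, while the parameter count stays at one weight.
    """
    h0 = x          # identity branch reused by every recursion
    h = h0
    for _ in range(num_units):
        # Same weight at every step: ReLU(weight * h) plus the local skip from h0.
        h = [max(0.0, weight * v) + skip for v, skip in zip(h, h0)]
    return h

shallow = recursive_block([1.0], weight=0.5, num_units=1)
deep = recursive_block([1.0], weight=0.5, num_units=3)
```

Both calls use exactly one parameter; only the number of times it is applied differs, which is the sense in which recursive learning controls model size while depth grows.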
Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding those of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the NTIRE2017 Super-Resolution Challenge [26].
Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14∼0.45 dB, while the total number of parameters can be reduced by up to 67%.
Image super-resolution (SR) serves as a fundamental tool for the processing and transmission of multimedia data. Recently, Transformer-based models have achieved competitive performance in image SR. They divide images into fixed-size patches and apply self-attention on these patches to model long-range dependencies among pixels. However, this architecture design originated in high-level vision tasks and lacks design guidelines drawn from SR knowledge. In this paper, we aim to design a new attention block whose insights are from the interpretation of Local Attribution Map (LAM) for SR networks. Specifically, LAM presents a hierarchical importance map where the most important pixels are located in a fine area of a patch and some less important pixels are spread in a coarse area of the whole image. To access pixels in the coarse area, instead of using a very large patch size, we propose a lightweight Global Pixel Access (GPA) module that applies cross-attention with the most similar patch in an image. In the fine area, we use an Intra-Patch Self-Attention (IPSA) module to model long-range pixel dependencies in a local patch, and then a $3\times3$ convolution is applied to process the finest details. In addition, a Cascaded Patch Division (CPD) strategy is proposed to enhance perceptual quality of recovered images. Extensive experiments suggest that our method outperforms state-of-the-art lightweight SR methods by a large margin. Code is available at https://github.com/passerer/HPINet.
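Convolutional neural networks have enabled remarkable progress in single image super-resolution (SISR) over the past decade. Among recent advances in SISR, attention mechanisms are crucial for high-performance SR models. However, it remains unclear why attention mechanisms work in SISR and how. In this work, we attempt to quantify and visualize attention mechanisms in SISR and show that not all attention modules are equally beneficial. We then propose the attention in attention network (A$^2$N) for more efficient and accurate SISR. Specifically, A$^2$N consists of a non-attention branch and a coupling attention branch. A dynamic attention module is proposed to generate weights for these two branches so as to dynamically suppress unwanted attention adjustments, where the weights change adaptively according to the input features. This allows attention modules to specialize in beneficial examples without being penalized otherwise, which greatly improves the capacity of the attention network at the cost of little parameter overhead. Experimental results show that our final model A$^2$N achieves superior trade-off performance compared with state-of-the-art networks of similar size. Code is available at https://github.com/haoyuc/a2n.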
The Super-Resolution Generative Adversarial Network (SRGAN) [1] is a seminal work that is capable of generating realistic textures during single image super-resolution. However, the hallucinated details are often accompanied with unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGAN (network architecture, adversarial loss and perceptual loss) and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN [2] to let the discriminator predict relative realness instead of the absolute value. Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN and won the first place in the PIRM2018-SR Challenge [3]. The code is available at https://github.com/xinntao/ESRGAN.
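The "relative realness" idea can be sketched with the relativistic average formulation, in which the discriminator scores a real image against the mean score of fake images and vice versa. The function names and the toy logits below are hypothetical; the abstract states the idea but not the exact loss, so treat this as one plausible instance rather than the paper's definition.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relativistic_outputs(real_logits, fake_logits):
    """Probability that each real sample looks more realistic than the average
    fake sample, and that each fake looks more realistic than the average real."""
    mean_real = sum(real_logits) / len(real_logits)
    mean_fake = sum(fake_logits) / len(fake_logits)
    d_real = [sigmoid(c - mean_fake) for c in real_logits]
    d_fake = [sigmoid(c - mean_real) for c in fake_logits]
    return d_real, d_fake

def discriminator_loss(real_logits, fake_logits):
    """Binary cross-entropy on the relative outputs: real should win, fake should lose."""
    d_real, d_fake = relativistic_outputs(real_logits, fake_logits)
    loss_real = -sum(math.log(p) for p in d_real) / len(d_real)
    loss_fake = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return loss_real + loss_fake

# If the discriminator cannot tell real from fake, every relative output is 0.5.
undecided = discriminator_loss([0.3, 0.3], [0.3, 0.3])
# If real logits sit well above fake logits, the loss is much smaller.
confident = discriminator_loss([2.0, 2.0], [-2.0, -2.0])
```

Because only differences between logits enter the loss, shifting all logits by a constant leaves it unchanged, which is what distinguishes a relative judgment from an absolute one.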
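Single image super-resolution (SISR), a traditional ill-conditioned inverse problem, has been greatly revitalized by the recent development of convolutional neural networks (CNNs). These CNN-based methods generally map a low-resolution image to its corresponding high-resolution version with sophisticated network structures and loss functions, showing impressive performance. This paper provides a new insight into conventional SISR algorithms and proposes a substantially different approach relying on iterative optimization. A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization. We first analyze the observation model of the image SR problem, which inspires a feasible solution by mimicking and fusing each iteration in a more general and efficient manner. Considering the drawbacks of batch normalization, we propose a feature normalization (F-Norm, FN) method to regulate the features in the network. Furthermore, a novel block with FN, termed FNB, is developed to improve the network representation. A residual-in-residual structure is proposed to form a very deep network, which groups FNBs with long skip connections for better information delivery and a more stable training phase. Extensive experimental results on testing benchmarks with bicubic (BI) degradation show that our ISRN not only recovers more structural information but also achieves competitive or better PSNR/SSIM results with fewer parameters than other works. Besides BI, we simulate real-world degradation with blur-downscale (BD) and downscale-noise (DN) models. ISRN and its extension ISRN+ both perform better than other methods under the BD and DN degradation models.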
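Compressed image super-resolution, in which images are degraded by both compression artifacts and low-resolution artifacts, has attracted great attention in recent years. Because of the complex hybrid distortions, it is hard to restore the distorted image by simply cascading super-resolution and compression artifact removal. In this paper, we take a step forward and propose the Hierarchical Swin Transformer (HST) network to restore low-resolution compressed images, which jointly captures hierarchical feature representations and enhances each scale's representation with a Swin Transformer. Moreover, we find that pretraining with the super-resolution (SR) task is vital for compressed image super-resolution. To explore the effects of different SR pretraining, we take commonly used SR tasks (e.g., bicubic and different real-world super-resolution simulations) as our pretraining tasks and reveal that SR plays an irreplaceable role in compressed image super-resolution. With the combination of HST and pretraining, our HST achieves fifth place in the AIM 2022 challenge on the low-quality compressed image super-resolution track, with a PSNR of 23.51 dB. Extensive experiments and ablation studies validate the effectiveness of our proposed method.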
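Single image super-resolution (SISR) is an ill-posed problem that aims to obtain a high-resolution (HR) output from a low-resolution (LR) input, during which extra high-frequency information is supposed to be added to improve perceptual quality. Existing SISR works mainly operate in the spatial domain by minimizing the mean squared reconstruction error. Despite high peak signal-to-noise ratio (PSNR) results, it is difficult to determine whether the model correctly adds the desired high-frequency details. Some residual-based structures have been proposed to guide the model to focus implicitly on high-frequency features. However, how to verify the fidelity of those artificial details remains a problem, since the interpretation of spatial-domain metrics is limited. In this paper, we propose an intuitive pipeline from a frequency-domain perspective to solve this problem. Inspired by existing frequency-domain works, we convert images into discrete cosine transform (DCT) blocks and then rearrange them into DCT feature maps, which serve as the input and target of our model. With this specialized pipeline in place, we further propose a frequency loss function that fits the nature of the frequency-domain task. Our frequency-domain SISR method can learn high-frequency information explicitly and provides fidelity and good perceptual quality for the SR images. We further observe that our model can be merged with other spatial super-resolution models to enhance the quality of their original SR outputs.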
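The conversion from an image to DCT blocks and then to DCT feature maps can be sketched in plain Python. The orthonormal 2D DCT-II below is the standard definition; the rearrangement, in which each frequency index becomes its own map holding that coefficient from every block, is an assumed layout for illustration, since the abstract does not spell out the exact arrangement or block size.

```python
import math

def dct_2d(block):
    """Orthonormal 2D DCT-II of an n x n block (list of lists of floats)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for i in range(n):
                for j in range(n):
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def image_to_dct_feature_maps(image, n):
    """Split an H x W image into n x n blocks and return n*n maps of size
    (H/n) x (W/n); map index u*n + v holds coefficient (u, v) of every block.
    Assumes H and W are multiples of n (hypothetical layout)."""
    h, w = len(image), len(image[0])
    maps = [[[0.0] * (w // n) for _ in range(h // n)] for _ in range(n * n)]
    for bi in range(h // n):
        for bj in range(w // n):
            block = [row[bj * n:(bj + 1) * n] for row in image[bi * n:(bi + 1) * n]]
            coeffs = dct_2d(block)
            for u in range(n):
                for v in range(n):
                    maps[u * n + v][bi][bj] = coeffs[u][v]
    return maps

# A flat 4x4 image has no high-frequency content: only the DC map is non-zero.
flat_image = [[2.0] * 4 for _ in range(4)]
dct_maps = image_to_dct_feature_maps(flat_image, n=2)
```

In this layout, map 0 carries the low-frequency (DC) content and higher-indexed maps carry progressively higher frequencies, so a model operating on these maps sees high-frequency information as separate channels.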
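Transformer-based methods have achieved impressive image restoration performance compared with CNN-based methods, owing to their ability to model long-range dependencies. However, advances such as SwinIR adopt window-based and local attention strategies to balance performance and computational overhead, which restricts the use of large receptive fields to capture global information and establish long-range dependencies in the early layers. To further improve the efficiency of capturing global information, in this work we propose SwinFIR, which extends SwinIR by swapping in Fast Fourier Convolution (FFC) components, which have an image-wide receptive field. We also revisit other advanced techniques, i.e., data augmentation, pre-training, and feature ensemble, to improve the quality of image reconstruction. Our feature ensemble method allows the model's performance to be considerably enhanced without increasing training and testing time. We apply our algorithm to multiple popular large-scale benchmarks and achieve state-of-the-art performance compared with existing methods. For example, our SwinFIR achieves a PSNR of 32.83 dB on the Manga109 dataset, which is 0.8 dB higher than the state-of-the-art SwinIR method.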
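As a reading aid for the dB figures quoted in these abstracts, PSNR is conventionally defined as 10·log10(MAX² / MSE), so a fixed dB gain corresponds to a fixed ratio of mean squared errors. The helper names and the 8-bit peak value of 255 below are assumptions for illustration; the abstracts do not state the evaluation protocol.

```python
import math

def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE must shrink to raise PSNR by `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)

baseline_mse = 40.0                                  # arbitrary example value
improved_mse = baseline_mse * mse_ratio_for_gain(0.8)
gain = psnr(improved_mse) - psnr(baseline_mse)       # recovers the 0.8 dB gain
```

Under this definition, a 0.8 dB improvement means the mean squared error falls to roughly 83% of its previous value, regardless of the starting point.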
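Since the first success of Dong et al., deep-learning-based approaches have become dominant in the field of single image super-resolution. That work replaced all the handcrafted image processing steps of traditional sparse-coding-based methods with a deep neural network. In contrast to sparse-coding-based methods, which explicitly create high/low-resolution dictionaries, the dictionaries in deep-learning-based methods are acquired implicitly as a nonlinear combination of multiple convolutions. One disadvantage of deep-learning-based methods is that their performance degrades on images that differ from the training dataset (out-of-domain images). We propose an end-to-end super-resolution network with a deep dictionary (SRDD), in which a high-resolution dictionary is learned explicitly without sacrificing the advantages of deep learning. Extensive experiments show that explicitly learning a high-resolution dictionary makes the network more robust on out-of-domain test images while maintaining performance on in-domain test images.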
In recent years, deep learning methods have been successfully applied to single-image super-resolution tasks. Despite their great performances, deep learning methods cannot be easily applied to real-world applications due to the requirement of heavy computation. In this paper, we address this issue by proposing an accurate and lightweight deep network for image super-resolution. In detail, we design an architecture that implements a cascading mechanism upon a residual network. We also present variant models of the proposed cascading residual network to further improve efficiency. Our extensive experiments show that even with much fewer parameters and operations, our models achieve performance comparable to that of state-of-the-art methods.
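Deep convolutional neural networks have been demonstrated to be effective for SISR in recent years. On the one hand, residual connections and dense connections have been used widely to ease forward information and backward gradient flows and thereby boost performance. However, current methods use residual connections and dense connections separately in most network layers, in a sub-optimal way. On the other hand, although various networks and methods have been designed to improve computational efficiency, save parameters, or use training data of multiple scale factors so that the scales boost one another's performance, they either perform super-resolution in HR space, which incurs a high computational cost, or cannot share parameters between models of different scale factors to save parameters and inference time. To tackle these challenges, we propose an efficient single image super-resolution network using dual path connections with multiple-scale learning, named EMSRDPN. By introducing dual path connections into EMSRDPN, it uses residual connections and dense connections in an integrated way in most network layers. Dual path connections have the benefits of both reusing common features through residual connections and exploring new features through dense connections, so as to learn a good representation for SISR. To exploit the feature correlation across multiple scale factors, EMSRDPN shares all network units in LR space between different scale factors to learn shared features and uses a separate reconstruction unit only for each scale factor. This lets training data of multiple scale factors help each scale factor improve performance, while saving parameters and supporting shared inference across scale factors for better efficiency. Experiments show that EMSRDPN achieves better performance and comparable or even better parameter and inference efficiency than SOTA methods.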
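By exploiting large kernel decomposition and attention mechanisms, convolutional neural networks (CNNs) can compete with transformer-based methods in many high-level computer vision tasks. However, owing to the advantage of long-range modeling, transformers with self-attention still dominate low-level vision, including the super-resolution task. In this paper, we propose a CNN-based multi-scale attention network (MAN), which consists of multi-scale large kernel attention (MLKA) and a gated spatial attention unit (GSAU), to improve the performance of convolutional SR networks. In our MLKA, we rectify LKA with multi-scale and gate schemes to obtain abundant attention maps at various granularity levels, thereby jointly aggregating global and local information and avoiding potential blocking artifacts. In GSAU, we integrate the gate mechanism and spatial attention to remove the unnecessary linear layer and aggregate informative spatial context. To confirm the effectiveness of our designs, we evaluate MAN at multiple complexities by simply stacking different numbers of MLKA and GSAU. Experimental results show that our MAN can achieve varied trade-offs between state-of-the-art performance and computation. Code is available at https://github.com/icandle/man.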
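In this paper, we present D2C-SR, a novel framework for the task of real-world image super-resolution. As an ill-posed problem, the key challenge in super-resolution-related tasks is that there can be multiple predictions for a given low-resolution input. Most classical deep-learning-based approaches ignore this fundamental fact and lack explicit modeling of the underlying high-frequency distribution, which leads to blurred results. Recently, some methods based on GANs or on learning the super-resolution space can generate simulated textures, but they do not guarantee the accuracy of those textures and have low quantitative performance. Rethinking both, we learn the distribution of the underlying high-frequency details in a discrete form and propose a two-stage pipeline: a divergence stage followed by a convergence stage. In the divergence stage, we propose a tree-based structured deep network as the divergence backbone. A divergence loss is proposed to encourage the results generated by the tree-based network to diverge into possible high-frequency representations, which is our way of discretely modeling the underlying high-frequency distribution. In the convergence stage, we assign spatial weights to fuse these divergent predictions and obtain a final output with more accurate details. Our approach provides a convenient end-to-end manner of inference. We conduct evaluations on several real-world benchmarks, including a newly proposed D2CRealSR dataset with a x8 scaling factor. Our experiments demonstrate that D2C-SR achieves better accuracy and visual improvements against state-of-the-art methods with significantly fewer parameters, and that our D2C structure can also be applied as a generalized structure to some other methods to obtain improvement. Our code and dataset are available at https://github.com/megvii-research/d2c-sr.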
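Restoring a low-resolution (LR) image to a super-resolution (SR) image with correct and clear details is challenging. Existing deep learning works almost entirely neglect the inherent structural information of images, which plays an important role in the visual perception of SR results. In this paper, we design a hierarchical feature exploitation network to probe and preserve structural information in a multi-scale feature fusion manner. First, we propose a cross convolution built upon traditional edge detectors to localize and represent edge features. Then, cross convolution blocks (CCBs) are designed with feature normalization and channel attention to account for the inherent correlations of features. Finally, we leverage a multi-scale feature fusion group (MFFG) to embed the cross convolution blocks and develop the relations of structural features at different scales hierarchically, yielding a lightweight structure-preserving network named Cross-SRN. Experimental results show that Cross-SRN achieves competitive or superior restoration performance against state-of-the-art methods, with accurate and clear structural details. Moreover, we set a criterion to select images with rich structural textures. The proposed Cross-SRN outperforms state-of-the-art methods on the selected benchmark, which demonstrates that our network has a significant advantage in preserving edges.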