本文回顾了AIM 2022上压缩图像和视频超级分辨率的挑战。这项挑战包括两条曲目。轨道1的目标是压缩图像的超分辨率,轨迹〜2靶向压缩视频的超分辨率。在轨道1中,我们使用流行的数据集DIV2K作为培训,验证和测试集。在轨道2中,我们提出了LDV 3.0数据集,其中包含365个视频,包括LDV 2.0数据集(335个视频)和30个其他视频。在这一挑战中,有12支球队和2支球队分别提交了赛道1和赛道2的最终结果。所提出的方法和解决方案衡量了压缩图像和视频上超分辨率的最先进。提出的LDV 3.0数据集可在https://github.com/renyang-home/ldv_dataset上找到。此挑战的首页是在https://github.com/renyang-home/aim22_compresssr。
translated by 谷歌翻译
本文回顾了关于压缩视频质量增强质量的第一个NTIRE挑战,重点是拟议的方法和结果。在此挑战中,采用了新的大型不同视频(LDV)数据集。挑战有三个曲目。Track 1和2的目标是增强HEVC在固定QP上压缩的视频,而Track 3旨在增强X265压缩的视频,以固定的位速率压缩。此外,轨道1和3的质量提高了提高保真度(PSNR)的目标,以及提高感知质量的2个目标。这三个曲目完全吸引了482个注册。在测试阶段,分别提交了12个团队,8支球队和11支球队,分别提交了轨道1、2和3的最终结果。拟议的方法和解决方案衡量视频质量增强的最先进。挑战的首页:https://github.com/renyang-home/ntire21_venh
translated by 谷歌翻译
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose the participants to design an efficient quantized image super-resolution solution that can demonstrate a real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do a high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 60 FPS rate when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
随着移动平台上对计算摄影和成像的需求不断增长,在相机系统中开发和集成了高级图像传感器与新型算法的发展。但是,缺乏用于研究的高质量数据以及从行业和学术界进行深入交流的难得的机会限制了移动智能摄影和成像(MIPI)的发展。为了弥合差距,我们介绍了第一个MIPI挑战,包括五个曲目,这些曲目着重于新型图像传感器和成像算法。在本文中,我们总结并审查了MIPI 2022上的分配摄像头(UDC)图像恢复轨道。总共,成功注册了167名参与者,并在最终测试阶段提交了19个团队。在这项挑战中开发的解决方案在播放摄像头映像修复局上实现了最新的性能。本文提供了此挑战中所有模型的详细描述。有关此挑战的更多详细信息以及数据集的链接,请访问https://github.com/mipi-challenge/mipi2022。
translated by 谷歌翻译
近年来,压缩图像超分辨率已引起了极大的关注,其中图像被压缩伪像和低分辨率伪影降解。由于复杂的杂化扭曲变形,因此很难通过简单的超分辨率和压缩伪像消除掉的简单合作来恢复扭曲的图像。在本文中,我们向前迈出了一步,提出了层次的SWIN变压器(HST)网络,以恢复低分辨率压缩图像,该图像共同捕获分层特征表示并分别用SWIN Transformer增强每个尺度表示。此外,我们发现具有超分辨率(SR)任务的预处理对于压缩图像超分辨率至关重要。为了探索不同的SR预审查的影响,我们将常用的SR任务(例如,比科比奇和不同的实际超分辨率仿真)作为我们的预处理任务,并揭示了SR在压缩的图像超分辨率中起不可替代的作用。随着HST和预训练的合作,我们的HST在AIM 2022挑战中获得了低质量压缩图像超分辨率轨道的第五名,PSNR为23.51db。广泛的实验和消融研究已经验证了我们提出的方法的有效性。
translated by 谷歌翻译
This paper reviews the first challenge on single image super-resolution (restoration of rich details in an low resolution image) with focus on proposed solutions and results.A new DIVerse 2K resolution image dataset (DIV2K) was employed. The challenge had 6 competitions divided into 2 tracks with 3 magnification factors each. Track 1 employed the standard bicubic downscaling setup, while Track 2 had unknown downscaling operators (blur kernel and decimation) but learnable through low and high res train images. Each competition had ∼ 100 registered participants and 20 teams competed in the final testing phase. They gauge the state-of-the-art in single image super-resolution.
translated by 谷歌翻译
基于变压器的方法与基于CNN的方法相比,由于其对远程依赖性的模型,因此获得了令人印象深刻的图像恢复性能。但是,像Swinir这样的进步采用了基于窗口的和本地注意力的策略来平衡性能和计算开销,这限制了采用大型接收领域来捕获全球信息并在早期层中建立长期依赖性。为了进一步提高捕获全球信息的效率,在这项工作中,我们建议Swinfir通过更换具有整个图像范围的接收场的快速傅立叶卷积(FFC)组件来扩展Swinir。我们还重新访问其他先进技术,即数据增强,预训练和功能集合,以改善图像重建的效果。并且我们的功能合奏方法使模型的性能得以大大增强,而无需增加训练和测试时间。与现有方法相比,我们将算法应用于多个流行的大规模基准,并实现了最先进的性能。例如,我们的Swinfir在漫画109数据集上达到了32.83 dB的PSNR,该PSNR比最先进的Swinir方法高0.8 dB。
translated by 谷歌翻译
随着移动设备的普及,例如智能手机和可穿戴设备,更轻,更快的型号对于应用视频超级分辨率至关重要。但是,大多数以前的轻型模型倾向于集中于减少台式GPU模型推断的范围,这在当前的移动设备中可能不会节能。在本文中,我们提出了极端低功率超级分辨率(ELSR)网络,该网络仅在移动设备中消耗少量的能量。采用预训练和填充方法来提高极小模型的性能。广泛的实验表明,我们的方法在恢复质量和功耗之间取得了良好的平衡。最后,我们在目标总经理Dimenty 9000 PlantForm上,PSNR 27.34 dB和功率为0.09 w/30fps的竞争分数为90.9,在移动AI&AIM 2022实时视频超级分辨率挑战中排名第一。
translated by 谷歌翻译
压缩在通过限制系统(例如流媒体服务,虚拟现实或视频游戏)等系统的有效传输和存储图像和视频中起着重要作用。但是,不可避免地会导致伪影和原始信息的丢失,这可能会严重降低视觉质量。由于这些原因,压缩图像的质量增强已成为流行的研究主题。尽管大多数最先进的图像恢复方法基于卷积神经网络,但基于Swinir等其他基于变压器的方法在这些任务上表现出令人印象深刻的性能。在本文中,我们探索了新型的Swin Transformer V2,以改善图像超分辨率的Swinir,尤其是压缩输入方案。使用这种方法,我们可以解决训练变压器视觉模型中的主要问题,例如训练不稳定性,预训练和微调之间的分辨率差距以及数据饥饿。我们对三个代表性任务进行实验:JPEG压缩伪像去除,图像超分辨率(经典和轻巧)以及压缩的图像超分辨率。实验结果表明,我们的方法SWIN2SR可以改善SWINIR的训练收敛性和性能,并且是“ AIM 2022挑战压缩图像和视频的超分辨率”的前5个解决方案。
translated by 谷歌翻译
本文报告了NTIRE 2022关于感知图像质量评估(IQA)的挑战,并与CVPR 2022的图像恢复和增强研讨会(NTIRE)研讨会(NTIRE)讲习班的新趋势举行。感知图像处理算法。这些算法的输出图像与传统扭曲具有完全不同的特征,并包含在此挑战中使用的PIP数据集中。这个挑战分为两条曲目,一个类似于以前的NTIRE IQA挑战的全参考IQA轨道,以及一条侧重于No-Reference IQA方法的新曲目。挑战有192和179名注册参与者的两条曲目。在最后的测试阶段,有7和8个参与的团队提交了模型和事实表。几乎所有这些都比现有的IQA方法取得了更好的结果,并且获胜方法可以证明最先进的性能。
translated by 谷歌翻译
Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and propose the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models was evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating an up to 500 FPS rate and 0.2 [Watt / 30 FPS] power consumption. A detailed description of all models developed in the challenge is provided in this paper.
translated by 谷歌翻译
The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon's 8 Gen 1 GPU that provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
translated by 谷歌翻译
本文提出了解码器 - 侧交叉分辨率合成(CRS)模块,以追求更好的压缩效率超出最新的通用视频编码(VVC),在那里我们在原始高分辨率(HR)处编码帧内帧,以较低的分辨率压缩帧帧间( LR),然后通过在先前的HR帧内和相邻的LR帧间帧内解解码LR帧间帧间帧帧。对于LR帧间帧,设计运动对准和聚合网络(MAN)以产生时间汇总的运动表示,以最佳保证时间平滑度;使用另一个纹理补偿网络(TCN)来生成从解码的HR帧内帧的纹理表示,以便更好地增强空间细节;最后,相似性驱动的融合引擎将运动和纹理表示合成为Upscale LR帧帧,以便去除压缩和分辨率重新采样噪声。我们使用所提出的CRS增强VVC,显示平均为8.76%和11.93%BJ {\ O} NTEGAARD Delta率(BD速率)分别在随机接入(RA)和低延延迟P(LDP)设置中的最新VVC锚点。此外,对基于最先进的超分辨率(SR)的VVC增强方法和消融研究的实验比较,进一步报告了所提出的算法的卓越效率和泛化。所有材料都将在HTTPS://njuvision.github.io /crs上公开进行可重复的研究。
translated by 谷歌翻译
Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding those of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove its excellence by winning the NTIRE2017 Super-Resolution Challenge [26].
translated by 谷歌翻译
Reference-based image super-resolution (RefSR) is a promising SR branch and has shown great potential in overcoming the limitations of single image super-resolution. While previous state-of-the-art RefSR methods mainly focus on improving the efficacy and robustness of reference feature transfer, it is generally overlooked that a well reconstructed SR image should enable better SR reconstruction for its similar LR images when it is referred to as. Therefore, in this work, we propose a reciprocal learning framework that can appropriately leverage such a fact to reinforce the learning of a RefSR network. Besides, we deliberately design a progressive feature alignment and selection module for further improving the RefSR task. The newly proposed module aligns reference-input images at multi-scale feature spaces and performs reference-aware feature selection in a progressive manner, thus more precise reference features can be transferred into the input features and the network capability is enhanced. Our reciprocal learning paradigm is model-agnostic and it can be applied to arbitrary RefSR models. We empirically show that multiple recent state-of-the-art RefSR models can be consistently improved with our reciprocal learning paradigm. Furthermore, our proposed model together with the reciprocal learning strategy sets new state-of-the-art performances on multiple benchmarks.
translated by 谷歌翻译
随着卷积神经网络最近的大规模发展,已经提出了用于边缘设备上实用部署的大量基于CNN的显着图像超分辨率方法。但是,大多数现有方法都集中在一个特定方面:网络或损失设计,这导致难以最大程度地减少模型大小。为了解决这个问题,我们得出结论,设计,架构搜索和损失设计,以获得更有效的SR结构。在本文中,我们提出了一个名为EFDN的边缘增强功能蒸馏网络,以保留在约束资源下的高频信息。详细说明,我们基于现有的重新处理方法构建了一个边缘增强卷积块。同时,我们提出了边缘增强的梯度损失,以校准重新分配的路径训练。实验结果表明,我们的边缘增强策略可以保持边缘并显着提高最终恢复质量。代码可在https://github.com/icandle/efdn上找到。
translated by 谷歌翻译
面部超分辨率(FSR),也称为面部幻觉,其旨在增强低分辨率(LR)面部图像以产生高分辨率(HR)面部图像的分辨率,是特定于域的图像超分辨率问题。最近,FSR获得了相当大的关注,并目睹了深度学习技术的发展炫目。迄今为止,有很少有基于深入学习的FSR的研究摘要。在本次调查中,我们以系统的方式对基于深度学习的FSR方法进行了全面审查。首先,我们总结了FSR的问题制定,并引入了流行的评估度量和损失功能。其次,我们详细说明了FSR中使用的面部特征和流行数据集。第三,我们根据面部特征的利用大致分类了现有方法。在每个类别中,我们从设计原则的一般描述开始,然后概述代表方法,然后讨论其中的利弊。第四,我们评估了一些最先进的方法的表现。第五,联合FSR和其他任务以及与FSR相关的申请大致介绍。最后,我们设想了这一领域进一步的技术进步的前景。在\ URL {https://github.com/junjun-jiang/face-hallucination-benchmark}上有一个策划的文件和资源的策划文件和资源清单
translated by 谷歌翻译
图像超分辨率(SR)是重要的图像处理方法之一,可改善计算机视野领域的图像分辨率。在过去的二十年中,在超级分辨率领域取得了重大进展,尤其是通过使用深度学习方法。这项调查是为了在深度学习的角度进行详细的调查,对单像超分辨率的最新进展进行详细的调查,同时还将告知图像超分辨率的初始经典方法。该调查将图像SR方法分类为四个类别,即经典方法,基于学习的方法,无监督学习的方法和特定领域的SR方法。我们还介绍了SR的问题,以提供有关图像质量指标,可用参考数据集和SR挑战的直觉。使用参考数据集评估基于深度学习的方法。一些审查的最先进的图像SR方法包括增强的深SR网络(EDSR),周期循环gan(Cincgan),多尺度残留网络(MSRN),Meta残留密度网络(META-RDN) ,反复反射网络(RBPN),二阶注意网络(SAN),SR反馈网络(SRFBN)和基于小波的残留注意网络(WRAN)。最后,这项调查以研究人员将解决SR的未来方向和趋势和开放问题的未来方向和趋势。
translated by 谷歌翻译
Convolutional neural networks have recently demonstrated high-quality reconstruction for single-image superresolution. In this paper, we propose the Laplacian Pyramid Super-Resolution Network (LapSRN) to progressively reconstruct the sub-band residuals of high-resolution images. At each pyramid level, our model takes coarse-resolution feature maps as input, predicts the high-frequency residuals, and uses transposed convolutions for upsampling to the finer level. Our method does not require the bicubic interpolation as the pre-processing step and thus dramatically reduces the computational complexity. We train the proposed LapSRN with deep supervision using a robust Charbonnier loss function and achieve high-quality reconstruction. Furthermore, our network generates multi-scale predictions in one feed-forward pass through the progressive reconstruction, thereby facilitates resource-aware applications. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of speed and accuracy.
translated by 谷歌翻译
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-ofthe-art single image super-resolution approaches.
translated by 谷歌翻译