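Deep-learning-based single image super-resolution (SISR) methods have attracted much attention and achieved remarkable success on modern high-end GPUs. However, most state-of-the-art methods require a large number of parameters, a lot of memory, and substantial computational resources, which usually leads to slow inference when they are run on current mobile-device CPUs/NPUs. In this paper, we propose a simple plain convolutional network with a fast nearest convolution module (NCNet), which is NPU-friendly and can perform reliable super-resolution in real time. The proposed nearest convolution achieves the same result as nearest upsampling but is much faster and better suited to Android NNAPI. Our model can be easily deployed on mobile devices with 8-bit quantization and is fully compatible with all major mobile AI accelerators. Moreover, we conduct comprehensive experiments on different tensor operations on a mobile device to illustrate the efficiency of our network architecture. Our NCNet is trained and validated on the DIV2K 3x dataset, and a comparison with other efficient SR methods shows that NCNet achieves high-fidelity SR results with less inference time. Our code and pretrained models are publicly available at https://github.com/algolzw/ncnet.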
translated by 谷歌翻译
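The NCNet abstract above states that a nearest convolution gives the same result as nearest upsampling. A minimal sketch of one way this can hold is given below, assuming the operation is a fixed (non-trainable) 1x1 convolution that copies every channel r*r times, followed by a depth-to-space rearrangement. The function names and the NumPy formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nearest_upsample(x, r):
    """Reference nearest-neighbor upsampling of an (H, W, C) array by factor r."""
    return np.repeat(np.repeat(x, r, axis=0), r, axis=1)

def nearest_conv_upsample(x, r):
    """Same result via a fixed 1x1 convolution plus depth-to-space (assumed design)."""
    h, w, c = x.shape
    # Fixed 1x1 kernel of shape (C, C*r*r): output channel k*C + j copies
    # input channel j, for each of the r*r sub-pixel positions k.
    kernel = np.tile(np.eye(c, dtype=x.dtype), (1, r * r))
    y = x @ kernel  # a 1x1 convolution is a per-pixel matrix multiply
    # Depth-to-space: move the r*r channel groups into an r x r spatial block.
    y = y.reshape(h, w, r, r, c)
    y = y.transpose(0, 2, 1, 3, 4)
    return y.reshape(h * r, w * r, c)

x = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)
same = np.array_equal(nearest_conv_upsample(x, 3), nearest_upsample(x, 3))
```

Expressing the upsampling as a convolution followed by depth-to-space keeps the whole path inside operations that mobile accelerators typically support, which is the motivation the abstract gives for preferring it on Android NNAPI.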
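Video super-resolution (VSR) is the task of restoring high-resolution frames from a sequence of low-resolution inputs. Unlike single image super-resolution, VSR can exploit the temporal information across frames to reconstruct results with more detail. Recently, with the rapid development of convolutional neural networks (CNNs), the VSR task has drawn increasing attention, and many CNN-based methods have achieved remarkable results. However, because of limits on computational resources and runtime, only a few VSR methods can be applied to real-world mobile devices. In this paper, we propose a Sliding Window based Recurrent Network (SWRN) that can run inference in real time while still achieving superior performance. Specifically, we observe that video frames have both spatial and temporal relations that can help recover details, and that the key point is how to extract and aggregate this information. To address this, we take three neighboring frames as input and use a hidden state to recurrently store and update the important temporal information. Our experiments on the REDS dataset show that the proposed method adapts well to mobile devices and produces visually pleasing results.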
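The recurrence described in the abstract above, with three neighboring frames per step and a hidden state carried forward, can be sketched as the loop below. The `cell` argument is a hypothetical stand-in for the network, and clamping the window at the sequence boundaries is an assumption made for illustration; the abstract does not specify either.

```python
def window_indices(num_frames):
    """Index triples (previous, current, next), clamped at the sequence ends."""
    last = num_frames - 1
    return [(max(t - 1, 0), t, min(t + 1, last)) for t in range(num_frames)]

def run_sliding_window_recurrent(frames, cell, hidden):
    """Apply `cell` to each 3-frame window, threading the hidden state through.

    `cell(prev, cur, nxt, hidden)` must return `(output, new_hidden)`.
    It is a hypothetical placeholder for the actual network.
    """
    outputs = []
    for prev_i, cur_i, next_i in window_indices(len(frames)):
        out, hidden = cell(frames[prev_i], frames[cur_i], frames[next_i], hidden)
        outputs.append(out)
    return outputs, hidden

# Toy usage: numbers stand in for frames, and the "network" just sums its inputs.
toy_cell = lambda p, c, n, h: (p + c + n + h, h + 1)
outputs, final_hidden = run_sliding_window_recurrent([1, 2, 3], toy_cell, 0)
```

Each output depends only on a short window plus the hidden state, so per-frame cost stays constant regardless of video length.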
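We propose a lightweight single image super-resolution network for mobile devices, named XCAT. XCAT introduces Heterogeneous Group Convolution Blocks with Cross Concatenations (HXBlock). The heterogeneous split of the input channels into the group convolution blocks reduces the number of operations, and cross concatenation allows information to flow between the intermediate input tensors of cascaded HXBlocks. Cross concatenation inside HXBlocks also avoids more expensive operations such as 1x1 convolutions. To further avoid expensive tensor copy operations, XCAT uses non-trainable convolution kernels to perform upsampling. XCAT is designed with integer quantization in mind and also uses several training techniques, such as intensity-based data augmentation. Integer-quantized XCAT operates in real time on a Mali-G71 MP2 GPU at 320 ms, and on a Synaptics Dolphin NPU at 30 ms (NCHW) and 8.8 ms (NHWC), making it suitable for real-time applications.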
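With the popularity of mobile devices such as smartphones and wearables, lighter and faster models are crucial for applying video super-resolution. However, most previous lightweight models concentrate on reducing inference latency on desktop GPUs, which may not be energy efficient on current mobile devices. In this paper, we propose the Extreme Low-Power Super Resolution (ELSR) network, which consumes only a small amount of energy on mobile devices. Pretraining and fine-tuning are used to boost the performance of this extremely small model. Extensive experiments show that our method achieves a good balance between restoration quality and power consumption. Finally, we achieve a competitive score of 90.9 with a PSNR of 27.34 dB and a power of 0.09 W/30 FPS on the target MediaTek Dimensity 9000 platform, ranking first in the Mobile AI & AIM 2022 Real-Time Video Super-Resolution Challenge.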
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to do high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating rates of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
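The INT8 models mentioned above rely on mapping floating-point values to 8-bit integers. A minimal sketch of the common affine scheme, real = scale * (q - zero_point), is shown below. The helper names are illustrative, the parameters are chosen from a simple min/max range, and none of this is taken from the challenge's actual tooling.

```python
import numpy as np

def choose_params(x_min, x_max):
    """Pick scale and zero point so [x_min, x_max] maps onto [-128, 127].

    Assumes x_max > x_min; the range is widened to include 0.0 so that
    zero is exactly representable.
    """
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / 255.0
    zero_point = int(round(-128 - x_min / scale))
    return scale, zero_point

def quantize_int8(x, scale, zero_point):
    """Map float values to int8 with rounding and saturation."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return scale * (q.astype(np.float32) - zero_point)

scale, zero_point = choose_params(0.0, 1.0)
x = np.linspace(0.0, 1.0, 11)
x_hat = dequantize_int8(quantize_int8(x, scale, zero_point), scale, zero_point)
max_error = float(np.max(np.abs(x_hat - x)))
```

For values inside the chosen range, the round-trip error is bounded by half a quantization step (scale / 2), which is why the range used for calibration matters for image quality.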
In recent years, deep learning methods have been successfully applied to single-image super-resolution tasks. Despite their strong performance, deep learning methods cannot be easily applied to real-world applications because of their heavy computational requirements. In this paper, we address this issue by proposing an accurate and lightweight deep network for image super-resolution. In detail, we design an architecture that implements a cascading mechanism upon a residual network. We also present variant models of the proposed cascading residual network to further improve efficiency. Our extensive experiments show that even with much fewer parameters and operations, our models achieve performance comparable to that of state-of-the-art methods.
Single Image Super-Resolution (SISR) tasks have achieved significant performance with deep neural networks. However, the large number of parameters in CNN-based methods for SISR tasks requires heavy computation. Although several efficient SISR models have been recently proposed, most are handcrafted and thus lack flexibility. In this work, we propose a novel differentiable Neural Architecture Search (NAS) approach on both the cell-level and network-level to search for lightweight SISR models. Specifically, the cell-level search space is designed based on an information distillation mechanism, focusing on the combinations of lightweight operations and aiming to build a more lightweight and accurate SR structure. The network-level search space is designed to consider the feature connections among the cells and aims to find which information flow benefits the cell most to boost the performance. Unlike the existing Reinforcement Learning (RL) or Evolutionary Algorithm (EA) based NAS methods for SISR tasks, our search pipeline is fully differentiable, and the lightweight SISR models can be efficiently searched on both the cell-level and network-level jointly on a single GPU. Experiments show that our methods can achieve state-of-the-art performance on the benchmark datasets in terms of PSNR, SSIM, and model complexity with merely 68G Multi-Adds for $\times 2$ and 18G Multi-Adds for $\times 4$ SR tasks.
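With the recent large-scale development of convolutional neural networks, numerous lightweight CNN-based image super-resolution methods have been proposed for practical deployment on edge devices. However, most existing methods focus on one specific aspect, either network design or loss design, which makes it difficult to minimize the model size. To address this issue, we combine block design, architecture search, and loss design to obtain a more efficient SR structure. In this paper, we propose an edge-enhanced feature distillation network, named EFDN, to preserve high-frequency information under constrained resources. In detail, we build an edge-enhanced convolution block based on existing reparameterization methods. Meanwhile, we propose an edge-enhanced gradient loss to calibrate the training of the reparameterized path. Experimental results show that our edge-enhanced strategies preserve edges and significantly improve the final restoration quality. Code is available at https://github.com/icandle/efdn.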
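Single-image super-resolution can support robotic tasks in environments where a reliable visual stream is required to monitor the mission, handle teleoperation, or study relevant visual details. In this work, we propose an efficient generative adversarial network model for real-time super-resolution. We adopt a tailored architecture of the original SRGAN together with model quantization to speed up execution on CPU and Edge TPU devices, achieving up to 200 fps inference. We further optimize our model by distilling its knowledge into a smaller version of the network, and obtain remarkable improvements over the standard training approach. Our experiments show that our fast and lightweight model preserves considerably satisfying image quality compared with heavier state-of-the-art models. Finally, we conduct experiments on image transmission with bandwidth degradation to highlight the advantages of the proposed system for mobile robotic applications.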
This paper reviews the first challenge on single image super-resolution (restoration of rich details in a low-resolution image) with a focus on the proposed solutions and results. A new DIVerse 2K resolution image dataset (DIV2K) was employed. The challenge had 6 competitions divided into 2 tracks with 3 magnification factors each. Track 1 employed the standard bicubic downscaling setup, while Track 2 had unknown downscaling operators (blur kernel and decimation) that were learnable from the low- and high-resolution training images. Each competition had ∼100 registered participants, and 20 teams competed in the final testing phase. Together, the results gauge the state of the art in single image super-resolution.
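Recent improvements in convolutional neural network (CNN)-based single image super-resolution (SISR) methods rely heavily on crafting network architectures, rather than on finding a suitable training algorithm beyond simply minimizing the regression loss. Adapting knowledge distillation (KD) can open a way to further improvements in SISR, and it is also beneficial in terms of model efficiency. KD is a model compression method that improves the performance of deep neural networks (DNNs) without using additional parameters at test time. It has recently attracted attention for providing a better capacity-performance trade-off. In this paper, we propose a novel feature distillation (FD) method suitable for SISR. We show the limitations of the existing FitNet-based FD method, which suffers in the SISR task, and propose to modify the existing FD algorithm to focus on local feature information. In addition, we propose a soft feature attention method based on the teacher-student difference that selectively focuses on specific pixel locations to extract feature information. We call our method local-selective feature distillation (LSFD) and validate that it outperforms conventional FD methods on SISR problems.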
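CNNs with strong learning abilities are widely chosen to solve the super-resolution problem. However, CNNs rely on deeper network architectures to improve image super-resolution performance, which may increase the computational cost. In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture that fully fuses deep and wide channel features to extract more accurate low-frequency information, based on the correlations between different channels, for single image super-resolution (SISR). In addition, a signal enhancement operation in ESRGCNN helps inherit more long-distance contextual information to resolve long-term dependency. An adaptive upsampling operation is integrated into the CNN to obtain an image super-resolution model that handles low-resolution images of different sizes. Extensive experiments show that our ESRGCNN surpasses the state of the art in terms of SISR performance, complexity, execution speed, image quality evaluation, and visual effect. Code is available at https://github.com/hellloxiaotian/esrgcnn.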
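With the advent of deep learning (DL), super-resolution (SR) has become a thriving research area. However, despite promising results, the field still faces challenges that require further research, such as allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We present a critical discussion of contemporary strategies used in SR and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field, such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations of the models and methods throughout each chapter to facilitate a global understanding of the trends in the field. Ultimately, this review aims to help researchers push the boundaries of DL applied to SR.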
Recently, great progress has been made in single-image super-resolution (SISR) based on deep learning technology. However, the existing methods usually require a large computational cost. Meanwhile, the activation function will cause some features of the intermediate layer to be lost. Therefore, it is a challenge to make the model lightweight while reducing the impact of intermediate feature loss on the reconstruction quality. In this paper, we propose a Feature Interaction Weighted Hybrid Network (FIWHN) to alleviate the above problem. Specifically, FIWHN consists of a series of novel Wide-residual Distillation Interaction Blocks (WDIB) as the backbone, where every three WDIBs form a Feature shuffle Weighted Group (FSWG) by mutual information mixing and fusion. In addition, to mitigate the adverse effects of intermediate feature loss on the reconstruction results, we introduce well-designed Wide Convolutional Residual Weighting (WCRW) and Wide Identical Residual Weighting (WIRW) units in WDIB, and effectively cross-fuse features of different finenesses through a Wide-residual Distillation Connection (WRDC) framework and a Self-Calibrating Fusion (SCF) unit. Finally, to complement the global features lacking in the CNN model, we introduce the Transformer into our model and explore a new way of combining the CNN and Transformer. Extensive quantitative and qualitative experiments on low-level and high-level tasks show that our proposed FIWHN can achieve a good balance between performance and efficiency, and is more conducive to downstream tasks to solve problems in low-pixel scenarios.
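Deep convolutional neural networks have been shown to be effective for SISR in recent years. On the one hand, residual connections and dense connections have been widely used to ease the forward flow of information and the backward flow of gradients and thereby boost performance. However, current methods use residual connections and dense connections separately in most network layers, which is sub-optimal. On the other hand, although various networks and methods have been designed to improve computational efficiency, save parameters, or use training data from multiple scale factors to boost each other's performance, they either perform super-resolution in HR space, which has a high computational cost, or cannot share parameters between models of different scale factors to save parameters and inference time. To tackle these challenges, we propose an efficient single image super-resolution network using dual path connections with multiple scale learning, named EMSRDPN. By introducing dual path connections, EMSRDPN uses residual connections and dense connections in an integrated way in most network layers. Dual path connections combine the benefit of reusing common features, as in residual connections, with that of exploring new features, as in dense connections, to learn a good representation for SISR. To exploit the feature correlation across multiple scale factors, EMSRDPN shares all network units in LR space between different scale factors to learn shared features, and uses a separate reconstruction unit only for each scale factor. This lets the training data of multiple scale factors help each other to boost performance, while saving parameters and supporting shared inference across multiple scale factors to improve efficiency. Experiments show that EMSRDPN achieves better performance and comparable or even better parameter and inference efficiency than SOTA methods.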
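In recent years, there have been several advances in the task of image super-resolution using state-of-the-art deep-learning-based architectures. Many previously published super-resolution techniques require high-end, top-of-the-line graphics processing units (GPUs) to perform image super-resolution. As deep learning approaches have advanced, neural networks have become more and more compute hungry. We take a step back and focus on creating a real-time, efficient solution. We present an architecture that is faster and has a smaller memory footprint. The proposed architecture uses depth-wise separable convolutions to extract features, and it performs on par with other super-resolution GANs (generative adversarial networks) while maintaining real-time inference and a low memory footprint. Real-time super-resolution makes it possible to stream high-resolution media content even under poor bandwidth conditions. While maintaining an efficient trade-off between accuracy and latency, we are able to produce a model with comparable performance that is one-eighth (1/8) the size of super-resolution GANs and computes 74 times faster than super-resolution GANs.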
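The parameter saving from depth-wise separable convolutions, mentioned in the abstract above, follows from simple counting. The sketch below compares a standard convolution with a depth-wise convolution followed by a 1x1 point-wise convolution. Biases are ignored, and the layer sizes are illustrative assumptions rather than values from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depth-wise conv plus a 1x1 point-wise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3x3 kernel, 64 input and 64 output channels.
standard = standard_conv_params(3, 64, 64)          # 36864
separable = depthwise_separable_params(3, 64, 64)   # 4672
```

The ratio between the two counts is 1/c_out + 1/k^2, so the saving grows with the kernel size and the number of output channels.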
The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU, which provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
Video super-resolution is one of the most popular tasks on mobile devices, being widely used for automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and poor power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and ask the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models were evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating rates of up to 500 FPS and a power consumption of 0.2 [Watt / 30 FPS]. A detailed description of all models developed in the challenge is provided in this paper.
Informative features play a crucial role in the single image super-resolution task. Channel attention has been demonstrated to be effective for preserving information-rich features in each layer. However, channel attention treats each convolution layer as a separate process that misses the correlation among different layers. To address this problem, we propose a new holistic attention network (HAN), which consists of a layer attention module (LAM) and a channel-spatial attention module (CSAM), to model the holistic interdependencies among layers, channels, and positions. Specifically, the proposed LAM adaptively emphasizes hierarchical features by considering correlations among layers. Meanwhile, CSAM learns the confidence at all the positions of each channel to selectively capture more informative features. Extensive experiments demonstrate that the proposed HAN performs favorably against the state-of-the-art single image super-resolution approaches.
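Since the first success of Dong et al., deep-learning-based approaches have become dominant in the field of single-image super-resolution. They replace all the handcrafted image processing steps of traditional sparse-coding-based methods with a deep neural network. In contrast to sparse-coding-based methods, which explicitly create high/low-resolution dictionaries, the dictionaries in deep-learning-based methods are acquired implicitly as a nonlinear combination of multiple convolutions. One disadvantage of deep-learning-based methods is that their performance degrades on images that differ from the training dataset (out-of-domain images). We propose an end-to-end super-resolution network with a deep dictionary (SRDD), in which a high-resolution dictionary is learned explicitly without sacrificing the advantages of deep learning. Extensive experiments show that explicit learning of the high-resolution dictionary makes the network more robust on out-of-domain test images while maintaining performance on in-domain test images.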