在过去的几十年中,已经进行了许多尝试来解决从其相应的低分辨率(LR)对应物中恢复高分辨率(HR)面部形象的问题,这是通常被称为幻觉的任务。尽管通过位置补丁和基于深度学习的方法实现了令人印象深刻的性能,但大多数技术仍然无法恢复面孔的特定特定功能。前一组算法通常在存在更高水平的降解存在下产生模糊和过天气输出,而后者产生的面部有时绝不使得输入图像中的个体类似于个体。在本文中,将引入一种新的面部超分辨率方法,其中幻觉面被迫位于可用训练面跨越的子空间中。因此,与大多数现有面的幻觉技术相比,由于这种面部子空间之前,重建是为了回收特定人的面部特征,而不是仅仅增加图像定量分数。此外,通过最近的3D面部重建领域的进步启发,还呈现了一种有效的3D字典对齐方案,通过该方案,该算法能够处理在不受控制的条件下拍摄的低分辨率面。在几个众所周知的面部数据集上进行的广泛实验中,所提出的算法通过生成详细和接近地面真理结果来显示出色的性能,这在定量和定性评估中通过显着的边距来实现了最先进的面部幻觉算法。
translated by 谷歌翻译
面部超分辨率(FSR),也称为面部幻觉,其旨在增强低分辨率(LR)面部图像以产生高分辨率(HR)面部图像的分辨率,是特定于域的图像超分辨率问题。最近,FSR获得了相当大的关注,并目睹了深度学习技术的发展炫目。迄今为止,有很少有基于深入学习的FSR的研究摘要。在本次调查中,我们以系统的方式对基于深度学习的FSR方法进行了全面审查。首先,我们总结了FSR的问题制定,并引入了流行的评估度量和损失功能。其次,我们详细说明了FSR中使用的面部特征和流行数据集。第三,我们根据面部特征的利用大致分类了现有方法。在每个类别中,我们从设计原则的一般描述开始,然后概述代表方法,然后讨论其中的利弊。第四,我们评估了一些最先进的方法的表现。第五,联合FSR和其他任务以及与FSR相关的申请大致介绍。最后,我们设想了这一领域进一步的技术进步的前景。在\ URL {https://github.com/junjun-jiang/face-hallucination-benchmark}上有一个策划的文件和资源的策划文件和资源清单
translated by 谷歌翻译
Face Restoration (FR) aims to restore High-Quality (HQ) faces from Low-Quality (LQ) input images, which is a domain-specific image restoration problem in the low-level computer vision area. The early face restoration methods mainly use statistic priors and degradation models, which are difficult to meet the requirements of real-world applications in practice. In recent years, face restoration has witnessed great progress after stepping into the deep learning era. However, there are few works to study deep learning-based face restoration methods systematically. Thus, this paper comprehensively surveys recent advances in deep learning techniques for face restoration. Specifically, we first summarize different problem formulations and analyze the characteristic of the face image. Second, we discuss the challenges of face restoration. Concerning these challenges, we present a comprehensive review of existing FR methods, including prior based methods and deep learning-based methods. Then, we explore developed techniques in the task of FR covering network architectures, loss functions, and benchmark datasets. We also conduct a systematic benchmark evaluation on representative methods. Finally, we discuss future directions, including network designs, metrics, benchmark datasets, applications,etc. We also provide an open-source repository for all the discussed methods, which is available at https://github.com/TaoWangzj/Awesome-Face-Restoration.
translated by 谷歌翻译
图像超分辨率(SR)是重要的图像处理方法之一,可改善计算机视野领域的图像分辨率。在过去的二十年中,在超级分辨率领域取得了重大进展,尤其是通过使用深度学习方法。这项调查是为了在深度学习的角度进行详细的调查,对单像超分辨率的最新进展进行详细的调查,同时还将告知图像超分辨率的初始经典方法。该调查将图像SR方法分类为四个类别,即经典方法,基于学习的方法,无监督学习的方法和特定领域的SR方法。我们还介绍了SR的问题,以提供有关图像质量指标,可用参考数据集和SR挑战的直觉。使用参考数据集评估基于深度学习的方法。一些审查的最先进的图像SR方法包括增强的深SR网络(EDSR),周期循环gan(Cincgan),多尺度残留网络(MSRN),Meta残留密度网络(META-RDN) ,反复反射网络(RBPN),二阶注意网络(SAN),SR反馈网络(SRFBN)和基于小波的残留注意网络(WRAN)。最后,这项调查以研究人员将解决SR的未来方向和趋势和开放问题的未来方向和趋势。
translated by 谷歌翻译
最近的深面幻觉方法显示出令人惊叹的超级分辨面部图像,甚至超过人类能力。但是,这些算法主要在非公共合成数据集上评估。因此,尚不清楚这些算法如何在公共面幻觉数据集上执行。同时,大多数现有数据集都不太考虑种族的分布,这使得在这些数据集上训练的面部幻觉方法偏向于某些特定种族。为了解决上述两个问题,在本文中,我们构建了一个公共种族多样化的面部数据集,Edface-Celeb-1M,并设计了面部幻觉的基准任务。我们的数据集包括170万张覆盖不同国家 /地区的照片,并具有平衡的种族组成。据我们所知,它是野外最大且公开的面部幻觉数据集。与该数据集相关联,本文还贡献了各种评估协议,并提供了全面的分析,以基于现有的最新方法。基准评估证明了最新算法的性能和局限性。
translated by 谷歌翻译
横梁面部识别(CFR)旨在识别个体,其中比较面部图像源自不同的感测模式,例如红外与可见的。虽然CFR由于与模态差距相关的面部外观的显着变化,但CFR具有比经典的面部识别更具挑战性,但它在具有有限或挑战的照明的场景中,以及在呈现攻击的情况下,它是优越的。与卷积神经网络(CNNS)相关的人工智能最近的进展使CFR的显着性能提高了。由此激励,这项调查的贡献是三倍。我们提供CFR的概述,目标是通过首先正式化CFR然后呈现具体相关的应用来比较不同光谱中捕获的面部图像。其次,我们探索合适的谱带进行识别和讨论最近的CFR方法,重点放在神经网络上。特别是,我们提出了提取和比较异构特征以及数据集的重新访问技术。我们枚举不同光谱和相关算法的优势和局限性。最后,我们讨论了研究挑战和未来的研究线。
translated by 谷歌翻译
Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them were applied in the industrial community and played an important role in human life such as device unlock, mobile payment, and so on. This paper provides an introduction to face recognition, including its history, pipeline, algorithms based on conventional manually designed features or deep learning, mainstream training, evaluation datasets, and related applications. We have analyzed and compared state-of-the-art works as many as possible, and also carefully designed a set of experiments to find the effect of backbone size and data distribution. This survey is a material of the tutorial named The Practical Face Recognition Technology in the Industrial World in the FG2023.
translated by 谷歌翻译
本文介绍了在混合高斯 - 突破噪声条件下重建高分辨率(HR)LF图像的GPU加速计算框架。主要重点是考虑处理速度和重建质量的高性能方法。从统计的角度来看,我们得出了一个联合$ \ ell^1 $ - $ \ ell^2 $数据保真度,用于惩罚人力资源重建错误,考虑到混合噪声情况。对于正则化,我们采用了加权非本地总变异方法,这使我们能够通过适当的加权方案有效地实现LF图像。我们表明,乘数算法(ADMM)的交替方向方法可用于简化计算复杂性,并在GPU平台上导致高性能并行计算。对合成4D LF数据集和自然图像数据集进行了广泛的实验,以验证提出的SR模型的鲁棒性并评估加速优化器的性能。实验结果表明,与最先进的方法相比,我们的方法在严重的混合噪声条件下实现了更好的重建质量。此外,提议的方法克服了处理大规模SR任务的先前工作的局限性。虽然适合单个现成的GPU,但建议的加速器提供的平均加速度为2.46 $ \ times $和1.57 $ \ times $,分别为$ \ times 2 $和$ \ times 3 $ SR任务。此外,与CPU执行相比,达到$ 77 \ times $的加速。
translated by 谷歌翻译
Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR) which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack highfrequency textures and do not look natural despite yielding high PSNR values.We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixelaccurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.
translated by 谷歌翻译
在实践中,图像可以包含不同颜色通道的不同噪声,这不受现有的超分辨率方法确认。在本文中,我们通过关注颜色通道来提出超声噪音图像。噪声统计从输入的低分辨率图像盲目地估计,并且用于以数据成本为不同颜色信道分配不同权重。通过与自适应权重相关联的核规范最小化,通过核标准最小化强制强制执行视觉数据的隐式低秩结构,这将作为正则化术语添加到成本中。另外,通过涉及投影到PCA的另一个正则化术语将图像的多尺度细节添加到模型中,该术语是使用在输入图像的不同尺度上提取的类似斑块构造的。结果展示了在实际方案中的方法的超声解决能力。
translated by 谷歌翻译
尽管基准数据集的成功,但大多数先进的面部超分辨率模型在真实情况下表现不佳,因为真实图像与合成训练对之间的显着域间隙。为了解决这个问题,我们提出了一种用于野外面部超分辨率的新型域 - 自适应降级网络。该降级网络预测流场以及中间低分辨率图像。然后,通过翘曲中间图像来生成降级的对应物。利用捕获运动模糊的偏好,这种模型在保护原始图像和劣化之间保持身份一致性更好地执行。我们进一步提出了超分辨率网络的自我调节块。该块将输入图像作为条件术语,以有效地利用面部结构信息,从而消除了对显式前沿的依赖性,例如,面部地标或边界。我们的模型在Celeba和真实世界的面部数据集上实现了最先进的性能。前者展示了我们所提出的建筑的强大生成能力,而后者展示了现实世界中的良好的身份一致性和感知品质。
translated by 谷歌翻译
Image Super-Resolution (SR) is essential for a wide range of computer vision and image processing tasks. Investigating infrared (IR) image (or thermal images) super-resolution is a continuing concern within the development of deep learning. This survey aims to provide a comprehensive perspective of IR image super-resolution, including its applications, hardware imaging system dilemmas, and taxonomy of image processing methodologies. In addition, the datasets and evaluation metrics in IR image super-resolution tasks are also discussed. Furthermore, the deficiencies in current technologies and possible promising directions for the community to explore are highlighted. To cope with the rapid development in this field, we intend to regularly update the relevant excellent work at \url{https://github.com/yongsongH/Infrared_Image_SR_Survey
translated by 谷歌翻译
盲人恢复通常会遇到各种规模的面孔输入,尤其是在现实世界中。但是,当前的大多数作品都支持特定的规模面,这限制了其在现实情况下的应用能力。在这项工作中,我们提出了一个新颖的尺度感知盲人面部修复框架,名为FaceFormer,该框架将面部特征恢复作为比例感知转换。所提出的面部特征上采样(FFUP)模块基于原始的比例比例动态生成UPSMPLING滤波器,这有助于我们的网络适应任意面部尺度。此外,我们进一步提出了面部特征嵌入(FFE)模块,该模块利用变压器来层次提取面部潜在的多样性和鲁棒性。因此,我们的脸部形式实现了富裕性和稳健性,恢复了面部的面孔,对面部成分具有现实和对称的细节。广泛的实验表明,我们提出的使用合成数据集训练的方法比当前的最新图像更好地推广到天然低质量的图像。
translated by 谷歌翻译
从单个图像重建高保真3D面部纹理是一个具有挑战性的任务,因为缺乏完整的面部信息和3D面和2D图像之间的域间隙。最新作品通过应用基于代或基于重建的方法来解决面部纹理重建问题。尽管各种方法具有自身的优势,但它们不能恢复高保真和可重新可传送的面部纹理,其中术语“重新可调剂”要求面部质地在空间地完成和与环境照明中脱颖而出。在本文中,我们提出了一种新颖的自我监督学习框架,用于从野外的单视图重建高质量的3D面。我们的主要思想是首先利用先前的一代模块来生产先前的Albedo,然后利用细节细化模块来获得详细的Albedo。为了进一步使面部纹理解开照明,我们提出了一种新颖的详细的照明表示,该表现在一起与详细的Albedo一起重建。我们还在反照侧和照明方面设计了几种正规化损失功能,以便于解散这两个因素。最后,由于可怜的渲染技术,我们的神经网络可以以自我监督的方式有效地培训。关于具有挑战性的数据集的广泛实验表明,我们的框架在定性和定量比较方面显着优于最先进的方法。
translated by 谷歌翻译
The primary aim of single-image super-resolution is to construct a high-resolution (HR) image from a corresponding low-resolution (LR) input. In previous approaches, which have generally been supervised, the training objective typically measures a pixel-wise average distance between the super-resolved (SR) and HR images. Optimizing such metrics often leads to blurring, especially in high variance (detailed) regions. We propose an alternative formulation of the super-resolution problem based on creating realistic SR images that downscale correctly. We present a novel super-resolution algorithm addressing this problem, PULSE (Photo Upsampling via Latent Space Exploration), which generates high-resolution, realistic images at resolutions previously unseen in the literature. It accomplishes this in an entirely self-supervised fashion and is not confined to a specific degradation operator used during training, unlike previous methods (which require training on databases of LR-HR image pairs for supervised learning). Instead of starting with the LR image and slowly adding detail, PULSE traverses the high-resolution natural image manifold, searching for images that downscale to the original LR image. This is formalized through the "downscaling loss," which guides exploration through the latent space of a generative model. By leveraging properties of high-dimensional Gaussians, we restrict the search space to guarantee that our outputs are realistic. PULSE thereby generates super-resolved images that both are realistic and downscale correctly. We show extensive experimental results demonstrating the efficacy of our approach in the domain of face super-resolution (also known as face hallucination). We also present a discussion of the limitations and biases of the method as currently implemented with an accompanying model card with relevant metrics. Our method outperforms state-of-the-art methods in perceptual quality at higher resolutions and scale factors than previously pos-sible.
translated by 谷歌翻译
在本文中,我们探讨了一个有趣的问题,即从$ 8 \ times8 $ Pixel视频序列中获得什么。令人惊讶的是,事实证明很多。我们表明,当我们处理此$ 8 \ times8 $视频带有正确的音频和图像先验时,我们可以获得全长的256 \ times256 $视频。我们使用新颖的视听UPPRAPLING网络实现了极低分辨率输入的$ 32 \ times $缩放。音频先验有助于恢复元素面部细节和精确的唇形,而单个高分辨率目标身份图像先验为我们提供了丰富的外观细节。我们的方法是端到端的多阶段框架。第一阶段会产生一个粗糙的中间输出视频,然后可用于动画单个目标身份图像并生成逼真,准确和高质量的输出。我们的方法很简单,并且与以前的超分辨率方法相比,表现非常好($ 8 \ times $改善了FID得分)。我们还将模型扩展到了谈话视频压缩,并表明我们在以前的最新时间上获得了$ 3.5 \ times $的改进。通过广泛的消融实验(在论文和补充材料中)对我们网络的结果进行了彻底的分析。我们还在我们的网站上提供了演示视频以及代码和模型:\ url {http://cvit.iiit.ac.in/research/project/projects/cvit-projects/talking-face-vace-video-upsmpling}。
translated by 谷歌翻译
盲图修复(IR)是计算机视觉中常见但充满挑战的问题。基于经典模型的方法和最新的深度学习(DL)方法代表了有关此问题的两种不同方法,每种方法都有自己的优点和缺点。在本文中,我们提出了一种新颖的盲图恢复方法,旨在整合它们的两种优势。具体而言,我们为盲IR构建了一个普通的贝叶斯生成模型,该模型明确描绘了降解过程。在此提出的模型中,PICEL的非I.I.D。高斯分布用于适合图像噪声。它的灵活性比简单的I.I.D。在大多数常规方法中采用的高斯或拉普拉斯分布,以处理图像降解中包含的更复杂的噪声类型。为了解决该模型,我们设计了一个变异推理算法,其中所有预期的后验分布都被参数化为深神经网络,以提高其模型能力。值得注意的是,这种推论算法诱导统一的框架共同处理退化估计和图像恢复的任务。此外,利用了前一种任务中估计的降解信息来指导后一种红外过程。对两项典型的盲型IR任务进行实验,即图像降解和超分辨率,表明所提出的方法比当前最新的方法实现了卓越的性能。
translated by 谷歌翻译
虽然最近基于模型的盲目单图像超分辨率(SISR)的研究已经取得了巨大的成功,但大多数人都不认为图像劣化。首先,它们总是假设图像噪声obeys独立和相同分布的(i.i.d.)高斯或拉普拉斯分布,这在很大程度上低估了真实噪音的复杂性。其次,以前的常用核前沿(例如,归一化,稀疏性)不足以保证理性内核解决方案,从而退化后续SISR任务的性能。为了解决上述问题,本文提出了一种基于模型的盲人SISR方法,该方法在概率框架下,从噪声和模糊内核的角度精心模仿图像劣化。具体而言,而不是传统的i.i.d.噪声假设,基于补丁的非i.i.d。提出噪声模型来解决复杂的真实噪声,期望增加噪声表示模型的自由度。至于模糊内核,我们新建构建一个简洁但有效的内核生成器,并将其插入所提出的盲人SISR方法作为明确的内核(EKP)。为了解决所提出的模型,专门设计了理论上接地的蒙特卡罗EM算法。综合实验证明了我们对综合性和实时数据集的最新技术的方法的优越性。
translated by 谷歌翻译
Convolutional Neural Network (CNN)-based image super-resolution (SR) has exhibited impressive success on known degraded low-resolution (LR) images. However, this type of approach is hard to hold its performance in practical scenarios when the degradation process is unknown. Despite existing blind SR methods proposed to solve this problem using blur kernel estimation, the perceptual quality and reconstruction accuracy are still unsatisfactory. In this paper, we analyze the degradation of a high-resolution (HR) image from image intrinsic components according to a degradation-based formulation model. We propose a components decomposition and co-optimization network (CDCN) for blind SR. Firstly, CDCN decomposes the input LR image into structure and detail components in feature space. Then, the mutual collaboration block (MCB) is presented to exploit the relationship between both two components. In this way, the detail component can provide informative features to enrich the structural context and the structure component can carry structural context for better detail revealing via a mutual complementary manner. After that, we present a degradation-driven learning strategy to jointly supervise the HR image detail and structure restoration process. Finally, a multi-scale fusion module followed by an upsampling layer is designed to fuse the structure and detail features and perform SR reconstruction. Empowered by such degradation-based components decomposition, collaboration, and mutual optimization, we can bridge the correlation between component learning and degradation modelling for blind SR, thereby producing SR results with more accurate textures. Extensive experiments on both synthetic SR datasets and real-world images show that the proposed method achieves the state-of-the-art performance compared to existing methods.
translated by 谷歌翻译
Deconvolution is a widely used strategy to mitigate the blurring and noisy degradation of hyperspectral images~(HSI) generated by the acquisition devices. This issue is usually addressed by solving an ill-posed inverse problem. While investigating proper image priors can enhance the deconvolution performance, it is not trivial to handcraft a powerful regularizer and to set the regularization parameters. To address these issues, in this paper we introduce a tuning-free Plug-and-Play (PnP) algorithm for HSI deconvolution. Specifically, we use the alternating direction method of multipliers (ADMM) to decompose the optimization problem into two iterative sub-problems. A flexible blind 3D denoising network (B3DDN) is designed to learn deep priors and to solve the denoising sub-problem with different noise levels. A measure of 3D residual whiteness is then investigated to adjust the penalty parameters when solving the quadratic sub-problems, as well as a stopping criterion. Experimental results on both simulated and real-world data with ground-truth demonstrate the superiority of the proposed method.
translated by 谷歌翻译