Recent works show that convolutional neural network (CNN) architectures have a spectral bias towards lower frequencies, which has been leveraged for various image restoration tasks in the Deep Image Prior (DIP) framework. The benefit of the inductive bias the network imposes in the DIP framework depends on the architecture. Therefore, researchers have studied how to automate the search to determine the best-performing model. However, common neural architecture search (NAS) techniques are resource and time intensive. Moreover, best-performing models are determined for a whole dataset of images instead of for each image independently, which would be prohibitively expensive. In this work, we first show that the optimal neural architecture in the DIP framework is image-dependent. Leveraging this insight, we then propose an image-specific NAS strategy for the DIP framework that requires substantially less training than typical NAS approaches, effectively enabling image-specific NAS. For a given image, noise is fed to a large set of untrained CNNs, and the power spectral densities (PSD) of their outputs are compared to that of the corrupted image using various metrics. Based on this, a small cohort of image-specific architectures is chosen and trained to reconstruct the corrupted image. Among this cohort, the model whose reconstruction is closest to the average of the reconstructed images is chosen as the final model. We justify the proposed strategy by (1) demonstrating its performance on a NAS dataset for DIP that includes 500+ models from a particular search space, and (2) conducting extensive experiments on image denoising, inpainting, and super-resolution tasks. Our experiments show that image-specific metrics can reduce the search space to a small cohort of models, of which the best model outperforms current NAS approaches for image restoration.
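A minimal sketch of the PSD comparison described above, assuming a radially averaged log-PSD and an L1 gap as the metric (the paper evaluates several metrics; this particular choice is an illustrative assumption):

```python
import numpy as np

def psd(img: np.ndarray) -> np.ndarray:
    """Power spectral density of a 2-D image via the FFT."""
    f = np.fft.fftshift(np.fft.fft2(img))
    return np.abs(f) ** 2

def radial_profile(p: np.ndarray) -> np.ndarray:
    """Radially averaged PSD, so spectra of different images are comparable."""
    h, w = p.shape
    y, x = np.indices((h, w))
    r = np.hypot(x - w // 2, y - h // 2).astype(int)
    counts = np.bincount(r.ravel())
    sums = np.bincount(r.ravel(), weights=p.ravel())
    return sums / np.maximum(counts, 1)

def psd_distance(output: np.ndarray, corrupted: np.ndarray) -> float:
    """One possible image-specific metric: L1 gap between log radial PSDs."""
    a = np.log1p(radial_profile(psd(output)))
    b = np.log1p(radial_profile(psd(corrupted)))
    n = min(len(a), len(b))
    return float(np.abs(a[:n] - b[:n]).mean())
```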
Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, superresolution, and inpainting. Furthermore, the same prior can be used to invert deep neural representations to diagnose them, and to restore images based on flash-no flash input pairs.
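The recipe above reduces to a short optimization loop; a minimal PyTorch sketch, with a placeholder generator standing in for the paper's encoder-decoder architecture and untuned hyperparameters:

```python
import torch
import torch.nn as nn

# A stand-in generator; the paper uses a much deeper hourglass-style CNN.
net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
)

x_corrupted = torch.rand(1, 3, 256, 256)       # degraded observation (placeholder)
z = torch.rand(1, 32, 256, 256) * 0.1          # fixed random input
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(2000):                        # early stopping acts as the regularizer
    opt.zero_grad()
    loss = ((net(z) - x_corrupted) ** 2).mean() # data term only; no learned prior
    loss.backward()
    opt.step()

restored = net(z).detach()
```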
Neural Architecture Search (NAS) for automatically finding the optimal network architecture has shown some success, with competitive performance in various computer vision tasks. However, NAS in general requires a tremendous amount of computation. Thus, reducing the computational cost has emerged as an important issue. Most of the attempts so far have been based on manual approaches, and the architectures developed from such efforts often dwell in the balance between network optimality and search cost. Additionally, recent NAS methods for image restoration generally do not consider dynamic operations that may transform the dimensions of feature maps, because of the dimensionality mismatch in tensor calculations. This can greatly limit NAS in its search for an optimal network structure. To address these issues, we re-frame the optimal search problem by focusing at the component block level. Previous work has shown that an effective denoising block can be connected in series to further improve network performance. By focusing at the block level, the search space of reinforcement learning becomes significantly smaller and the evaluation process can be conducted more rapidly. In addition, we integrate innovative dimension matching modules for dealing with spatial and channel-wise mismatches that may occur during the optimal design search. This allows much flexibility in optimal network search within the cell block. With these modules, we then employ reinforcement learning to search for an optimal image denoising network at a module level. The computational efficiency of our proposed Denoising Prior Neural Architecture Search (DPNAS) is demonstrated by having it complete an optimal architecture search for an image restoration task in just one day with a single GPU.
The deep image prior showed that a randomly initialized network with a suitable architecture can be trained to solve inverse imaging problems by simply optimizing its parameters to reconstruct a single degraded image. However, it suffers from two practical limitations. First, it remains unclear how to control the prior beyond the choice of the network architecture. Second, training requires an oracle stopping criterion, as performance degrades after reaching an optimum during the optimization. To address these challenges, we introduce a frequency-band correspondence measure to characterize the spectral bias of the deep image prior, where low-frequency image signals are learned faster and better than their high-frequency counterparts. Based on our observations, we propose techniques to prevent the eventual performance degradation and accelerate convergence. We introduce a Lipschitz-controlled convolution layer and a Gaussian-controlled upsampling layer as plug-in replacements for the layers used in the deep architectures. The experiments show that with these changes the performance does not degrade during optimization, relieving us from the need for an oracle stopping criterion. We further outline a stopping criterion to avoid superfluous computation. Finally, we show that our approach obtains favorable results compared to current approaches across various denoising, deblocking, inpainting, super-resolution, and detail-enhancement tasks. Code is available at https://github.com/shizenglin/measure-and-control-spectral-bias.
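A simplified stand-in for the Lipschitz-controlled convolution idea, rescaling the kernel whenever the spectral norm of its flattened weight exceeds a bound (the paper's exact layer may differ, and the flattened-weight norm only approximates the true convolution operator norm):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LipschitzConv2d(nn.Conv2d):
    """Convolution whose effective weight is rescaled so that the spectral
    norm of the flattened kernel stays <= bound. A hedged sketch, not the
    paper's exact construction."""

    def __init__(self, *args, bound: float = 1.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.bound = bound

    def forward(self, x):
        w = self.weight.flatten(1)                   # (out_ch, in_ch * k * k)
        sigma = torch.linalg.matrix_norm(w, ord=2)   # largest singular value
        scale = torch.clamp(self.bound / (sigma + 1e-12), max=1.0)
        return F.conv2d(x, self.weight * scale, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)
```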
Deep learning-based hyperspectral image (HSI) restoration methods have gained great popularity for their remarkable performance, but they usually demand expensive network retraining whenever the specifics of the task change. In this paper, we propose to restore HSIs in a unified approach with an effective plug-and-play method, which can jointly retain the flexibility of optimization-based methods and utilize the powerful representation capability of deep neural networks. Specifically, we first develop a new deep HSI denoiser leveraging gated recurrent units, short- and long-term skip connections, and an augmented noise-level map to better exploit the abundant spatio-spectral information within HSIs. This leads to state-of-the-art performance on HSI denoising under both Gaussian and complex noise settings. Then, the proposed denoiser is inserted into a plug-and-play framework as a powerful prior to tackle various HSI restoration tasks. Through extensive experiments on HSI super-resolution, compressed sensing, and inpainting, we demonstrate that our approach often achieves superior performance, competitive with or even better than state-of-the-art methods trained specifically for each task.
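A hedged sketch of the generic plug-and-play loop such a denoiser is inserted into, here in half-quadratic-splitting form with placeholder `forward`/`adjoint` degradation operators, untuned step sizes, and a Gaussian filter standing in for the deep denoiser:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pnp_hqs(y, forward, adjoint, denoiser, rho=1.0, lr=0.1, iters=30):
    """Generic plug-and-play restoration via half-quadratic splitting.
    `denoiser` stands in for the deep HSI denoiser; `forward`/`adjoint`
    model the degradation (e.g., blur plus downsampling for super-resolution)."""
    x = adjoint(y)                          # crude initialization
    for _ in range(iters):
        z = denoiser(x)                     # prior sub-problem: plug in the denoiser
        for _ in range(5):                  # data sub-problem, by gradient steps on
            grad = adjoint(forward(x) - y)  # ||forward(x)-y||^2/2 + rho/2*||x-z||^2
            x = x - lr * (grad + rho * (x - z))
    return x

# Example: pure denoising, where the degradation is the identity.
noisy = np.random.rand(31, 64, 64)          # a toy HSI cube (bands, H, W)
clean = pnp_hqs(noisy, lambda v: v, lambda v: v,
                denoiser=lambda v: gaussian_filter(v, 1.0))  # placeholder denoiser
```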
The Plug-and-Play (PnP) framework makes it possible to integrate advanced image denoising priors into optimization algorithms, to efficiently solve a variety of image restoration tasks generally formulated as maximum a posteriori (MAP) estimation problems. The alternating direction method of multipliers (ADMM) and the Regularization by Denoising (RED) algorithms are two examples of such methods that made a breakthrough in image restoration. However, while the former method only applies to proximal algorithms, it has recently been shown that no regularization explains the RED algorithm when the denoisers lack Jacobian symmetry, which is precisely the case for most practical denoisers. To the best of our knowledge, there is no method for training a network that directly represents the gradient of a regularizer, which could then be used directly in plug-and-play gradient-based algorithms. We show that it is possible to train a network directly modeling the gradient of a MAP regularizer while jointly training the corresponding MAP denoiser. We use this network in gradient-based optimization methods and obtain better results than other generic plug-and-play approaches. We also show that the regularizer can be used as a pre-trained network for unrolled gradient descent. Lastly, we show that the resulting denoiser allows for a better convergence of the plug-and-play ADMM.
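A sketch of how such a learned regularizer gradient could be used at restoration time, assuming a network `g` approximates the gradient of the regularizer R (with the corresponding MAP denoiser tied to it during joint training); the `forward`/`adjoint` interface and all hyperparameters are placeholders:

```python
import torch

def restore(y, forward, adjoint, g, lam=0.1, lr=1e-3, steps=200):
    """Gradient-based MAP restoration with a learned regularizer gradient.
    Minimizes ||forward(x) - y||^2 / 2 + lam * R(x) by explicit gradient
    descent, using g(x) directly instead of differentiating a scalar R."""
    x = y.clone()
    for _ in range(steps):
        with torch.no_grad():
            data_grad = adjoint(forward(x) - y)
            x = x - lr * (data_grad + lam * g(x))
    return x
```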
With the advent of deep learning (DL), super-resolution (SR) has also become a thriving research area. However, despite promising results, the field still faces challenges that require further research, e.g., allowing flexible upsampling, more effective loss functions, and better evaluation metrics. We review the domain of SR in light of recent advances and examine state-of-the-art models such as diffusion (DDPM) and transformer-based SR models. We present a critical discussion of contemporary strategies used in SR and identify promising yet unexplored research directions. We complement previous surveys by incorporating the latest developments in the field, such as uncertainty-driven losses, wavelet networks, neural architecture search, novel normalization methods, and the latest evaluation techniques. We also include several visualizations of the models and methods throughout each chapter to facilitate a global understanding of the trends in the field. This review ultimately aims at helping researchers to push the boundaries of DL applied to SR.
There are many image restoration methods based on deep convolutional neural networks (CNNs). However, most of the literature on this topic focuses on network architectures and loss functions, while details of the training methods are scarcely dealt with. Hence, some works are not easily reproducible, since one needs to know the hidden training tricks to obtain the same results. To be specific about the training dataset, few works discuss how to prepare and order training image patches. Moreover, capturing new datasets to train restoration networks for real-world scenarios incurs high costs. Therefore, we believe it is necessary to study the preparation and selection of training data. In this regard, we present an analysis of training patches and explore the consequences of different patch extraction methods. Eventually, we propose a guideline for extracting patches from given training images.
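For illustration, a stride-based patch extractor with a crude variance filter, one of the extraction and selection choices whose consequences such an analysis would compare (the threshold and filter are assumptions, not the paper's guideline):

```python
import numpy as np

def extract_patches(img, size=64, stride=32, min_std=0.02):
    """Slide a window over the image and collect patches, optionally
    dropping near-flat ones. Assumes pixel values in [0, 1]."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            p = img[top:top + size, left:left + size]
            if p.std() >= min_std:   # crude informativeness filter (assumption)
                patches.append(p)
    if not patches:
        return np.empty((0, size, size) + img.shape[2:])
    return np.stack(patches)
```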
Image quality is a nebulous concept that means different things to different people. To quantify image quality, a relative difference is typically computed between a corrupted image and a ground-truth image. But which metric should we use to measure this difference? Ideally, a metric should perform well for both natural and scientific images. The structural similarity index (SSIM) is a good measure of how humans perceive image similarity, but it is not sensitive to differences that are scientifically meaningful in microscopy. In electron and super-resolution microscopy, the Fourier Ring Correlation (FRC) is often used, but it is little known outside these fields. Here we show that the FRC can equally well be applied to natural images, e.g., the Google Open Images dataset. We then define a loss function based on the FRC, show that it is analytically differentiable, and use it to train a U-Net for image denoising. This FRC-based loss function allows the network to train faster and to reach similar or better results than L1- or L2-based losses. We also investigate the properties and limitations of neural network denoising through FRC analysis.
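The FRC itself is straightforward to compute; a NumPy sketch for two equally sized 2-D images, with a hedged reading of how a loss could be built on top of it:

```python
import numpy as np

def frc(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    """Fourier Ring Correlation: correlation of the two images' Fourier
    coefficients, accumulated over rings of equal spatial frequency."""
    f1 = np.fft.fftshift(np.fft.fft2(img1))
    f2 = np.fft.fftshift(np.fft.fft2(img2))
    h, w = img1.shape
    y, x = np.indices((h, w))
    r = np.hypot(x - w // 2, y - h // 2).astype(int).ravel()
    num = np.bincount(r, weights=(f1 * np.conj(f2)).real.ravel())
    d1 = np.bincount(r, weights=(np.abs(f1) ** 2).ravel())
    d2 = np.bincount(r, weights=(np.abs(f2) ** 2).ravel())
    return num / (np.sqrt(d1 * d2) + 1e-12)

# A loss could then penalize deviation from perfect correlation, e.g.
# loss = 1 - frc(prediction, target).mean()  -- a hedged reading, not the
# paper's exact formulation, which is implemented differentiably in a DL framework.
```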
Blind Face Restoration (BFR) aims to recover high-quality face images from low-quality ones, and usually resorts to facial priors to improve restoration performance. However, current methods still suffer from two major difficulties: 1) how to derive a powerful network architecture without extensive hand-tuning; 2) how to capture complementary information from multiple facial priors in one network to improve restoration performance. To this end, we propose a Face Restoration Searching Network (FRSNet) to adaptively search a suitable feature-extraction architecture within our specified search space, which can directly contribute to the restoration quality. On the basis of FRSNet, we further design a Multiple Facial Prior Searching Network (MFPSNet) with a multi-prior learning scheme. MFPSNet optimally extracts information from different facial priors and fuses the information into image features, ensuring that both external guidance and internal features are preserved. In this way, MFPSNet takes full advantage of semantic-level (parsing maps), geometric-level (facial heatmaps), reference-level (facial dictionaries), and pixel-level (degraded image) information, and thus generates faithful and realistic images. Quantitative and qualitative experiments show that MFPSNet performs favorably against state-of-the-art BFR methods on both synthetic and real-world datasets. The codes are publicly available at: https://github.com/yyj1ang/mfpsnet.
Real-world image denoising is a practical image restoration problem that aims to obtain clean images from in-the-wild noisy inputs. Recently, the Vision Transformer (ViT) has exhibited a strong ability to capture long-range dependencies, and many researchers have attempted to apply ViT to image denoising tasks. However, a real-world image is an isolated frame, so ViT builds its long-range dependencies over internal patches, which divides the image into patches and disarranges the noise patterns and gradient continuity. In this paper, we propose to resolve this issue with a continuous Wavelet Sliding-Transformer that builds frequency correspondences under real-world scenes, called DnSwin. Specifically, we first extract bottom-level features from the noisy input image with a CNN encoder. The key to DnSwin is separating high-frequency and low-frequency information from the features and building frequency dependencies. To this end, we propose a Wavelet Sliding-Window Transformer, which uses the discrete wavelet transform, self-attention, and the inverse discrete wavelet transform to extract deep features. Finally, we reconstruct the deep features into the denoised image with a CNN decoder. Both quantitative and qualitative evaluations on real-world denoising benchmarks demonstrate that the proposed DnSwin performs favorably against state-of-the-art methods.
Multi-scale architectures and attention modules have shown effectiveness in many deep learning-based image de-raining methods. However, manually designing and integrating these two components into a neural network requires a bulk of labor and extensive expertise. In this article, a high-performance multi-scale attentive neural architecture search (MANAS) framework is technically developed for image deraining. The proposed method formulates a new multi-scale attention search space with multiple flexible modules that are well-suited to the image de-raining task. Under this search space, multi-scale attentive cells are built, which are further used to construct a powerful image de-raining network. The internal multi-scale attentive architecture of the de-raining network is searched automatically through a gradient-based search algorithm, which avoids the daunting procedure of manual design to some extent. Moreover, in order to obtain a robust image de-raining model, a practical and effective multi-to-one training strategy is also presented, allowing the de-raining network to acquire sufficient background information from multiple rainy images with the same background scene; at the same time, multiple loss functions, including external loss, internal loss, architecture-regularization loss, and model-complexity loss, are jointly optimized to achieve robust de-raining performance and controllable model complexity. Extensive experimental results on both synthetic and realistic rainy images, as well as downstream vision applications (i.e., object detection and segmentation), consistently demonstrate the superiority of our proposed method.
Although deep learning has enabled a huge leap forward in image inpainting, current methods are often unable to synthesize realistic high-frequency details. In this paper, we propose applying super-resolution to the coarsely reconstructed output, refining it at high resolution, and then downscaling the output to the original resolution. By introducing high-resolution images to the refinement network, our framework is able to reconstruct finer details that are usually smoothed out due to spectral bias - the tendency of neural networks to reconstruct low frequencies better than high frequencies. To assist training the refinement network on large upscaled holes, we propose a progressive learning technique in which the size of the missing regions increases as training progresses. Our zoom-in, refine and zoom-out strategy, combined with high-resolution supervision and progressive learning, constitutes a framework-agnostic approach for enhancing high-frequency details that can be applied to any CNN-based inpainting method. We provide qualitative and quantitative evaluations along with an ablation analysis to show the effectiveness of our approach. This seemingly simple yet powerful approach outperforms state-of-the-art inpainting methods. Our code is available at https://github.com/google/zoom-to-inpaint
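A framework-agnostic sketch of the zoom-refine-zoom wrapper, with bilinear resizing standing in for the paper's learned super-resolution step and `refine_net` as any refinement CNN (both are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def zoom_refine_zoom(coarse: torch.Tensor, refine_net, scale: int = 2) -> torch.Tensor:
    """Upscale the coarse inpainting result, refine at high resolution,
    then downscale back to the original resolution."""
    hi = F.interpolate(coarse, scale_factor=scale, mode="bilinear",
                       align_corners=False)   # stand-in for a SR network
    hi = refine_net(hi)                       # recover high-frequency detail
    return F.interpolate(hi, scale_factor=1.0 / scale, mode="bilinear",
                         align_corners=False)
```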
Neural Architecture Search (NAS) is an automatic technique that can search for well-performing architectures for a specific task. Although NAS surpasses human-designed architectures in many fields, the high computational cost of the architecture evaluation it requires hinders its development. A feasible solution is to directly evaluate some metric in the initial stage of the architecture, without any training. The NAS-without-training (WOT) score is such a metric, which estimates the final trained accuracy of an architecture through its ability to distinguish different inputs in the activation layers. However, the WOT score is not an atomic metric, meaning that it does not represent a fundamental indicator of the architecture. The contributions of this paper are threefold. First, we decouple WOT into two atomic metrics, which represent the distinguishing ability of the network and the number of activation units, and explore better combination rules, named the Distinguishing Activation Score (DAS). We prove the correctness of the decoupling theoretically and confirm the effectiveness of the rules experimentally. Second, in order to improve the prediction accuracy of DAS to meet practical search requirements, we propose a fast training strategy; when DAS is used in combination with it, it yields further improvements. Third, we propose a dataset called Darts-training-bench (DTB), which fills the gap that existing datasets contain no training states of architectures. Our proposed method achieves 1.04$\times$ - 1.56$\times$ improvements on NAS-Bench-101, Network Design Spaces, and the proposed DTB.
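For reference, a compact PyTorch rendering of the WOT-style score the paper decouples, following the standard formulation (binary ReLU activation codes over a minibatch, a Hamming-similarity kernel, log-determinant); it only observes `nn.ReLU` modules, not functional activations:

```python
import torch

def wot_score(model: torch.nn.Module, minibatch: torch.Tensor) -> float:
    """Training-free score: how differently the untrained network's ReLU
    units activate across the inputs of a minibatch."""
    codes = []

    def hook(_module, _inp, out):
        codes.append((out > 0).flatten(1).float())   # binary activation code

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, torch.nn.ReLU)]
    with torch.no_grad():
        model(minibatch)
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1)                      # (batch, total_units)
    k = c @ c.t() + (1 - c) @ (1 - c).t()            # count of matching activations
    return torch.slogdet(k).logabsdet.item()         # higher = more distinguishable
```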
The automated machine learning (AutoML) field has become increasingly relevant in recent years. These algorithms can develop models without the need for expert knowledge, facilitating the application of machine learning techniques in industry. Neural Architecture Search (NAS) exploits deep learning techniques to autonomously produce neural network architectures whose results rival the state-of-the-art models hand-crafted by AI experts. However, this approach requires significant computational resources and hardware investments, making it less appealing for real-world applications. This article presents the third version of Pareto-Optimal Progressive Neural Architecture Search (POPNASv3), a new sequential model-based optimization NAS algorithm targeting different hardware environments and multiple classification tasks. Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks. The algorithm employs Pareto optimality to reduce the number of architectures sampled during the search, drastically improving the time efficiency without loss in accuracy. The experiments performed on image and time-series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
Most existing neural architecture search (NAS) algorithms are dedicated to downstream tasks, e.g., image classification in computer vision. However, extensive experiments have shown that prominent neural architectures, such as ResNet in computer vision and LSTM in natural language processing, are generally good at extracting patterns from the input data and perform well on different downstream tasks. In this paper, we attempt to answer two fundamental questions related to NAS. (1) Is it necessary to use the performance of specific downstream tasks to evaluate and search for good neural architectures? (2) Can we perform NAS effectively and efficiently while being agnostic to the downstream tasks? To answer these questions, we propose a novel and generic NAS framework, termed Generic NAS (GenNAS). GenNAS does not use task-specific labels but instead adopts regression on a set of manually designed synthetic signal bases for architecture evaluation. Such a self-supervised regression task can effectively evaluate the intrinsic power of an architecture to capture and transform the input signal patterns, and allows more sufficient usage of training samples. Extensive experiments across 13 CNN search spaces and one NLP space demonstrate the remarkable efficiency of GenNAS, both in evaluating neural architectures (quantified by the ranking correlation, Spearman's rho, between the approximated performances and the downstream task performances) and in its convergence speed for training (within a few seconds).
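A toy version of the task-agnostic proxy, ranking an architecture by how well it regresses a fixed synthetic target after a few steps; the sine-product target and the assumption that `net` maps 3x32x32 inputs to 1x32x32 outputs are illustrative, not the paper's designed signal bases:

```python
import torch

def proxy_regression_score(net: torch.nn.Module, steps: int = 100, lr: float = 1e-3) -> float:
    """Self-supervised regression proxy: fit a fixed low-frequency synthetic
    target from random inputs and score by the final fit (higher = better)."""
    torch.manual_seed(0)                              # same task for every candidate
    x = torch.randn(16, 3, 32, 32)
    t = torch.linspace(0, 3.14, 32)
    target = torch.sin(t)[None, None, :, None] * torch.sin(2 * t)[None, None, None, :]
    target = target.expand(16, 1, 32, 32)

    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(x) - target) ** 2).mean()
        loss.backward()
        opt.step()
    return -loss.item()   # candidates can then be ranked by this score
```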
Single Image Super-Resolution (SISR) tasks have achieved significant performance with deep neural networks. However, the large number of parameters in CNN-based methods for SISR tasks require heavy computations. Although several efficient SISR models have been recently proposed, most are handcrafted and thus lack flexibility. In this work, we propose a novel differentiable Neural Architecture Search (NAS) approach on both the cell-level and network-level to search for lightweight SISR models. Specifically, the cell-level search space is designed based on an information distillation mechanism, focusing on the combinations of lightweight operations and aiming to build a more lightweight and accurate SR structure. The network-level search space is designed to consider the feature connections among the cells and aims to find which information flow benefits the cell most to boost the performance. Unlike the existing Reinforcement Learning (RL) or Evolutionary Algorithm (EA) based NAS methods for SISR tasks, our search pipeline is fully differentiable, and the lightweight SISR models can be efficiently searched on both the cell-level and network-level jointly on a single GPU. Experiments show that our methods can achieve state-of-the-art performance on the benchmark datasets in terms of PSNR, SSIM, and model complexity with merely 68G Multi-Adds for $\times 2$ and 18G Multi-Adds for $\times 4$ SR tasks.
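At the heart of such a differentiable cell-level search is a softmax-weighted mixture over candidate operations, so architecture weights can be optimized by gradient descent alongside network weights; a minimal sketch with an illustrative (not the paper's) operation set:

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """A DARTS-style mixed operation: the cell's edge computes a softmax-
    weighted sum of candidate lightweight ops, parameterized by `alpha`."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # illustrative candidates
            nn.Conv2d(channels, channels, 1),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture weights

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))
```

After the search, the operation with the largest `alpha` on each edge is typically kept to form the final discrete architecture.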
This work targets designing a principled and unified training-free framework for Neural Architecture Search (NAS), with high performance, low cost, and in-depth interpretation. NAS has been explosively studied to automate the discovery of top-performer neural networks, but suffers from heavy resource consumption and often incurs search bias due to truncated training or approximations. Recent NAS works start to explore indicators that can predict a network's performance without training. However, they either leveraged limited properties of deep networks, or the benefits of their training-free indicators are not applied to more extensive search methods. By rigorous correlation analysis, we present a unified framework to understand and accelerate NAS, by disentangling "TEG" characteristics of searched networks - Trainability, Expressivity, Generalization - all assessed in a training-free manner. The TEG indicators could be scaled up and integrated with various NAS search methods, including both supernet and single-path approaches. Extensive studies validate the effective and efficient guidance from our TEG-NAS framework, leading to both improved search accuracy and over 56% reduction in search time cost. Moreover, we visualize search trajectories on three landscapes of "TEG" characteristics, observing that while a good local minimum is easier to find on NAS-Bench-201 given its simple topology, balancing "TEG" characteristics is much harder on the DARTS search space due to its complex landscape geometry. Our code is available at https://github.com/VITA-Group/TEGNAS.
Learning a good image prior is a long-term goal for image restoration and manipulation. While existing methods like deep image prior (DIP) capture low-level image statistics, there are still gaps toward an image prior that captures rich image semantics including color, spatial coherence, textures, and high-level concepts. This work presents an effective way to exploit the image prior captured by a generative adversarial network (GAN) trained on large-scale natural images. As shown in Fig. 1, the deep generative prior (DGP) provides compelling results to restore missing semantics, e.g., color, patch, resolution, of various degraded images. It also enables diverse image manipulation including random jittering, image morphing, and category transfer. Such highly flexible restoration and manipulation are made possible by relaxing the assumption of existing GAN-inversion methods, which tend to fix the generator. Notably, we allow the generator to be fine-tuned on-the-fly in a progressive manner, regularized by the feature distance obtained from the discriminator of the GAN. We show that these easy-to-implement and practical changes help preserve the reconstruction within the manifold of natural images, and thus lead to more precise and faithful reconstruction of real images. Code is available at https://github.com/XingangPan/deep-generative-prior.
Although differentiable architecture search (DARTS) has become the mainstream paradigm in neural architecture search (NAS) due to its simplicity and efficiency, recent works have found that the performance of the searched architecture barely increases as the DARTS optimization proceeds, and that the final magnitudes obtained by DARTS can hardly indicate the importance of operations. The above observations reveal that the supervision signal in DARTS may be a poor or unreliable indicator for architecture search, motivating an interesting and intriguing direction: can we measure the importance of operations without any training, under a differentiable paradigm? We provide an affirmative answer by customizing NAS as a pruning-at-initialization problem. With the recently proposed synaptic saliency criteria from pruning networks at initialization, we seek to score the importance of candidate operations in a differentiable paradigm without any training, and propose a novel framework named free differentiable architecture search (FreeDARTS). We show that, without any training, FreeDARTS with different proxy metrics can outperform most NAS baselines in different search spaces. More importantly, FreeDARTS is extremely memory-efficient and computation-efficient, as it discards training in the architecture search phase; this makes it possible to perform architecture search on a more flexible space and to eliminate the depth gap between architecture search and evaluation. We hope our work inspires more attempts to solve NAS from the perspective of pruning at initialization.
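A sketch of synaptic-saliency scoring at initialization in the SynFlow style, one family of the proxy metrics mentioned above; scoring candidate operations would aggregate these per-parameter scores, and the details here are assumptions for illustration:

```python
import torch

@torch.no_grad()
def synflow_scores(model: torch.nn.Module, input_shape: tuple) -> dict:
    """Score each parameter by |theta * d(sum of outputs)/d(theta)|, computed
    on the |weights| network with an all-ones input: no labels, no training."""
    signs = {n: p.sign() for n, p in model.named_parameters()}
    for p in model.parameters():
        p.abs_()                                   # linearize: use |weights|

    x = torch.ones(1, *input_shape)
    with torch.enable_grad():
        out = model(x).sum()
        grads = torch.autograd.grad(out, list(model.parameters()))

    scores = {n: (p * g).abs()
              for (n, p), g in zip(model.named_parameters(), grads)}
    for n, p in model.named_parameters():
        p.mul_(signs[n])                           # restore original weights
    return scores
```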