In recent years, multi-scale generative adversarial networks (GANs) have been proposed to build generalized image processing models from a single sample. Constrained by the sample size, multi-scale GANs have difficulty converging to the global optimum, which ultimately limits their capabilities. In this paper, we pioneer the introduction of PAC-Bayes generalization bound theory into the training analysis of specific models under different adversarial training methods, which yields a non-vacuous upper bound on the generalization error for the specified multi-scale GAN structure. Based on the drastic changes we observed in the generalization error bound under different adversarial attacks and different training states, we propose an adaptive training method that greatly improves the image manipulation ability of multi-scale GANs. The experimental results show that our adaptive training method substantially improves the quality of the images generated by multi-scale GANs on several image manipulation tasks. In particular, for the image super-resolution restoration task, the multi-scale GAN model trained by the proposed method achieves a 100% reduction in natural image quality evaluator (NIQE) and a 60% reduction in root mean squared error (RMSE), which is better than many models trained on large-scale datasets.
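For orientation, one classical form of a PAC-Bayes generalization bound is sketched below. It is given as general background only and is not the specific bound derived in the paper. The notation is introduced here for illustration: $P$ is a prior over hypotheses fixed before seeing the data, $Q$ is any posterior, $m$ is the number of i.i.d. training samples (with $m \ge 8$), $\delta \in (0, 1)$ is the failure probability, and $L$ and $\hat{L}$ are the expected and empirical losses, assumed to lie in $[0, 1]$.

```latex
% With probability at least 1 - \delta over the draw of the training sample,
% simultaneously for all posteriors Q:
L(Q) \;\le\; \hat{L}(Q) + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
```

Under these assumptions, a bound is usually called non-vacuous when its right-hand side is strictly below 1, the trivial upper limit for a loss bounded in $[0, 1]$.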
translated by Google Translate
Figure 1: Image generation learned from a single training image (panels: single training image; random samples from a single image). We propose SinGAN, a new unconditional generative model trained on a single natural image. Our model learns the image's patch statistics across multiple scales, using a dedicated multi-scale adversarial training scheme; it can then be used to generate new realistic image samples that preserve the original patch distribution while creating new object configurations and structures.
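Adversarial examples reveal the vulnerability and unexplained nature of neural networks. Studying defenses against adversarial examples is of considerable practical importance. Most adversarial examples that cause a network to misclassify are undetectable by humans. In this paper, we propose a defense model that trains the classifier into a human-perception classification model with a shape preference. The proposed model, comprising a texture transfer network (TTN) and an auxiliary defense generative adversarial network (GAN), is called Human-perception Auxiliary Defense GAN (HAD-GAN). The TTN is used to extend the texture samples of a clean image and helps the classifier focus on its shape. The GAN is used to form a training framework for the model and to generate the necessary images. A series of experiments conducted on MNIST, Fashion-MNIST, and CIFAR10 show that the proposed model outperforms state-of-the-art defense methods in network robustness. The model also demonstrates a significant improvement in defense capability against adversarial examples.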
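Generating images from a single sample, as a newly developing branch of image synthesis, has attracted extensive attention. In this paper, we formulate this problem as sampling from the conditional distribution of a single image, and propose a hierarchical framework that simplifies the learning of the complex conditional distribution through successive learning of the distributions of structure, semantics, and texture, making the learning and generation process comprehensible. On this basis, we design ExSinGAN, composed of three cascaded GANs, to learn an explainable generative model from a given image; the cascaded GANs successively model the distributions of structure, semantics, and texture. ExSinGAN learns not only from the internal patches of the given image, as previous works did, but also from an external prior obtained by the GAN inversion technique. Benefiting from the appropriate combination of internal and external information, ExSinGAN has a more powerful generation capability and competitive generalization ability for image manipulation tasks compared with prior works.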
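Blind image super-resolution (SR) is a long-standing task in computer vision that aims to restore low-resolution images suffering from unknown and complex distortions. Recent work has largely focused on adopting more complicated degradation models to emulate real-world degradations. The resulting models have made breakthroughs in perceptual loss and yield perceptually convincing results. However, the limitations imposed by current generative adversarial network structures remain significant: treating all pixels equally ignores the structural features of an image and leads to performance drawbacks such as twisted lines and over-sharpened or blurred backgrounds. In this paper, we present A-ESRGAN, a GAN model for blind SR tasks featuring an attention U-Net-based multi-scale discriminator that can be seamlessly integrated with other generators. To our knowledge, this is the first work to introduce a U-Net structure as the GAN discriminator for blind SR problems. The paper also gives an interpretation of the mechanism by which multi-scale attention brings a performance breakthrough to the model. In comparison experiments with prior works, our model achieves state-of-the-art performance on the non-reference natural image quality evaluator metric. Our ablation studies show that, with our discriminator, the RRDB-based generator can exploit the structural features of an image at multiple scales and consequently yields more perceptually realistic high-resolution images than prior works.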
Learning a good image prior is a long-term goal for image restoration and manipulation. While existing methods like deep image prior (DIP) capture low-level image statistics, there are still gaps toward an image prior that captures rich image semantics including color, spatial coherence, textures, and high-level concepts. This work presents an effective way to exploit the image prior captured by a generative adversarial network (GAN) trained on large-scale natural images. As shown in Fig. 1, the deep generative prior (DGP) provides compelling results in restoring the missing semantics, e.g., color, patch, resolution, of various degraded images. It also enables diverse image manipulation including random jittering, image morphing, and category transfer. Such highly flexible restoration and manipulation are made possible by relaxing the assumption of existing GAN-inversion methods, which tend to fix the generator. Notably, we allow the generator to be fine-tuned on-the-fly in a progressive manner, regularized by the feature distance obtained from the GAN discriminator. We show that these easy-to-implement and practical changes help keep the reconstruction within the natural image manifold, and thus lead to more precise and faithful reconstruction of real images. Code is available at https://github.com/XingangPan/deepgenerative-prior.
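Current deep image super-resolution (SR) approaches attempt to restore high-resolution images from down-sampled images, or assume that degradation comes from simple Gaussian kernels and additive noise. However, such simple image processing techniques are only crude approximations of the real-world process that lowers image resolution. In this paper, we propose a more realistic process for lowering image resolution by introducing a new Kernel Adversarial Learning Super-resolution (KASR) framework to deal with the real-world image SR problem. In the proposed framework, degradation kernels and noise are modeled adaptively rather than specified explicitly. Moreover, we propose an iterative supervision process and a high-frequency selective objective to further improve the SR reconstruction accuracy of the model. Extensive experiments validate the effectiveness of the proposed framework on real-world datasets.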
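The simple degradation assumption that this abstract argues against is commonly written as the classical model below. This is a background sketch, not the KASR formulation; the symbols $x$ (high-resolution image), $y$ (low-resolution observation), $k$ (blur kernel, e.g. Gaussian), $s$ (scale factor), and $n$ (additive noise) are notation introduced here for illustration.

```latex
% Classical SR degradation model: blur, downsample by factor s, add noise.
y = (x \otimes k)\downarrow_{s} + \, n
```

In this notation, the abstract's proposal amounts to modeling $k$ and $n$ adaptively instead of fixing them in advance.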
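This paper aims to conduct a comprehensive study of the facial-sketch synthesis (FSS) problem. However, due to the high cost of obtaining hand-drawn sketch datasets, there has been no complete benchmark for assessing the development of FSS algorithms over the last decade. We therefore first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three types of sketch styles, image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability, and should thus facilitate the progress of FSS research. Second, we survey 139 classical methods, including 34 handcrafted feature-based facial-sketch synthesis approaches, 37 general neural-style transfer methods, 43 deep image-to-image translation methods, and 35 image-to-sketch approaches. In addition, we detail comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, i.e., facial-aware masking and style-vector expansion, FSGAN surpasses the performance of all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we summarize the lessons learned over the past years and point out several unsolved challenges. Our open-source code is available at https://github.com/dengpingfan/fsgan.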
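This paper proposes two important contributions for conditional generative adversarial networks (cGANs) to improve the wide variety of applications that exploit this architecture. The first main contribution is an analysis of cGANs showing that they are not explicitly conditional. In particular, it is shown that the discriminator, and subsequently the cGAN, does not automatically learn the conditionality between inputs. The second contribution is a new method, called a contrario cGAN, that explicitly models conditionality for both parts of the adversarial architecture via a novel a contrario loss, which involves training the discriminator to learn unconditional (adverse) examples. This leads to a novel type of data augmentation approach for GANs (a contrario learning) that restricts the search space of the generator to conditional outputs using adverse examples. Extensive experiments are carried out to evaluate the conditionality of the discriminator by means of a proposed probability distribution analysis. Comparisons with the cGAN architecture for different applications show significant performance improvements on well-known datasets, including semantic image synthesis, image segmentation, monocular depth prediction, and "single label"-to-image, using different metrics including Fréchet Inception Distance (FID), mean Intersection over Union (mIoU), Root Mean Square Error log (RMSE log), and Number of statistically-Different Bins (NDB).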
Figure 1 (panels: (a) synthesized result; (b) application: change label types; (c) application: edit object appearance): We propose a generative adversarial framework for synthesizing 2048 × 1024 images from semantic label maps (lower left corner in (a)). Compared to previous work [5], our results express more natural textures and details. (b) We can change labels in the original label map to create new scenes, like replacing trees with buildings. (c) Our framework also allows a user to edit the appearance of individual objects in the scene, e.g. changing the color of a car or the texture of a road. Please visit our website for more side-by-side comparisons as well as interactive editing demos.
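In this paper, we present a Fast Motion Deblurring Conditional Generative Adversarial Network (FMD-cGAN) that helps in blind motion deblurring of a single image. FMD-cGAN delivers impressive structural similarity and visual appearance after deblurring an image. Like other deep neural network architectures, GANs suffer from large model size (number of parameters) and heavy computation, which makes them hard to deploy on resource-constrained devices such as mobile devices and robots. With the help of a MobileNet-based architecture that uses depthwise separable convolutions, we reduce the model size and inference time without losing image quality. More specifically, we reduce the model size by 3-60x compared to the nearest competitors. The resulting compressed deblurring cGAN is faster than its closest competitors, and its qualitative and quantitative results even outperform various recently proposed state-of-the-art blind motion deblurring models. Our model can also be used for real-time image deblurring tasks. Experiments on standard datasets show the effectiveness of the proposed method.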
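To illustrate why depthwise separable convolutions shrink a model, the sketch below counts weights for a single layer in both designs. It is a generic illustration, not the FMD-cGAN architecture; the function names and the layer sizes (3 x 3 kernel, 64 input channels, 128 output channels) are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Number of weights in a standard k x k convolution (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Number of weights in a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

The per-layer saving grows with the number of output channels and the kernel size, so the overall reduction for a full network depends on its layer configuration.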
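Adversarial training methods are the state-of-the-art (SOTA) empirical defense against adversarial examples. Many regularization methods have proven effective in combination with adversarial training. However, such regularization methods are implemented in the time domain. Since adversarial vulnerability can be regarded as a high-frequency phenomenon, it is essential to regularize adversarially trained neural network models in the frequency domain. Faced with these challenges, we present a theoretical analysis of the regularization property of wavelets, which can enhance adversarial training. We propose a wavelet regularization method based on the Haar wavelet decomposition, named Wavelet Average Pooling. This wavelet regularization module is integrated into a wide residual neural network to form a new WideWaveletResNet model. On the CIFAR-10 and CIFAR-100 datasets, our proposed Adversarial Wavelet Training method achieves considerable robustness under different types of attacks. This verifies the assumption that our wavelet regularization method can enhance adversarial robustness, especially in deep wide neural networks. Visualization experiments on the Frequency Principle (F-Principle) and on interpretability are carried out to show the effectiveness of our method. A detailed comparison based on different wavelet basis functions is presented. The code is available at the repository: \url{https://github.com/momo1986/AdversarialWavelTraining}.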
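As background for the Haar-based pooling mentioned above, the sketch below computes a single-level 2D Haar decomposition and shows the well-known relation that its approximation band equals 2 x 2 average pooling up to a constant factor. The abstract does not specify how Wavelet Average Pooling combines the sub-bands, so this is only an illustration of the underlying transform; the function names are hypothetical.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Returns four sub-bands (approximation plus three detail bands), each half
    the input size, using the orthonormal scaling of 1/2 per 2 x 2 block.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    approx, det_rows, det_cols, det_diag = [], [], [], []
    for i in range(0, rows, 2):
        row_a, row_r, row_c, row_d = [], [], [], []
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            row_a.append((a + b + c + d) / 2)  # low-pass in both directions
            row_r.append((a + b - c - d) / 2)  # difference between the two rows
            row_c.append((a - b + c - d) / 2)  # difference between the two columns
            row_d.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(row_a)
        det_rows.append(row_r)
        det_cols.append(row_c)
        det_diag.append(row_d)
    return approx, det_rows, det_cols, det_diag


def average_pool_2x2(image):
    """2 x 2 average pooling expressed through the Haar approximation band."""
    approx, _, _, _ = haar_dwt2(image)
    return [[value / 2 for value in row] for row in approx]


if __name__ == "__main__":
    sample = [[1, 2, 5, 6],
              [3, 4, 7, 8]]
    print(haar_dwt2(sample)[0])      # approximation band
    print(average_pool_2x2(sample))  # block means
```

Keeping only the approximation band discards the three detail bands, which carry the high-frequency content of each block.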
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available. To facilitate GAN training, current methods propose to use data-specific augmentation techniques. Despite the effectiveness, it is difficult for these methods to scale to practical applications. In this work, we present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks. We first produce augmented samples using the convex combinations of the real samples. Then, we optimize the augmented samples by minimizing the norms of the data scores, i.e., the gradients of the log-density functions. This procedure enforces the augmented samples close to the data manifold. To estimate the scores, we train a deep estimation network with multi-scale score matching. For different image synthesis tasks, we train the score estimation network using different data. We do not require the tuning of the hyperparameters or modifications to the network architecture. The ScoreMix method effectively increases the diversity of data and reduces the overfitting problem. Moreover, it can be easily incorporated into existing GAN models with minor modifications. Experimental results on numerous tasks demonstrate that GAN models equipped with the ScoreMix method achieve significant improvements.
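The two steps described above (convex mixing, then reducing the score norm) can be sketched on a toy problem. This is not the ScoreMix implementation: the learned score network is replaced by the analytic score of an isotropic Gaussian, which is an assumption made purely so the example is self-contained, and all function names and parameter values are hypothetical.

```python
def convex_mix(x1, x2, lam):
    """Convex combination lam * x1 + (1 - lam) * x2 of two samples."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]


def gaussian_score(x, mean, sigma):
    """Score (gradient of the log-density) of an isotropic Gaussian.

    Stand-in for a learned score network; assumed here for illustration.
    """
    return [-(xi - mi) / sigma**2 for xi, mi in zip(x, mean)]


def score_norm_sq(x, mean, sigma):
    """Squared Euclidean norm of the score at x."""
    return sum(s * s for s in gaussian_score(x, mean, sigma))


def refine(x, mean, sigma, steps=50, lr=0.1):
    """Gradient descent on 0.5 * ||score(x)||^2.

    For this toy Gaussian the gradient is (x - mean) / sigma**4 per dimension.
    """
    x = list(x)
    for _ in range(steps):
        x = [xi - lr * (xi - mi) / sigma**4 for xi, mi in zip(x, mean)]
    return x


if __name__ == "__main__":
    mean, sigma = [0.0, 0.0], 1.0
    mixed = convex_mix([0.0, 0.0], [2.0, 4.0], lam=0.25)
    refined = refine(mixed, mean, sigma)
    print(score_norm_sq(mixed, mean, sigma), score_norm_sq(refined, mean, sigma))
```

In this toy setting the refinement simply pulls the mixed sample toward the Gaussian mean; with a learned score on image data, the analogous step is meant to pull mixed samples toward the data manifold.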
In biomedical image analysis, the applicability of deep learning methods is directly impacted by the quantity of image data available, because deep learning models require large image datasets to provide high-level performance. Generative Adversarial Networks (GANs) have been widely utilized to address data limitations through the generation of synthetic biomedical images. GANs consist of two models: the generator, which learns how to produce synthetic images based on the feedback it receives, and the discriminator, which classifies an image as synthetic or real and provides feedback to the generator. Throughout the training process, a GAN can experience several technical challenges that impede the generation of suitable synthetic imagery. The first is the mode collapse problem, whereby the generator either produces an identical image or produces a uniform image from distinct input features. The second is the non-convergence problem, whereby the gradient descent optimizer fails to reach a Nash equilibrium. The third is the vanishing gradient problem, whereby unstable training behavior occurs because the discriminator achieves optimal classification performance and therefore provides no meaningful feedback to the generator. These problems result in the production of synthetic imagery that is blurry, unrealistic, and less diverse. To date, there has been no survey article outlining the impact of these technical challenges in the context of the biomedical imagery domain. This work presents a review and taxonomy based on solutions to the training problems of GANs in the biomedical imaging domain. This survey highlights important challenges and outlines future research directions for the training of GANs in the domain of biomedical imagery.
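Image super-resolution (SR) is one of the important image processing methods for improving image resolution in the field of computer vision. Over the last two decades, significant progress has been made in super-resolution, especially through the use of deep learning methods. This survey provides a detailed review of recent progress in single-image super-resolution from the perspective of deep learning, while also covering the initial classical methods for image super-resolution. The survey classifies image SR methods into four categories: classical methods, learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the SR problem to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep learning-based approaches are evaluated using reference datasets. The reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multi-scale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN), and the wavelet-based residual attention network (WRAN). Finally, the survey concludes with future directions, trends, and open problems in SR to be addressed by researchers.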
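Generative adversarial networks (GANs) have drawn enormous attention due to their simple yet effective training mechanism and superior image generation quality. With the ability to generate photo-realistic high-resolution (e.g., $1024 \times 1024$) images, recent GAN models have greatly narrowed the gap between generated images and real ones. Therefore, many recent works show emerging interest in taking advantage of pre-trained GAN models by exploiting their good latent space and learned GAN priors. In this paper, we briefly review recent progress on leveraging pre-trained large-scale GAN models from three aspects: 1) the training of large-scale generative adversarial networks, 2) exploring and understanding pre-trained GAN models, and 3) leveraging these models for subsequent tasks such as image restoration and editing. More information about relevant methods and repositories can be found at https://github.com/csmliu/pretretaining-gans.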
With the development of convolutional neural networks, hundreds of deep learning based dehazing methods have been proposed. In this paper, we provide a comprehensive survey on supervised, semi-supervised, and unsupervised single image dehazing. We first discuss the physical model, datasets, network modules, loss functions, and evaluation metrics that are commonly used. Then, the main contributions of various dehazing algorithms are categorized and summarized. Further, quantitative and qualitative experiments of various baseline methods are carried out. Finally, the unsolved issues and challenges that can inspire the future research are pointed out. A collection of useful dehazing materials is available at \url{https://github.com/Xiaofeng-life/AwesomeDehazing}.
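We present SeamlessGAN, a method capable of automatically generating tileable texture maps from a single input exemplar. In contrast to most existing methods, which focus solely on the synthesis problem, our work tackles both problems, synthesis and tileability, simultaneously. Our key idea is to realize that tiling a latent space within a generative network trained using adversarial expansion techniques produces outputs with continuity at the seam intersections, which can then be turned into tileable images by cropping the central area. Since not every value of the latent space is valid for producing high-quality outputs, we leverage the discriminator as a perceptual error metric capable of identifying artifact-free textures during the sampling process. Further, in contrast to previous work on deep texture synthesis, our model is designed and optimized to work with multi-layered texture representations, enabling textures composed of multiple maps such as albedo, normals, etc. We extensively test our design choices for the network architecture, loss function, and sampling parameters. We show qualitatively and quantitatively that our approach outperforms previous methods and works for textures of different types.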
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
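The Swapping Autoencoder achieved state-of-the-art performance in deep image manipulation and image-to-image translation. We improve this work by introducing a simple yet effective auxiliary module based on gradient reversal layers. The auxiliary module's loss forces the generator to learn to reconstruct an image with an all-zero texture code, encouraging better disentanglement between structure and texture information. The proposed attribute-based transfer method enables refined control in style transfer while preserving structural information without using a semantic mask. To manipulate an image, we encode both the geometry of the objects and the general style of the input image into two latent codes, with an additional constraint that enforces structure consistency. Moreover, due to the auxiliary loss, training time is significantly reduced. The superiority of the proposed model is demonstrated in complex domains such as satellite images, where state-of-the-art methods are known to fail. Lastly, we show that our model improves the quality metrics on a wide range of datasets while achieving results comparable to multi-modal image generation techniques.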