We present a variety of new architectural features and training procedures that we apply to the generative adversarial networks (GANs) framework. We focus on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic. Unlike most work on generative models, our primary goal is not to train a model that assigns high likelihood to test data, nor do we require the model to be able to learn well without using any labels. Using our new techniques, we achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN. The generated images are of high quality as confirmed by a visual Turing test: our model generates MNIST samples that humans cannot distinguish from real data, and CIFAR-10 samples that yield a human error rate of 21.3%. We also present ImageNet samples with unprecedented resolution and show that our methods enable the model to learn recognizable features of ImageNet classes.
Translated by Google Translate.
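Generative adversarial networks (GANs) have recently attracted considerable attention in the AI community because of their ability to generate high-quality data with significant statistical similarity to real data. Fundamentally, a GAN is a game between two neural networks trained in an adversarial manner to reach a zero-sum Nash equilibrium profile. Despite the improvements made to GANs over the last few years, several issues remain to be solved. This paper reviews the literature on the game-theoretic aspects of GANs and addresses how game-theoretic models can meet the particular challenges of generative models and improve GAN performance. We first present some preliminaries, including the basic GAN model and some game theory background. We then present a taxonomy that classifies state-of-the-art solutions into three main categories: modified game models, modified architectures, and modified learning methods. The classification is based on the modifications made to the basic GAN model by the game-theoretic approaches proposed in the literature. We then explore the objectives of each category and discuss recent works in each category. Finally, we discuss the remaining challenges in this field and present future research directions.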
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this by deriving backpropagation signals from a competitive process involving a pair of networks. The representations that GANs learn may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
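This is a tutorial and survey paper on generative adversarial networks (GANs), adversarial autoencoders, and their variants. We start by explaining adversarial learning and the vanilla GAN. Then, we explain the conditional GAN and DCGAN. The mode collapse problem is introduced, along with various methods for resolving it, including minibatch GAN, unrolled GAN, BourGAN, mixture GAN, D2GAN, and Wasserstein GAN. Then, maximum likelihood estimation in GAN is explained along with f-GAN, adversarial variational Bayes, and Bayesian GAN. Then, we cover feature matching in GAN, InfoGAN, GRAN, LSGAN, energy-based GAN, CatGAN, MMD GAN, LapGAN, progressive GAN, triple GAN, LAG, GMAN, AdaGAN, CoGAN, inverse GAN, BiGAN, ALI, SAGAN, Few-shot GAN, SinGAN, and interpolation and evaluation of GAN. Then, we introduce some applications of GAN, such as image-to-image translation (including PatchGAN, CycleGAN, DeepFaceDrawing, simulated GAN, interactive GAN), text-to-image translation (including StackGAN), and mixing image characteristics (including FineGAN and MixNMatch). Finally, we explain the autoencoders based on adversarial learning, including the adversarial autoencoder, PixelGAN, and the implicit autoencoder.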
Classification using supervised learning requires annotating a large amount of class-balanced data for model training and testing. In practice, this has limited the scope of applications of supervised learning, and of deep learning in particular. To address the issues associated with limited and imbalanced data, this paper introduces a sample-efficient co-supervised learning paradigm (SEC-CGAN), in which a conditional generative adversarial network (CGAN) is trained alongside the classifier and supplements the annotated data with semantics-conditioned, confidence-aware synthesized examples during the training process. In this setting, the CGAN not only serves as a co-supervisor but also provides complementary quality examples to aid the classifier training in an end-to-end fashion. Experiments demonstrate that the proposed SEC-CGAN outperforms the external classifier GAN (EC-GAN) and a baseline ResNet-18 classifier. For the comparison, all classifiers in the above methods adopt the ResNet-18 architecture as the backbone. In particular, for the Street View House Numbers dataset, using 5% of the training data, SEC-CGAN achieves a test accuracy of 90.26%, as opposed to 88.59% for EC-GAN and 87.17% for the baseline classifier; for the highway image dataset, using 10% of the training data, SEC-CGAN achieves a test accuracy of 98.27%, compared to 97.84% for EC-GAN and 95.52% for the baseline classifier.
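The limited availability of labeled data makes any supervised learning problem challenging. Alternative learning settings such as semi-supervised and universum learning alleviate the dependency on labeled data, but they still require a large amount of unlabeled data, which may be unavailable or expensive to acquire. GAN-based data generation methods have recently shown promise by generating synthetic samples to improve learning. However, under limited labeled data settings, most existing GAN-based methods either provide poor discriminator performance or produce low-quality generated data. In this paper, we propose a GAN game that provides improved discriminator accuracy under limited data settings while generating high-quality, realistic data. We further propose an evolving discriminator loss that improves its convergence and generalization performance. We derive theoretical guarantees and provide empirical results in support of our approach.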
Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. As a complement to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain with the goal of performing well at test-time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially-costly target data labels. This survey will compare these approaches by examining alternative methods, the unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
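While generative adversarial networks (GANs) are popular for their higher sample quality compared with other generative models, they share the same difficulty in evaluating generated samples. Various aspects must be kept in mind, such as the quality of the generated samples, the diversity of classes (within a class and among classes), the use of a disentangled latent space, and the agreement of the evaluation metric with human perception. In this paper, we propose a new score, the GM score, which takes into account factors such as sample quality, disentangled representation, and intra-class and inter-class diversity, and which employs other metrics such as precision, recall, and F1 score for the discriminability of the latent space of deep belief networks (DBNs) and restricted Boltzmann machines (RBMs). The evaluation is carried out for different GANs (GAN, DCGAN, BiGAN, CGAN, CoupledGAN, LSGAN, SGAN, WGAN, and WGAN Improved) trained on the benchmark MNIST dataset.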
Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome this problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN amounts to minimizing the Pearson χ² divergence. LSGANs have two benefits over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on five scene datasets, and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
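The least squares objective described above can be illustrated with a small sketch. The abstract does not state the target values, so the coding used here (`a = 0` for generated samples, `b = 1` for real samples, `c = 1` as the value the generator wants the discriminator to output) is an assumption, as are the function names; the inputs are plain lists of raw discriminator outputs.

```python
# Minimal sketch of least squares GAN losses (assumed 0/1 target coding).
# d_real / d_fake are lists of raw discriminator outputs (no sigmoid).

def lsgan_discriminator_loss(d_real, d_fake, a=0.0, b=1.0):
    """Push outputs on real data toward b and on generated data toward a."""
    real_term = sum((d - b) ** 2 for d in d_real) / len(d_real)
    fake_term = sum((d - a) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake, c=1.0):
    """Push discriminator outputs on generated data toward c."""
    return 0.5 * sum((d - c) ** 2 for d in d_fake) / len(d_fake)


if __name__ == "__main__":
    # A perfect discriminator under this coding has zero loss,
    # while the generator is penalised quadratically for being detected.
    print(lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0]))
    print(lsgan_generator_loss([0.0, 0.0]))
```

Unlike the sigmoid cross entropy loss, the quadratic penalty keeps growing with the distance from the target value, so generated samples that are classified correctly but lie far from the target still produce a non-zero gradient.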
In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128 × 128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128 × 128 samples are more than twice as discriminable as artificially resized 32 × 32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.
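The problem of learning from positive and unlabeled data (a.k.a. PU learning) has been studied in a binary (i.e., positive versus negative) classification setting, where the input data consist of (1) observations from the positive class and their corresponding labels, and (2) unlabeled observations from both the positive and negative classes. Generative adversarial networks (GANs) have been used to reduce the problem to the supervised setting, with the advantage that supervised learning has state-of-the-art accuracy in classification tasks. To generate pseudo-negative observations, GANs are trained on positive and unlabeled observations with a modified loss. Using both positive and pseudo-negative observations leads to a supervised learning setting. The generation of pseudo-negative observations that are realistic enough to replace the missing negative class samples is a bottleneck for current GAN-based algorithms. By including an additional classifier in the GAN architecture, we provide a novel GAN-based approach. In our proposed method, the GAN discriminator instructs the generator to produce only samples that fall into the unlabeled data distribution, while a second classifier (observer) network monitors the GAN training to (i) prevent the generated samples from falling into the positive distribution, and (ii) learn the features that are the key distinction between positive and negative observations. Experiments on four image datasets demonstrate that our trained observer network outperforms existing techniques in discriminating between real, unseen positive and negative samples.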
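Compared with classification, segmentation, or object detection with CNNs, the goals and methods of generative networks are fundamentally different. Originally, they were not meant as an image analysis tool, but as a way to produce natural-looking images. The adversarial training paradigm was proposed to stabilize generative methods and has proven highly successful, though by no means on the first attempt. This chapter gives a basic introduction to the motivation for generative adversarial networks (GANs) and traces the path of their success by abstracting the basic task and working mechanism and by deriving the difficulties of early practical approaches. Methods for more stable training are shown, as are typical signs of poor convergence and their causes. Although this chapter focuses on GANs for image generation and image analysis, the adversarial training paradigm itself is not specific to images and also generalizes to tasks in image analysis. Examples of well-known architectures for image semantic segmentation and anomaly detection are presented before GANs are contrasted with further generative modeling approaches that have recently entered the scene. This allows a contextualized view of the limits, but also the benefits, of GANs.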
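We propose a generative adversarial network with multiple discriminators, where each discriminator is specialized to distinguish a subset of the real dataset. This approach facilitates learning a generator that coincides with the underlying data distribution and thus mitigates the chronic mode collapse problem. Taking inspiration from multiple choice learning, we guide each discriminator to have expertise in a subset of the entire data and allow the generator to find reasonable correspondences between the latent and real data spaces automatically, without supervision for the training examples or the number of discriminators. Despite the use of multiple discriminators, the backbone networks are shared across the discriminators, and the increase in training cost is minimized. We demonstrate the effectiveness of our algorithm on standard datasets using multiple evaluation metrics.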
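This paper proposes two important contributions for conditional generative adversarial networks (cGANs) to improve the wide variety of applications that exploit this architecture. The first main contribution is an analysis of cGANs showing that they are not explicitly conditional. In particular, it is shown that the discriminator, and subsequently the cGAN, does not automatically learn the conditionality between inputs. The second contribution is a new method, called a contrario, that explicitly models conditionality for both parts of the adversarial architecture via a novel a contrario loss, which involves training the discriminator to learn unconditional (adverse) examples. This leads to a novel type of data augmentation approach for GANs (a contrario learning), which makes it possible to restrict the search space of the generator to conditional outputs using adverse examples. Extensive experimentation is carried out to evaluate the conditionality of the discriminator by proposing a probability distribution analysis. Comparisons with the cGAN architecture for different applications show significant improvements in performance on well-known datasets, including semantic image synthesis, image segmentation, monocular depth prediction, and "single label"-to-image, using different metrics including Fréchet Inception Distance (FID), mean Intersection over Union (mIoU), Root Mean Square Error log (RMSE log), and Number of statistically-Different Bins (NDB).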
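Synthesizing realistic images from text descriptions is a major challenge in computer vision. Current text-to-image synthesis approaches fall short of producing high-resolution images that represent the text descriptor. Most existing studies rely on either generative adversarial networks (GANs) or variational autoencoders (VAEs). GANs are able to produce sharper images but lack diversity in their outputs, whereas VAEs are good at producing diverse outputs, but the generated images are often blurry. Taking into account the relative advantages of GANs and VAEs, we propose a new conditional VAE (CVAE) and conditional GAN (CGAN) network architecture for synthesizing images conditioned on a text description. This study uses a conditional VAE as an initial generator to produce a high-level sketch of the text descriptor. This high-level sketch output from the first stage, together with the text descriptor, is used as input to the conditional GAN network. The second-stage GAN produces a 256x256 high-resolution image. The proposed architecture benefits from conditioning augmentation and the residual blocks of the conditional GAN network to achieve its results. Multiple experiments were conducted using the CUB and Oxford-102 datasets, and the results of the proposed approach were compared against state-of-the-art techniques such as StackGAN. The experiments show that the proposed method generates high-resolution images conditioned on text descriptions and yields competitive results on both datasets in terms of the Inception and Fréchet Inception scores.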
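This is an introductory machine learning course developed specifically with STEM students in mind. Our goal is to give the interested reader the basics needed to employ machine learning in their own projects and to familiarize them with the terminology as a foundation for further reading of the relevant literature. In these lecture notes, we discuss supervised, unsupervised, and reinforcement learning. The notes start with an exposition of machine learning methods without neural networks, such as principal component analysis, t-SNE, clustering, and linear regression and linear classifiers. We continue with an introduction to both basic and advanced neural network structures, such as dense feed-forward and convolutional neural networks, recurrent neural networks, restricted Boltzmann machines, (variational) autoencoders, and generative adversarial networks. Questions of interpretability are discussed for latent-space representations and by using the examples of dreaming and adversarial attacks. The final section is dedicated to reinforcement learning, where we introduce the basic notions of value functions and policy learning.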
In biomedical image analysis, the applicability of deep learning methods is directly impacted by the quantity of image data available, because deep learning models require large image datasets to deliver high-level performance. Generative Adversarial Networks (GANs) have been widely used to address data limitations through the generation of synthetic biomedical images. GANs consist of two models: the generator, which learns to produce synthetic images based on the feedback it receives, and the discriminator, which classifies an image as synthetic or real and provides feedback to the generator. Throughout the training process, a GAN can experience several technical challenges that impede the generation of suitable synthetic imagery. First, the mode collapse problem, whereby the generator either produces an identical image or produces a uniform image from distinct input features. Second, the non-convergence problem, whereby the gradient descent optimizer fails to reach a Nash equilibrium. Third, the vanishing gradient problem, whereby unstable training behavior occurs because the discriminator achieves optimal classification performance and therefore provides no meaningful feedback to the generator. These problems result in synthetic imagery that is blurry, unrealistic, and less diverse. To date, there has been no survey article outlining the impact of these technical challenges in the context of the biomedical imagery domain. This work presents a review and taxonomy based on solutions to the training problems of GANs in the biomedical imaging domain. This survey highlights important challenges and outlines future research directions for the training of GANs in the domain of biomedical imagery.
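Digital art is an artistic method that uses digital technologies as part of the generative or creative process. With the advent of digital currency and NFTs (non-fungible tokens), the demand for digital art is growing aggressively. In this manuscript, we advocate the use of deep generative networks with adversarial training for stable and varied art generation. The work mainly focuses on the deep convolutional generative adversarial network (DC-GAN) and explores techniques to address common pitfalls in GAN training. We compare various architectures and designs of DC-GANs to arrive at recommended design choices for stable and realistic generation. The main focus of the work is to generate realistic images that do not exist in reality but are synthesized from random noise by the proposed model. We provide visual results of generated animal face images (some showing evidence of species mixing), along with recommendations for training, architecture, and design choices. We also show how preprocessing of the training images plays a significant role in GAN training.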
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), which have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks, demonstrating their applicability as general image representations.
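This work explores three-player game training dynamics: under which conditions three-player games converge, and the equilibria they converge on. In contrast to prior work, we examine a three-player game architecture in which all players explicitly interact with each other. Prior work analyzes games in which three agents interact with only one other player, constituting dual two-player games. We explore three-player game training dynamics using an extended version of a simplified bilinear smooth game, called a simplified trilinear smooth game. We find that trilinear games do not converge on the Nash equilibrium in most cases; instead, they converge on a fixed point that is optimal for two players but not for the third. Further, we explore how the order of the updates influences convergence. In addition to alternating and simultaneous updates, we explore a new update order, maximizer-first, which is only possible in a three-player game. We find that three-player games can converge on a Nash equilibrium using maximizer-first updates. Finally, we experiment with differing momentum values for each player in a trilinear smooth game under all three update orders and show that maximizer-first updates achieve more optimal results over a larger set of player-specific momentum value triads than the other update orders.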
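To make the role of update order concrete, the following sketch compares simultaneous and alternating gradient updates on the familiar two-player bilinear game f(x, y) = x * y, where x minimizes and y maximizes. This is not the three-player trilinear game or the maximizer-first order studied in the work above; the objective, learning rate, starting point, and function names are all assumptions chosen for illustration.

```python
import math

# Two-player bilinear game f(x, y) = x * y (assumed toy objective).
# Player x descends on f, player y ascends on f; the equilibrium is (0, 0).

def simultaneous_step(x, y, lr):
    """Both players update from the same (old) iterate."""
    return x - lr * y, y + lr * x


def alternating_step(x, y, lr):
    """x updates first; y then responds to the already-updated x."""
    x_new = x - lr * y
    y_new = y + lr * x_new
    return x_new, y_new


def run(step, x=1.0, y=1.0, lr=0.1, n_steps=1000):
    """Iterate an update rule and return the final distance to (0, 0)."""
    for _ in range(n_steps):
        x, y = step(x, y, lr)
    return math.hypot(x, y)


sim_dist = run(simultaneous_step)
alt_dist = run(alternating_step)

if __name__ == "__main__":
    print("simultaneous:", sim_dist)
    print("alternating: ", alt_dist)
```

With simultaneous updates the squared distance to the equilibrium is multiplied by (1 + lr^2) at every step, so the iterates spiral outward. With alternating updates the quantity x^2 + y^2 - lr * x * y is preserved, so the iterates stay on a bounded orbit; neither rule converges without further modification such as momentum or extra-gradient steps.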