Although Generative Adversarial Networks achieve state-of-the-art results on a variety of generative tasks, they are regarded as highly unstable and prone to miss modes. We argue that these bad behaviors of GANs are due to the very particular functional shape of the trained discriminators in high dimensional spaces, which can easily make training stuck or push probability mass in the wrong direction, towards that of higher concentration than that of the data generating distribution. We introduce several ways of regularizing the objective, which can dramatically stabilize the training of GAN models. We also show that our regularizers can help the fair distribution of probability mass across the modes of the data generating distribution, during the early phases of training and thus providing a unified solution to the missing modes problem.
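As a concrete illustration of this family of regularizers, here is a minimal sketch assuming an auxiliary encoder E that maps data back to latent space; E, lam1, and lam2 are illustrative names, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def mode_regularizer(x_real, G, E, D, lam1=0.01, lam2=0.01):
    """Sketch of an encoder-based mode regularizer: real data is
    encoded and reconstructed through the generator, so every real
    mode exerts a pull on G. Assumes D outputs a probability."""
    x_rec = G(E(x_real))
    # Pull reconstructions toward the real data (covers the modes E sees).
    recon = F.mse_loss(x_rec, x_real)
    # Reward reconstructions the discriminator considers realistic.
    adv = -torch.log(D(x_rec).clamp_min(1e-8)).mean()
    return lam1 * recon + lam2 * adv
```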
We show that Variational Autoencoders consistently fail to learn marginal distributions in both latent and visible space. We ask whether this is a consequence of matching conditional distributions, or a limitation of explicit model and posterior distributions. We explore the alternatives provided by marginal distribution matching and implicit distributions through the use of Generative Adversarial Networks in variational inference. We perform a large-scale evaluation of several VAE-GAN hybrids and explore the implications of class probability estimation for learning distributions. We conclude that at present VAE-GAN hybrids have limited applicability: they are harder to scale, evaluate, and use for inference compared to VAEs; and they do not improve over the generation quality of GANs.
Generative Adversarial Networks (GANs) have received wide attention in the machine learning field for their potential to learn high-dimensional, complex data distributions. Specifically, they do not rely on any assumptions about the distribution and can generate realistic samples from latent space in a simple manner. This powerful property has led GANs to be applied to various applications such as image synthesis, image attribute editing, image translation, domain adaptation, and other academic fields. In this paper, we discuss the details of GANs for readers who are familiar with them but do not understand them deeply, or who wish to view GANs from various perspectives. In addition, we explain how GANs operate and the fundamental meaning of the various objective functions proposed recently. We then focus on how GANs can be combined with autoencoder frameworks. Finally, for those interested in exploiting GANs in their own research, we enumerate GAN variants that are applied to various tasks and other domains.
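For reference, the variants surveyed all start from the original GAN minimax objective, in which a discriminator D and a generator G play a two-player game over the data distribution p_data and the latent prior p_z:

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```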
We introduce an effective training algorithm for Generative Adversarial Networks (GANs) that alleviates mode collapse and gradient vanishing. In our system, we constrain the generator by an autoencoder (AE). We propose to consider the reconstructed samples from the AE as "real" samples for the discriminator. This couples the convergence of the AE with that of the discriminator, effectively slowing down the discriminator's convergence and reducing gradient vanishing. Importantly, we propose two novel distance constraints to improve the generator. First, we propose a latent-data distance constraint to enforce compatibility between latent sample distances and the corresponding data sample distances; we use this constraint to explicitly prevent the generator from mode collapse. Second, we propose a discriminator-score distance constraint to align the distribution of the generated samples with that of the real samples through the discriminator's scores; we use this constraint to guide the generator to synthesize samples that resemble the real ones. Our proposed GAN with these distance constraints, namely Dist-GAN, achieves better results than state-of-the-art methods on benchmark datasets: synthetic, MNIST, MNIST-1K, CelebA, CIFAR-10, and STL-10. Our code is published at https://github.com/tntrung/gan for research purposes.
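A minimal sketch of what a latent-data distance constraint could look like, assuming latent pairs (z1, z2) drawn from the prior; this is an illustrative form only, and the paper's exact formulation may differ:

```python
import torch

def latent_data_distance_loss(z1, z2, G):
    """Encourage distances between generated samples to track the
    distances between their latent codes, so distinct codes cannot
    all collapse onto a single output mode."""
    d_x = (G(z1) - G(z2)).flatten(1).norm(dim=1)
    d_z = (z1 - z2).flatten(1).norm(dim=1)
    # Penalize mismatch between the two distance scales.
    return (d_x - d_z).pow(2).mean()
```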
This paper proposes the divergence triangle as a framework for the joint training of a generator model, an energy-based model, and an inference model. The divergence triangle is a compact and symmetric (anti-symmetric) objective function that seamlessly integrates variational learning, adversarial learning, the wake-sleep algorithm, and contrastive divergence in a unified probabilistic formulation. This unification makes the processes of sampling, inference, and energy evaluation readily available without the need for costly Markov chain Monte Carlo methods. Our experiments demonstrate that the divergence triangle is capable of learning (1) an energy-based model with a well-formed energy landscape, (2) direct sampling in the form of a generator network, and (3) feed-forward inference that faithfully reconstructs observed as well as synthesized data. The divergence triangle is a robust training method that can learn from incomplete data.
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide, to a greater or lesser extent, the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms that implement such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.
We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution of the Markov chain is conditioned on the previous state and generally involves a small move, so this conditional distribution has fewer dominant modes and is unimodal in the limit of small moves. It is thus easier to learn, because approximating its partition function is easier; learning it is more like performing supervised function approximation, with gradients that can be obtained by backprop. We provide theorems that generalize recent work on the probabilistic interpretation of denoising auto-encoders, and obtain interesting justifications for dependency networks and generalized pseudolikelihood, along with a definition of an appropriate joint distribution and sampling mechanism even when the conditionals are not consistent. GSNs can be used with missing inputs and can be used to sample subsets of variables given the rest. We validate these theoretical results with experiments on two image datasets, using an architecture that mimics the Deep Boltzmann Machine Gibbs sampler but allows training to proceed with simple backprop, without the need for layerwise pretraining.
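A minimal sketch of sampling from such a chain, assuming a trained denoiser for P_theta(x | x_tilde) and a corruption sampler for C(x_tilde | x) are both given:

```python
import torch

@torch.no_grad()
def gsn_sample(x0, corrupt, denoise, steps=100):
    """Run the learned Markov chain: corrupt the current state, then
    sample a reconstruction. `corrupt` draws from C(x_tilde | x),
    e.g. Gaussian noise injection; `denoise` samples from
    P_theta(x | x_tilde)."""
    x, chain = x0, [x0]
    for _ in range(steps):
        x = denoise(corrupt(x))
        chain.append(x)
    return chain
```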
We investigate adversarial learning in the case where one only has access to an unnormalized form of the density, rather than samples. With the insights so gained, adversarial learning is extended to the setting in which one has access to an unnormalized form u(x) of the target density function, but no samples from it. Further, new concepts in GAN regularization are developed, based on learning from samples or from u(x). The proposed methods are compared to alternative approaches, with encouraging results demonstrated across a range of applications, including deep soft Q-learning.
We propose two new techniques for training Generative Adversarial Networks (GANs). Our objectives are to alleviate mode collapse in GANs and to improve the quality of the generated samples. First, we propose neighbor embedding, a manifold learning-based regularization that explicitly retains local structure in the generated samples. This prevents the generator from producing nearly identical data samples from different latent samples, and reduces mode collapse. We propose an inverse t-SNE regularizer to achieve this. Second, we propose a new technique, gradient matching, to align the distributions of generated samples and real samples. As it is challenging to work with high-dimensional sample distributions, we propose to align these distributions through the scalar discriminator scores. We constrain the difference between the discriminator scores of real and generated samples, and further constrain the difference between the gradients of these discriminator scores. We derive these constraints from a Taylor approximation of the discriminator function. Our experiments demonstrate that the proposed techniques are computationally simple and easy to incorporate into existing systems. When gradient matching and neighbor embedding are applied together, our GN-GAN achieves outstanding results on 1D/2D synthetic, CIFAR-10, and STL-10 datasets, e.g., an FID score of 30.80 on the STL-10 dataset. Our code is available at https://github.com/tntrung/gan
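A rough sketch of the gradient-matching idea, assuming a scalar-output discriminator D; the gradient constraint is simplified here to matching gradient norms, which is weaker than the paper's formulation:

```python
import torch

def gradient_matching_loss(D, x_real, x_fake):
    """Penalize the difference between the discriminator's scores on
    real vs. generated samples, and between the gradients of those
    scores. Expects x_real.requires_grad_() to have been called and
    x_fake = G(z) to be left attached to the generator's graph."""
    s_real, s_fake = D(x_real), D(x_fake)
    l_score = (s_real.mean() - s_fake.mean()).pow(2)
    g_real = torch.autograd.grad(s_real.sum(), x_real, create_graph=True)[0]
    g_fake = torch.autograd.grad(s_fake.sum(), x_fake, create_graph=True)[0]
    l_grad = (g_real.flatten(1).norm(dim=1).mean()
              - g_fake.flatten(1).norm(dim=1).mean()).pow(2)
    return l_score + l_grad
```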
The representation of the posterior is a critical aspect of effective variational autoencoders (VAEs). Poor choices of the posterior, due to the mismatch with the true posterior, have a detrimental impact on the generative performance of VAEs. We extend the class of posterior models that can be learned by using undirected graphical models. We develop an efficient method for training undirected posteriors by showing that the gradient of the training objective with respect to the parameters of the undirected posterior can be computed by backpropagation through Markov chain Monte Carlo updates. We apply these gradient estimators to train discrete VAEs with Boltzmann machine posteriors, and demonstrate that undirected models outperform previous results obtained using directed graphical models as posteriors.
Despite the advances in the representational capacity of approximate distributions for variational inference, the optimization process can still limit the density that is ultimately learned. We demonstrate the drawbacks of biasing the true posterior to be unimodal, and introduce Annealed Variational Objectives (AVO) into the training of hierarchical variational methods. Inspired by annealed importance sampling, the proposed method facilitates learning by incorporating energy tempering into the optimization objective. In our experiments, we demonstrate our method's robustness to deterministic warm-up, and the benefits of encouraging exploration in the latent space.
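The energy tempering borrowed from annealed importance sampling interpolates between a simple initial distribution p_0 and the target posterior p_1 through a sequence of intermediate targets, a standard construction that AVO assigns to the stochastic layers of the hierarchy:

```latex
\tilde{p}_t(z) \;\propto\; p_0(z)^{\,1-\beta_t}\, p_1(z)^{\,\beta_t},
\qquad 0 = \beta_0 < \beta_1 < \dots < \beta_T = 1
```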
Recent work has shown how denoising and contractive autoencoders implicitly capture the structure of the data-generating density, in the case where the corruption noise is Gaussian, the reconstruction error is the squared error, and the data is continuous-valued. This has led to various proposals for sampling from this implicitly learned density function, using Langevin and Metropolis-Hastings MCMC. However, it remained unclear how to connect the training procedure of regularized auto-encoders to the implicit estimation of the underlying data-generating distribution when the data are discrete, or using other forms of corruption process and reconstruction errors. Another issue is the mathematical justification which is only valid in the limit of small corruption noise. We propose here a different attack on the problem, which deals with all these issues: arbitrary (but noisy enough) corruption, arbitrary reconstruction loss (seen as a log-likelihood), handling both discrete and continuous-valued variables, and removing the bias due to non-infinitesimal corruption noise (or non-infinitesimal contractive penalty).
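Concretely, the training criterion reduces to a denoising log-likelihood under an arbitrary (but sufficiently noisy) corruption process C:

```latex
\mathcal{L}(\theta) \;=\;
  \mathbb{E}_{x \sim \mathcal{P}(x)}\,
  \mathbb{E}_{\tilde{x} \sim \mathcal{C}(\tilde{x} \mid x)}
  \big[\log P_\theta(x \mid \tilde{x})\big]
```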
We propose an augmented training procedure for generative adversarial networks designed to address shortcomings of the original by directing the generator towards probable configurations of abstract discriminator features. We estimate and track the distribution of these features, as computed from data, with a denoising auto-encoder, and use it to propose high-level targets for the generator. We combine this new loss with the original and evaluate the hybrid criterion on the task of unsupervised image synthesis from datasets comprising a diverse set of visual categories, noting a qualitative and quantitative improvement in the "objectness" of the resulting samples.
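A minimal sketch of the added generator term, assuming `features` truncates the discriminator at a high-level layer and `dae` is the denoising auto-encoder fit to those features on real data; detaching the target is a simplification:

```python
import torch

def denoising_feature_matching_loss(x_fake, features, dae):
    """The denoiser maps a feature vector toward more probable
    configurations under the tracked feature distribution, so the
    squared distance to its output serves as a high-level target
    for the generator."""
    h = features(x_fake)
    with torch.no_grad():
        target = dae(h)  # denoised, i.e. more probable, features
    return (h - target).pow(2).flatten(1).sum(dim=1).mean()
```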
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.
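A simplified sketch of one sampling step, with the learned prior replaced by a crude Gaussian stand-in; the three epsilon terms follow the paper's prior/condition/noise update structure, but their values here are illustrative:

```python
import torch

def ppgn_update(z, G, C, class_idx, eps1=1e-5, eps2=1.0, eps3=1e-8):
    """One latent-space step: move along the gradient of the condition
    network's class score, plus a prior term and noise. The actual
    PPGNs use a denoising auto-encoder based prior; -z below is a
    crude Gaussian-prior stand-in."""
    z = z.detach().requires_grad_(True)
    score = C(G(z))[:, class_idx].sum()       # condition network logits
    grad_cond = torch.autograd.grad(score, z)[0]
    z = z + eps1 * (-z) + eps2 * grad_cond + eps3 * torch.randn_like(z)
    return z.detach()
```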
We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution. WAE minimizes a penalized form of the Wasserstein distance between the model distribution and the target distribution, which leads to a different regularizer than the one used by the Variational Auto-Encoder (VAE) [1]. This regularizer encourages the encoded training distribution to match the prior. We compare our algorithm with several other techniques and show that it is a generalization of adversarial auto-encoders (AAE) [2]. Our experiments show that WAE shares many of the properties of VAEs (stable training, encoder-decoder architecture, nice latent manifold structure) while generating samples of better quality, as measured by the FID score.
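A minimal sketch of the MMD-penalized instantiation of that regularizer, assuming a standard Gaussian prior and using a biased MMD estimator for brevity:

```python
import torch

def imq(a, b, c=1.0):
    # Inverse multiquadratic kernel, a common choice for WAE-MMD.
    return c / (c + torch.cdist(a, b).pow(2))

def wae_mmd_loss(x, encoder, decoder, lam=10.0):
    """Reconstruction cost plus an MMD penalty pushing the aggregate
    encoded distribution toward the prior (here N(0, I))."""
    z = encoder(x)
    z_p = torch.randn_like(z)  # samples from the prior
    recon = (decoder(z) - x).pow(2).flatten(1).sum(dim=1).mean()
    mmd = imq(z, z).mean() + imq(z_p, z_p).mean() - 2 * imq(z, z_p).mean()
    return recon + lam * mmd
```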
Existing Markov Chain Monte Carlo (MCMC) methods are either based on general-purpose and domain-agnostic schemes, which can lead to slow convergence, or problem-specific proposals hand-crafted by an expert. In this paper, we propose A-NICE-MC, a novel method to automatically design efficient Markov chain kernels tailored for a specific domain. First, we propose an efficient likelihood-free adversarial training method to train a Markov chain and mimic a given data distribution. Then, we leverage flexible volume preserving flows to obtain parametric kernels for MCMC. Using a bootstrap approach, we show how to train efficient Markov chains to sample from a prescribed posterior distribution by iteratively improving the quality of both the model and the samples. Empirical results demonstrate that A-NICE-MC combines the strong guarantees of MCMC with the expressiveness of deep neural networks, and is able to significantly outperform competing methods such as Hamiltonian Monte Carlo.
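A simplified sketch of one step of such a kernel: because the proposal flow is volume preserving (unit Jacobian), the Metropolis-Hastings acceptance ratio needs no Jacobian correction. Here `flow` and `log_p` are assumed given, and x has shape (batch, dim):

```python
import torch

@torch.no_grad()
def nice_mc_step(x, log_p, flow):
    """One MH step with a learned volume-preserving proposal; v is an
    auxiliary Gaussian variable absorbed into the target density."""
    v = torch.randn_like(x)
    x_new, v_new = flow(x, v)
    log_a = (log_p(x_new) - 0.5 * v_new.pow(2).sum(dim=1)) \
          - (log_p(x)     - 0.5 * v.pow(2).sum(dim=1))
    accept = torch.rand_like(log_a).log() < log_a
    return torch.where(accept.unsqueeze(-1), x_new, x)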
We introduce a method to stabilize Generative Adversarial Networks (GANs) by defining the generator objective with respect to an unrolled optimization of the discriminator. This allows training to be adjusted between using the optimal discriminator in the generator's objective, which is ideal but infeasible in practice, and using the current value of the discriminator, which is often unstable and leads to poor solutions. We show how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator.
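A simplified sketch of the idea; note that the full method also differentiates through the K unrolled updates, which copying the discriminator, as done here, deliberately omits. `discriminator_step` is a hypothetical callback performing one optimizer update:

```python
import copy
import torch

def unrolled_generator_loss(G, D, discriminator_step, z, K=5):
    """Evaluate the generator against a discriminator that has been
    unrolled K update steps into the future, interpolating between
    the current and the (unreachable) optimal discriminator."""
    D_k = copy.deepcopy(D)
    for _ in range(K):
        discriminator_step(D_k)  # one SGD/Adam update of the copy
    return -torch.log(D_k(G(z)).clamp_min(1e-8)).mean()
```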
Generative adversarial networks (GANs) are innovative techniques for learning generative models of complex data distributions from samples. Despite remarkable recent improvements in generating realistic images, one of their major shortcomings is that, in practice, they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the main focus of several recent advances in GANs. Yet little is understood about why mode collapse happens, and whether the methods proposed so far actually mitigate it. We propose a principled approach to handling mode collapse, which we call packing. The main idea is to have the discriminator make decisions based on multiple samples from the same class, either real or artificially generated. We borrow analysis tools from binary hypothesis testing, in particular the seminal result of Blackwell [Bla53], to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets show that packing also provides significant improvements in practice.
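A minimal sketch of packing, assuming image tensors of shape (batch, channels, height, width); the packed discriminator must accept m times the original channel count:

```python
import torch

def pack(x, m=2):
    """Concatenate m samples along the channel axis so the
    discriminator judges them jointly rather than one at a time."""
    b, c, h, w = x.shape
    assert b % m == 0, "batch size must be divisible by the packing degree"
    return x.reshape(b // m, m * c, h, w)
```

A collapsed generator tends to produce packed tuples of near-identical samples, which the packed discriminator learns to reject.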
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although domain knowledge can be used to help design representations, learning can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, manifold learning, and deep learning. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.
Many important data analysis applications involve datasets that are severely imbalanced with respect to the target variable. A typical example is medical image analysis, where positive samples are scarce while performance is commonly estimated by the correct detection of these positive instances. We approach this challenge by framing the problem as anomaly detection with generative models. We train a generative model without supervision on the "negative" (common) data points and use this model to estimate the likelihood of unseen data. A successful model allows us to detect the "positive" cases as low-likelihood data points. In this position paper, we present the use of state-of-the-art deep generative models (GANs and VAEs) for estimating the likelihood of the data. Our results show that, on the one hand, both GANs and VAEs are able to separate the "positive" and "negative" samples in the MNIST case. On the other hand, for the NLST case, neither GANs nor VAEs were able to capture the complexity of the data and discriminate anomalies at the level that this task requires. These results show that even though a number of successes have been reported in the literature on using generative models in similar applications, further challenges remain for their broader successful deployment.
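A minimal sketch of the detection rule, assuming some model-derived `log_likelihood` scorer (e.g., a VAE's ELBO used as a proxy) and a threshold chosen on validation data:

```python
import torch

@torch.no_grad()
def flag_positives(x, log_likelihood, threshold):
    """Score unseen data under a model trained only on 'negative'
    (common) data; low-likelihood points are flagged as the rare
    'positive' cases."""
    return log_likelihood(x) < threshold
```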