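Learning disentangled representations requires either supervision or the introduction of specific model designs and learning constraints as biases. InfoGAN is a popular disentanglement framework that learns unsupervised disentangled representations by maximizing the mutual information between latent representations and their corresponding generated images. Maximization of the mutual information is achieved by introducing an auxiliary network and training with a latent regression loss. In this short exploratory paper, we study the use of the Hilbert-Schmidt Independence Criterion (HSIC) to approximate the mutual information between latent representation and image, an approach termed HSIC-InfoGAN. Directly optimizing the HSIC loss avoids the need for an additional auxiliary network. We qualitatively compare the level of disentanglement in each model, suggest a strategy for tuning the hyperparameters of HSIC-InfoGAN, and discuss the potential of HSIC-InfoGAN for medical applications.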
translated by Google Translate
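To make the HSIC dependence measure named in the HSIC-InfoGAN abstract concrete, here is a minimal sketch of the standard biased empirical estimator, trace(KHLH)/(n-1)^2, with Gaussian kernels. The function names, the kernel choice and the bandwidths are illustrative assumptions; this is not the paper's training loss or code.

```python
import math
import random


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a list of vectors."""
    n = len(xs)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(xs[i], xs[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples xs and ys."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma_x))
    l_centered = center(gaussian_gram(ys, sigma_y))
    # trace(K H L H) equals the elementwise product sum of the centered matrices.
    total = sum(k_centered[i][j] * l_centered[i][j]
                for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Toy check: a nonlinear function of the code vs. an unrelated sample.
rng = random.Random(0)
codes = [[rng.gauss(0, 1)] for _ in range(100)]
dependent = [[c[0] ** 2 + 0.1 * rng.gauss(0, 1)] for c in codes]
independent = [[rng.gauss(0, 1)] for _ in range(100)]
hsic_dep = hsic(codes, dependent)
hsic_ind = hsic(codes, independent)
```

In this toy setup the dependent pair should score noticeably higher than the independent pair. In a GAN setting one would compute the same quantity on batches of latent codes and generated images with a differentiable framework, which is beyond this sketch.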
This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods.
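The lower bound mentioned in the InfoGAN abstract can be written out as follows. The symbols are the usual ones for this model and are assumptions here, since the abstract does not define them: $c$ is the structured latent code, $z$ the noise, $G$ the generator, $D$ the discriminator, $Q$ an auxiliary distribution approximating the posterior of $c$ given $x$, and $\lambda$ a weighting hyperparameter.

```latex
\begin{aligned}
I\bigl(c;\, G(z, c)\bigr)
  &\ge \mathbb{E}_{c \sim P(c),\; x \sim G(z, c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
  \;=:\; L_I(G, Q), \\
\min_{G,\, Q}\; \max_{D}\; V_{\mathrm{InfoGAN}}(D, G, Q)
  &= V(D, G) - \lambda\, L_I(G, Q).
\end{aligned}
```

Because $H(c)$ is constant for a fixed code prior, maximizing $L_I$ amounts to training $Q$ to recover the code from the generated sample.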
We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation. We propose FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions. We show that it improves upon β-VAE by providing a better trade-off between disentanglement and reconstruction quality. Moreover, we highlight the problems of a commonly used disentanglement metric and introduce a new metric that does not suffer from them.
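FactorVAE's penalty compares the aggregate posterior with the product of its marginals. Samples from that product are commonly obtained by shuffling each latent dimension independently across a batch. The sketch below shows only this permutation step; the function name and the list-of-lists batch layout are assumptions for illustration, and the discriminator that would consume these samples is omitted.

```python
import random


def permute_dims(batch, rng=random):
    """Independently shuffle each latent dimension across the batch.

    batch: list of n latent vectors, each a list of d floats.
    Returns a new batch whose columns are permutations of the input columns,
    which breaks dependence between dimensions while keeping each marginal.
    """
    n, d = len(batch), len(batch[0])
    permuted = [[0.0] * d for _ in range(n)]
    for j in range(d):
        order = list(range(n))
        rng.shuffle(order)  # a fresh permutation for every dimension
        for i, src in enumerate(order):
            permuted[i][j] = batch[src][j]
    return permuted


latents = [[float(i), float(10 * i)] for i in range(6)]
shuffled = permute_dims(latents, random.Random(0))
```

A classifier trained to tell `latents` from `shuffled` gives a density-ratio estimate of the total correlation, which is the quantity the penalty targets.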
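Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision. A good general representation can be fine-tuned for new target tasks using modest amounts of data, or used directly in unseen domains to achieve remarkable performance on the corresponding task. This easing of data and annotation requirements offers tantalizing prospects for applications in computer vision and healthcare. In this tutorial paper, we motivate the need for disentangled representations, present key theory, and detail practical building blocks and criteria for learning such representations. We discuss applications in medical imaging and computer vision, emphasizing the choices made in exemplar key works. We conclude by presenting remaining challenges and opportunities.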
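Negative-free contrastive learning has attracted a lot of attention for its simplicity and impressive performance in large-scale pretraining. However, its disentanglement property remains unexplored. In this paper, we take different negative-free contrastive learning methods to study the disentanglement property of this family of self-supervised methods. We find that existing disentanglement metrics fail to make meaningful measurements for high-dimensional representation models, so we propose a new disentanglement metric based on the mutual information between representation factors and data factors. With the proposed metric, we benchmark the disentanglement property of negative-free contrastive learning for the first time, on both popular synthetic datasets and the real-world dataset CelebA. Our study shows that the investigated methods can learn a well-disentangled subset of the representation. We extend the study of disentangled representation learning to high-dimensional representation spaces and to negative-free contrastive learning for the first time. The implementation of the proposed metric is available at \url{https://github.com/noahcao/disentangeslement_lib_med}.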
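Correlations between factors of variation are prevalent in real-world data. Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data. However, such correlations are often not robust (e.g., they may change between domains, datasets, or applications), and we wish to avoid exploiting them. Disentanglement methods aim to learn representations that capture different factors of variation in latent subspaces. A common approach involves minimizing the mutual information between latent subspaces, such that each encodes a single underlying attribute. However, this fails when attributes are correlated. We solve this problem by enforcing independence between subspaces conditioned on the available attributes, which allows us to remove only those dependencies that are not due to the correlation structure present in the training data. We achieve this via an adversarial approach that minimizes the conditional mutual information (CMI) between subspaces with respect to categorical variables. We first show theoretically that CMI minimization is a good objective for robust disentanglement on linear problems with Gaussian data. We then apply our method to real-world datasets based on MNIST and CelebA, and show that it yields models that are disentangled and robust under correlation shift, including in weakly supervised settings.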
The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first theoretically show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. Then, we train more than 12 000 models covering most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.
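Disentangled and invariant representations are two key goals of representation learning, and many approaches have been proposed to achieve one or the other. However, the two goals are in fact complementary, so we propose a framework that accomplishes both simultaneously. We introduce a weakly supervised signal to learn a disentangled representation that consists of three splits containing predictive, known nuisance, and unknown nuisance information, respectively. Furthermore, we incorporate a contrastive method to enforce representation invariance. Experiments show that the proposed method outperforms state-of-the-art (SOTA) methods on four standard benchmarks, and that it can offer better adversarial defense than other methods without adversarial training.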
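We present a framework for learning CapsNet under an information bottleneck constraint, which distills information into a compact form and encourages the learning of interpretable, disentangled capsules. In our $\beta$-CapsNet framework, the hyperparameter $\beta$ is used to trade off disentanglement against other tasks, and variational inference is used to convert the information bottleneck term into a KL divergence that is approximated as a constraint on the capsules. For supervised learning, a class-independent mask vector is used to understand the types of synthesized variation irrespective of image class, and we carry out extensive quantitative and qualitative experiments, tuning the parameter $\beta$, to work out the relationship among disentanglement, reconstruction, and detail performance. In addition, an unsupervised $\beta$-CapsNet and a corresponding dynamic routing algorithm are proposed to learn disentangled capsules in an unsupervised manner. Extensive empirical evaluations suggest that our $\beta$-CapsNet achieves state-of-the-art disentanglement performance compared with CapsNet and various baselines on several complex datasets, in both supervised and unsupervised scenarios.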
We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate the β-TCVAE (Total Correlation Variational Autoencoder) algorithm, a refinement and plug-in replacement of the β-VAE for learning disentangled representations, requiring no additional hyperparameters during training. We further propose a principled classifier-free measure of disentanglement called the mutual information gap (MIG). We perform extensive quantitative and qualitative experiments, in both restricted and non-restricted settings, and show a strong relation between total correlation and disentanglement, when the model is trained using our framework.
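The mutual information gap (MIG) named in the β-TCVAE abstract averages, over ground-truth factors, the entropy-normalized difference between the two latent dimensions that share the most mutual information with each factor. The sketch below assumes the mutual information values and factor entropies have already been estimated; the function name and the matrix layout are illustrative choices, not the authors' code.

```python
def mutual_information_gap(mi, factor_entropies):
    """Compute MIG from precomputed statistics.

    mi[j][k]: estimated mutual information between latent j and factor k.
    factor_entropies[k]: entropy of factor k, used for normalization.
    """
    gaps = []
    for k, entropy in enumerate(factor_entropies):
        # Sort the latents by how informative they are about factor k.
        column = sorted((row[k] for row in mi), reverse=True)
        gaps.append((column[0] - column[1]) / entropy)
    return sum(gaps) / len(gaps)


# Two latents, two factors: each factor is captured by a single latent.
example_mi = [[1.0, 0.0],
              [0.0, 0.5]]
example_mig = mutual_information_gap(example_mi, [1.0, 1.0])
```

A score near 1 means each factor is captured almost entirely by one latent dimension. A score near 0 means at least two latents are about equally informative about each factor.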
A grand goal in deep learning research is to learn representations capable of generalizing across distribution shifts. Disentanglement is one promising direction aimed at aligning a model's representations with the underlying factors generating the data (e.g. color or background). Existing disentanglement methods, however, rely on an often unrealistic assumption: that factors are statistically independent. In reality, factors (like object color and shape) are correlated. To address this limitation, we propose a relaxed disentanglement criterion - the Hausdorff Factorized Support (HFS) criterion - that encourages a factorized support, rather than a factorial distribution, by minimizing a Hausdorff distance. This allows for arbitrary distributions of the factors over their support, including correlations between them. We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks, even under severe training correlations and correlation shifts, in parts with over 60% relative improvement over existing disentanglement methods. In addition, we find that leveraging HFS for representation learning can even facilitate transfer to downstream tasks such as classification under distribution shifts. We hope our original approach and positive empirical results inspire further progress on the open problem of robust generalization.
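The idea of a factorized support can be illustrated on a finite sample. Build the Cartesian product of the values seen along each latent dimension, then measure how far the worst product point lies from any observed point. The observed points are a subset of the product, so the Hausdorff distance reduces to this one-sided quantity. The sketch is a brute-force illustration under that assumption, with a hypothetical function name; it is exponential in the number of dimensions and is not the approximation used for training in the paper.

```python
import itertools
import math


def factorized_support_gap(points):
    """Hausdorff distance between observed points and their product support.

    points: list of equal-length coordinate tuples (a finite sample).
    Returns 0.0 exactly when every combination of per-dimension values
    that occurs in the sample is itself an observed point.
    """
    dims = len(points[0])
    axes = [sorted(set(p[j] for p in points)) for j in range(dims)]
    worst = 0.0
    for candidate in itertools.product(*axes):
        nearest = min(math.dist(candidate, p) for p in points)
        worst = max(worst, nearest)
    return worst


full_grid = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
diagonal_only = [(0.0, 0.0), (1.0, 1.0)]
gap_grid = factorized_support_gap(full_grid)
gap_diagonal = factorized_support_gap(diagonal_only)
```

The full grid has a gap of zero however the four points are weighted, which is why this criterion tolerates correlated factors. The diagonal sample leaves the off-diagonal corners uncovered and gives a positive gap.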
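Variational autoencoders (VAEs) have recently been used for unsupervised disentanglement learning of complex density distributions. Numerous variants exist that encourage disentanglement in the latent space while improving reconstruction. However, none has simultaneously managed the trade-off between attaining extremely low reconstruction error and a high disentanglement score. We present a generalized framework that handles this challenge under constrained optimization and demonstrate that it outperforms existing state-of-the-art models while balancing reconstruction. We introduce three controllable Lagrangian hyperparameters to control the reconstruction loss, the KL divergence loss, and a correlation measure. We prove that maximizing information in the reconstruction network is equivalent to information maximization during amortized inference under reasonable assumptions and constraint relaxation.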
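Safely deploying machine learning models in the real world is often a challenging process. Models trained on data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere, agents trained in simulation can struggle to adapt when deployed in the real world or in novel environments, and neural networks fit to a subset of the population may carry some selection bias into their decision process. In this work, we describe the problem of data shift from a novel information-theoretic perspective by (i) identifying and describing the different sources of error, and (ii) comparing some of the most promising objectives explored in the recent domain generalization and fair classification literature. From our theoretical analysis and empirical evaluation, we conclude that the model selection procedure needs to be guided by careful consideration of the observed data, the factors used for correction, and the structure of the data-generating process.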
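Diffusion-based methods, represented as stochastic differential equations on a continuous-time domain, have recently proven successful as non-adversarial generative models. Training such models relies on denoising score matching, which can be seen as multi-scale denoising autoencoding. Here, we augment the denoising score matching framework to enable representation learning without any supervised signal. GANs and VAEs learn representations by directly transforming latent codes into data samples. In contrast, the diffusion-based representation learning introduced here relies on a new formulation of the denoising score matching objective and thus encodes the information needed for denoising. We illustrate how this difference allows manual control of the level of detail encoded in the representation. Using the same approach, we propose to learn an infinite-dimensional latent code that improves on state-of-the-art models in semi-supervised image classification. We also compare the quality of the representations learned by diffusion score matching with that of other methods, such as autoencoders and contrastively trained systems, through their performance on downstream tasks.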
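Learning fair representations is crucial for achieving fairness or debiasing sensitive information. Most existing works rely on adversarial representation learning to inject some invariance into the representation. However, adversarial learning methods are known to suffer from relatively unstable training, which may harm the balance between fairness and predictiveness of the representation. We propose a new approach, learning FAir Representation via distributional CONtrastive Variational AutoEncoder (FarconVAE), which induces the latent space to be disentangled into sensitive and non-sensitive parts. We first construct pairs of observations that have different sensitive attributes but the same labels. FarconVAE then forces the non-sensitive latents to be close to each other, while the sensitive latents are pushed far from each other and also far from the non-sensitive latents, by contrasting their distributions. We provide a new type of contrastive loss, motivated by Gaussian and Student-t kernels, for distributional contrastive learning, together with a theoretical analysis. In addition, we adopt a new swap-reconstruction loss to further boost disentanglement. FarconVAE shows superior performance on fairness, pretrained model debiasing, and domain generalization tasks across various modalities, including tabular, image, and text data.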
This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation's suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.
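A new bimodal generative model is proposed for generating conditional and joint samples, together with a training method that learns a succinct bottleneck representation. The proposed model, dubbed the variational Wyner model, is designed based on two classical problems in network information theory (distributed simulation and channel synthesis), in which Wyner's common information arises as the fundamental limit on the succinctness of the common representation. The model is trained by minimizing the symmetric Kullback-Leibler divergence between the variational and model distributions, with regularization terms for common information, reconstruction consistency, and latent space matching, which is carried out via an adversarial density ratio estimation technique. The utility of the proposed approach is demonstrated through experiments on joint and conditional generation with synthetic and real-world datasets, as well as on a challenging zero-shot image retrieval task.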
The combination of machine learning models with physical models is a recent research path to learn robust data representations. In this paper, we introduce p$^3$VAE, a generative model that integrates a perfect physical model which partially explains the true underlying factors of variation in the data. To fully leverage our hybrid design, we propose a semi-supervised optimization procedure and an inference scheme that comes with meaningful uncertainty estimates. We apply p$^3$VAE to the semantic segmentation of high-resolution hyperspectral remote sensing images. Our experiments on a simulated data set demonstrate the benefits of our hybrid model over conventional machine learning models in terms of extrapolation capabilities and interpretability. In particular, we show that p$^3$VAE naturally has high disentanglement capabilities. Our code and data have been made publicly available at https://github.com/Romain3Ch216/p3VAE.
Self-supervised learning is a popular and powerful method for utilizing large amounts of unlabeled data, for which a wide variety of training objectives have been proposed in the literature. In this study, we perform a Bayesian analysis of state-of-the-art self-supervised learning objectives and propose a unified formulation based on likelihood learning. Our analysis suggests a simple method for integrating self-supervised learning with generative models, allowing for the joint training of these two seemingly distinct approaches. We refer to this combined framework as GEDI, which stands for GEnerative and DIscriminative training. Additionally, we demonstrate an instantiation of the GEDI framework by integrating an energy-based model with a cluster-based self-supervised learning model. Through experiments on synthetic and real-world data, including SVHN, CIFAR10, and CIFAR100, we show that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a wide margin. We also demonstrate that GEDI can be integrated into a neural-symbolic framework to address tasks in the small data regime, where it can use logical constraints to further improve clustering and classification performance.
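Deep neural networks have demonstrated their ability to automatically extract meaningful features from data. However, in supervised learning, information that is specific to the dataset used for training but irrelevant to the task at hand may remain encoded in the extracted representations. This residual information introduces a domain-specific bias that weakens generalization performance. In this work, we propose splitting the information into a task-related representation and its complementary context representation. We propose an original method, combining adversarial feature predictors and cyclic reconstruction, to disentangle these two representations in the single-domain supervised case. We then adapt this method to the unsupervised domain adaptation problem, which consists of training a model capable of performing on both a source and a target domain. In particular, our method promotes disentanglement in the target domain despite the absence of training labels. This makes it possible to isolate task-specific information from both domains and project it into a common representation. The task-specific representation allows knowledge acquired from the source domain to be transferred efficiently to the target domain. In the single-domain case, we demonstrate the quality of our representations on information retrieval tasks and the generalization benefits induced by sharpened task-specific representations. We then validate the proposed method on several classical domain adaptation benchmarks and illustrate the benefits of disentanglement for domain adaptation.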