类别不平衡发生在许多实际应用程序中,包括图像分类,其中每个类中的图像数量显着不同。通过不平衡数据,生成的对抗网络(GANS)倾向于多数类样本。最近的两个方法,平衡GaN(Bagan)和改进的Bagan(Bagan-GP)被提出为增强工具来处理此问题并将余额恢复到数据。前者以无人监督的方式预先训练自动化器权重。但是,当来自不同类别的图像具有类似的特征时,它是不稳定的。后者通过促进监督的自动化培训培训,基于蒲甘进行改善,但预先培训偏向于多数阶级。在这项工作中,我们提出了一种新颖的条件变形式自动化器,具有用于生成的对抗性网络(CAPAN)的平衡训练,作为生成现实合成图像的增强工具。特别是,我们利用条件卷积改变自动化器,为GaN初始化和梯度惩罚培训提供了监督和平衡的预培训。我们所提出的方法在高度不平衡版本的MNIST,时尚 - MNIST,CIFAR-10和两个医学成像数据集中呈现出卓越的性能。我们的方法可以在FR \'回路截止距离,结构相似性指数测量和感知质量方面综合高质量的少数民族样本。
translated by 谷歌翻译
我们提出了一种具有多个鉴别器的生成的对抗性网络,其中每个鉴别者都专门用于区分真实数据集的子集。这种方法有助于学习与底层数据分布重合的发电机,从而减轻慢性模式崩溃问题。从多项选择学习的灵感来看,我们引导每个判别者在整个数据的子集中具有专业知识,并允许发电机在没有监督训练示例和鉴别者的数量的情况下自动找到潜伏和真实数据空间之间的合理对应关系。尽管使用多种鉴别器,但骨干网络在鉴别器中共享,并且培训成本的增加最小化。我们使用多个评估指标展示了我们算法在标准数据集中的有效性。
translated by 谷歌翻译
Supervised classification methods have been widely utilized for the quality assurance of the advanced manufacturing process, such as additive manufacturing (AM) for anomaly (defects) detection. However, since abnormal states (with defects) occur much less frequently than normal ones (without defects) in the manufacturing process, the number of sensor data samples collected from a normal state outweighs that from an abnormal state. This issue causes imbalanced training data for classification models, thus deteriorating the performance of detecting abnormal states in the process. It is beneficial to generate effective artificial sample data for the abnormal states to make a more balanced training set. To achieve this goal, this paper proposes a novel data augmentation method based on a generative adversarial network (GAN) using additive manufacturing process image sensor data. The novelty of our approach is that a standard GAN and classifier are jointly optimized with techniques to stabilize the learning process of standard GAN. The diverse and high-quality generated samples provide balanced training data to the classifier. The iterative optimization between GAN and classifier provides the high-performance classifier. The effectiveness of the proposed method is validated by both open-source data and real-world case studies in polymer and metal AM processes.
translated by 谷歌翻译
In biomedical image analysis, the applicability of deep learning methods is directly impacted by the quantity of image data available. This is due to deep learning models requiring large image datasets to provide high-level performance. Generative Adversarial Networks (GANs) have been widely utilized to address data limitations through the generation of synthetic biomedical images. GANs consist of two models. The generator, a model that learns how to produce synthetic images based on the feedback it receives. The discriminator, a model that classifies an image as synthetic or real and provides feedback to the generator. Throughout the training process, a GAN can experience several technical challenges that impede the generation of suitable synthetic imagery. First, the mode collapse problem whereby the generator either produces an identical image or produces a uniform image from distinct input features. Second, the non-convergence problem whereby the gradient descent optimizer fails to reach a Nash equilibrium. Thirdly, the vanishing gradient problem whereby unstable training behavior occurs due to the discriminator achieving optimal classification performance resulting in no meaningful feedback being provided to the generator. These problems result in the production of synthetic imagery that is blurry, unrealistic, and less diverse. To date, there has been no survey article outlining the impact of these technical challenges in the context of the biomedical imagery domain. This work presents a review and taxonomy based on solutions to the training problems of GANs in the biomedical imaging domain. This survey highlights important challenges and outlines future research directions about the training of GANs in the domain of biomedical imagery.
translated by 谷歌翻译
最近,诸如Interovae和S-Introvae之类的内省模型在图像生成和重建任务方面表现出色。内省模型的主要特征是对VAE的对抗性学习,编码器试图区分真实和假(即合成)图像。但是,由于有效度量标准无法评估真实图像和假图像之间的差异,因此后塌陷和消失的梯度问题仍然存在,从而降低了合成图像的保真度。在本文中,我们提出了一种称为对抗性相似性距离内省变化自动编码器(AS-Introvae)的新变体。我们理论上分析了消失的梯度问题,并使用2-Wasserstein距离和内核技巧构建了新的对抗相似性距离(AS-cantance)。随着重量退火,AS-Introvae能够产生稳定和高质量的图像。通过每批次尝试转换图像,以使其更好地适合潜在空间中的先前分布,从而解决了后塌陷问题。与每个图像方法相比,该策略促进了潜在空间中更多样化的分布,从而使我们的模型能够产生巨大的多样性图像。基准数据集的全面实验证明了AS-Introvae对图像生成和重建任务的有效性。
translated by 谷歌翻译
从文本描述中综合现实图像是计算机视觉中的主要挑战。当前对图像合成方法的文本缺乏产生代表文本描述符的高分辨率图像。大多数现有的研究都依赖于生成的对抗网络(GAN)或变异自动编码器(VAE)。甘斯具有产生更清晰的图像的能力,但缺乏输出的多样性,而VAE擅长生产各种输出,但是产生的图像通常是模糊的。考虑到gan和vaes的相对优势,我们提出了一个新的有条件VAE(CVAE)和条件gan(CGAN)网络架构,用于合成以文本描述为条件的图像。这项研究使用条件VAE作为初始发电机来生成文本描述符的高级草图。这款来自第一阶段的高级草图输出和文本描述符被用作条件GAN网络的输入。第二阶段GAN产生256x256高分辨率图像。所提出的体系结构受益于条件加强和有条件的GAN网络的残留块,以实现结果。使用CUB和Oxford-102数据集进行了多个实验,并将所提出方法的结果与Stackgan等最新技术进行了比较。实验表明,所提出的方法生成了以文本描述为条件的高分辨率图像,并使用两个数据集基于Inception和Frechet Inception评分产生竞争结果
translated by 谷歌翻译
基于深度学习的计算机辅助诊断(CAD)已成为医疗行业的重要诊断技术,有效提高诊断精度。然而,脑肿瘤磁共振(MR)图像数据集的稀缺性导致深度学习算法的低性能。传统数据增强(DA)生成的转换图像的分布本质上类似于原始的图像,从而在泛化能力方面产生有限的性能。这项工作提高了具有结构相似性损失功能(PGGAN-SSIM)的GAN的逐步生长,以解决图像模糊问题和模型崩溃。我们还探讨了其他基于GAN的数据增强,以证明所提出的模型的有效性。我们的结果表明,PGGAN-SSIM成功地生成了256x256的现实脑肿瘤MR图像,填充了原始数据集未发现的真实图像分布。此外,PGGAN-SSSIM超过了其他基于GAN的方法,实现了FRECHET成立距离(FID)和多尺度结构相似性(MS-SSIM)的有希望的性能提升。
translated by 谷歌翻译
物联网技术的开发使各种传感器可以集成到移动设备中。基于传感器数据的人类活动识别(HAR)已成为机器学习和无处不在计算领域的积极研究主题。但是,由于人类活动的频率不一致,人类活动数据集中的每个活动的数据量都会失衡。考虑到有限的传感器资源和手动标记的传感器数据的高成本,人类活动识别面临着高度不平衡的活动数据集的挑战。在本文中,我们建议平衡传感器数据生成的对抗网络(BSDGAN),以生成少数人类活动的传感器数据。所提出的BSDGAN由生成器模型和鉴别模型组成。考虑到人类活动数据集的极端失衡,使用自动编码器来初始化BSDGAN的训练过程,并确保可以学习每个活动的数据特征。生成的活动数据与原始数据集结合在一起,以平衡人类活动类别的活动数据量。我们在两个公开可用的人类活动数据集WISDM和UNIMIB上部署了多个人类活动识别模型。实验结果表明,提出的BSDGAN可以有效地捕获真实人类活动传感器数据的数据特征,并生成逼真的合成传感器数据。同时,平衡的活动数据集可以有效地帮助活动识别模型提高识别精度。
translated by 谷歌翻译
生成对抗网络(GAN)是最受欢迎的图像生成模型,在各种计算机视觉任务上取得了显着进度。但是,训练不稳定仍然是所有基于GAN的算法的开放问题之一。已经提出了许多方法来稳定gan的训练,其重点分别放在损失功能,正则化和归一化技术,训练算法和模型体系结构上。与上述方法不同,在本文中,提出了有关稳定gan训练的新观点。发现有时发电机产生的图像在训练过程中像歧视者的对抗示例一样,这可能是导致gan不稳定训练的原因的一部分。有了这一发现,我们提出了直接的对抗训练(DAT)方法来稳定gan的训练过程。此外,我们证明DAT方法能够适应歧视器的Lipschitz常数。 DAT的高级性能在多个损失功能,网络体系结构,超参数和数据集上进行了验证。具体而言,基于SSGAN的CIFAR-100无条件生成,DAT在CIFAR-100的无条件生成上实现了11.5%的FID,基于SSGAN的STL-10无条件生成的FID和基于SSGAN的LSUN卧室无条件生成的13.2%FID。代码将在https://github.com/iceli1007/dat-gan上找到
translated by 谷歌翻译
随着脑成像技术和机器学习工具的出现,很多努力都致力于构建计算模型来捕获人脑中的视觉信息的编码。最具挑战性的大脑解码任务之一是通过功能磁共振成像(FMRI)测量的脑活动的感知自然图像的精确重建。在这项工作中,我们调查了来自FMRI的自然图像重建的最新学习方法。我们在架构设计,基准数据集和评估指标方面检查这些方法,并在标准化评估指标上呈现公平的性能评估。最后,我们讨论了现有研究的优势和局限,并提出了潜在的未来方向。
translated by 谷歌翻译
Many image-to-image translation problems are ambiguous, as a single input image may correspond to multiple possible outputs. In this work, we aim to model a distribution of possible outputs in a conditional generative modeling setting. The ambiguity of the mapping is distilled in a low-dimensional latent vector, which can be randomly sampled at test time. A generator learns to map the given input, combined with this latent code, to the output. We explicitly encourage the connection between output and the latent code to be invertible. This helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse, and produces more diverse results. We explore several variants of this approach by employing different training objectives, network architectures, and methods of injecting the latent code. Our proposed method encourages bijective consistency between the latent encoding and output modes. We present a systematic comparison of our method and other variants on both perceptual realism and diversity.
translated by 谷歌翻译
虽然生成的对抗网络(GaN)是他们对其更高的样本质量的流行,而与其他生成模型相反,但是它们遭受同样困难的产生样本的难度。必须牢记各个方面,如产生的样本的质量,课程的多样性(在课堂内和类别中),使用解除戒开的潜在空间,所述评估度量的协议与人类感知等。本文,我们提出了一个新的评分,即GM分数,这取得了各种因素,如样品质量,解除戒备的代表,阶级,级别的阶级和级别多样性等各种因素,以及诸如精确,召回和F1分数等其他指标用于可怜的性深度信仰网络(DBN)和限制Boltzmann机(RBM)的潜在空间。评估是针对不同的GANS(GAN,DCGAN,BIGAN,CGAN,CONFORDGON,LSGAN,SGAN,WAN,以及WGAN改进)的不同GANS(GAN,DCGAN,BIGAN,SCAN,WANT)在基准MNIST数据集上培训。
translated by 谷歌翻译
我们研究了GaN调理问题,其目标是使用标记数据将普雷雷尼的无条件GaN转换为条件GaN。我们首先识别并分析这一问题的三种方法 - 从头开始​​,微调和输入重新编程的条件GaN培训。我们的分析表明,当标记数据的数量很小时,输入重新编程执行最佳。通过稀缺标记数据的现实世界情景,我们专注于输入重编程方法,并仔细分析现有算法。在识别出先前输入重新编程方法的一些关键问题之后,我们提出了一种名为INREP +的新算法。我们的算法INREP +解决了现有问题,具有可逆性神经网络的新颖用途和正面未标记(PU)学习。通过广泛的实验,我们表明Inrep +优于所有现有方法,特别是当标签信息稀缺,嘈杂和/或不平衡时。例如,对于用1%标记数据调节CiFar10 GaN的任务,Inrep +实现了82.13的平均峰值,而第二个最佳方法达到114.51。
translated by 谷歌翻译
近年来,拥抱集群研究中的表演学习的深度学习技术引起了广泛的关注,产生了一个新开发的聚类范式,QZ。深度聚类(DC)。通常,DC型号大写AutoEncoders,以了解促进聚类过程的内在特征。如今,一个名为变变AualEncoder(VAE)的生成模型在DC研究中得到了广泛的认可。然而,平原VAE不足以察觉到综合潜在特征,导致细分性能恶化。本文提出了一种新的DC方法来解决这个问题。具体地,生成的逆势网络和VAE被聚结成了一种名为Fusion AutoEncoder(FAE)的新的AutoEncoder,以辨别出更多的辨别性表示,从而使下游聚类任务受益。此外,FAE通过深度剩余网络架构实施,进一步提高了表示学习能力。最后,将FAE的潜在空间转变为由深密神经网络的嵌入空间,用于彼此从彼此拉出不同的簇,并将数据点折叠在单个簇内。在几个图像数据集上进行的实验证明了所提出的DC模型对基线方法的有效性。
translated by 谷歌翻译
The success of Deep Learning applications critically depends on the quality and scale of the underlying training data. Generative adversarial networks (GANs) can generate arbitrary large datasets, but diversity and fidelity are limited, which has recently been addressed by denoising diffusion probabilistic models (DDPMs) whose superiority has been demonstrated on natural images. In this study, we propose Medfusion, a conditional latent DDPM for medical images. We compare our DDPM-based model against GAN-based models, which constitute the current state-of-the-art in the medical domain. Medfusion was trained and compared with (i) StyleGan-3 on n=101,442 images from the AIROGS challenge dataset to generate fundoscopies with and without glaucoma, (ii) ProGAN on n=191,027 from the CheXpert dataset to generate radiographs with and without cardiomegaly and (iii) wGAN on n=19,557 images from the CRCMS dataset to generate histopathological images with and without microsatellite stability. In the AIROGS, CRMCS, and CheXpert datasets, Medfusion achieved lower (=better) FID than the GANs (11.63 versus 20.43, 30.03 versus 49.26, and 17.28 versus 84.31). Also, fidelity (precision) and diversity (recall) were higher (=better) for Medfusion in all three datasets. Our study shows that DDPM are a superior alternative to GANs for image synthesis in the medical domain.
translated by 谷歌翻译
与CNN的分类,分割或对象检测相比,生成网络的目标和方法根本不同。最初,它们不是作为图像分析工具,而是生成自然看起来的图像。已经提出了对抗性训练范式来稳定生成方法,并已被证明是非常成功的 - 尽管绝不是第一次尝试。本章对生成对抗网络(GAN)的动机进行了基本介绍,并通​​过抽象基本任务和工作机制并得出了早期实用方法的困难来追溯其成功的道路。将显示进行更稳定的训练方法,也将显示出不良收敛及其原因的典型迹象。尽管本章侧重于用于图像生成和图像分析的gan,但对抗性训练范式本身并非特定于图像,并且在图像分析中也概括了任务。在将GAN与最近进入场景的进一步生成建模方法进行对比之前,将闻名图像语义分割和异常检测的架构示例。这将允许对限制的上下文化观点,但也可以对gans有好处。
translated by 谷歌翻译
生成对抗网络(GAN)具有许多潜在的医学成像应用,包括数据扩展,域适应和模型解释。由于图形处理单元(GPU)的记忆力有限,因此在低分辨率的医学图像上对当前的3D GAN模型进行了训练,因此这些模型要么无法扩展到高分辨率,要么容易出现斑驳的人工制品。在这项工作中,我们提出了一种新颖的端到端GAN体系结构,可以生成高分辨率3D图像。我们通过使用训练和推理之间的不同配置来实现这一目标。在训练过程中,我们采用了层次结构,该结构同时生成图像的低分辨率版本和高分辨率图像的随机选择子量。层次设计具有两个优点:首先,对高分辨率图像训练的记忆需求在子量之间摊销。此外,将高分辨率子体积固定在单个低分辨率图像上可确保子量化之间的解剖一致性。在推断期间,我们的模型可以直接生成完整的高分辨率图像。我们还将具有类似层次结构的编码器纳入模型中,以从图像中提取特征。 3D胸CT和脑MRI的实验表明,我们的方法在图像生成中的表现优于最新技术。我们还证明了所提出的模型在数据增强和临床相关特征提取中的临床应用。
translated by 谷歌翻译
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
translated by 谷歌翻译
In recent years, applying deep learning (DL) to assess structural damages has gained growing popularity in vision-based structural health monitoring (SHM). However, both data deficiency and class-imbalance hinder the wide adoption of DL in practical applications of SHM. Common mitigation strategies include transfer learning, over-sampling, and under-sampling, yet these ad-hoc methods only provide limited performance boost that varies from one case to another. In this work, we introduce one variant of the Generative Adversarial Network (GAN), named the balanced semi-supervised GAN (BSS-GAN). It adopts the semi-supervised learning concept and applies balanced-batch sampling in training to resolve low-data and imbalanced-class problems. A series of computer experiments on concrete cracking and spalling classification were conducted under the low-data imbalanced-class regime with limited computing power. The results show that the BSS-GAN is able to achieve better damage detection in terms of recall and $F_\beta$ score than other conventional methods, indicating its state-of-the-art performance.
translated by 谷歌翻译
在本文中,我们探讨了基于GAN的少量数据增强用作改善少量分类性能的方法。我们对如何对这样的任务进行微调(其中一项是以课堂开采方式)进行微调的探索,以及对这些模型如何在改善几次分类的情况下进行严格的经验研究。我们确定了与纯粹有监督的制度训练此类生成模型的困难有关的问题,几乎没有例子,以及有关现有作品的评估协议的问题。我们还发现,在这种制度中,分类精度对数据集的类别随机分配方式高度敏感。因此,我们提出了一种半监督的微调方法,作为解决这些问题的更务实的方向。
translated by 谷歌翻译