Privacy of machine learning models is one of the remaining challenges that hinder the broad adoption of Artificial Intelligent (AI). This paper considers this problem in the context of image datasets containing faces. Anonymization of such datasets is becoming increasingly important due to their central role in the training of autonomous cars, for example, and the vast amount of data generated by surveillance systems. While most prior work de-identifies facial images by modifying identity features in pixel space, we instead project the image onto the latent space of a Generative Adversarial Network (GAN) model, find the features that provide the biggest identity disentanglement, and then manipulate these features in latent space, pixel space, or both. The main contribution of the paper is the design of a feature-preserving anonymization framework, StyleID, which protects the individuals' identity, while preserving as many characteristics of the original faces in the image dataset as possible. As part of the contribution, we present a novel disentanglement metric, three complementing disentanglement methods, and new insights into identity disentanglement. StyleID provides tunable privacy, has low computational complexity, and is shown to outperform current state-of-the-art solutions.
translated by 谷歌翻译
Our goal with this survey is to provide an overview of the state of the art deep learning technologies for face generation and editing. We will cover popular latest architectures and discuss key ideas that make them work, such as inversion, latent representation, loss functions, training procedures, editing methods, and cross domain style transfer. We particularly focus on GAN-based architectures that have culminated in the StyleGAN approaches, which allow generation of high-quality face images and offer rich interfaces for controllable semantics editing and preserving photo quality. We aim to provide an entry point into the field for readers that have basic knowledge about the field of deep learning and are looking for an accessible introduction and overview.
translated by 谷歌翻译
Several face de-identification methods have been proposed to preserve users' privacy by obscuring their faces. These methods, however, can degrade the quality of photos, and they usually do not preserve the utility of faces, e.g., their age, gender, pose, and facial expression. Recently, advanced generative adversarial network models, such as StyleGAN, have been proposed, which generate realistic, high-quality imaginary faces. In this paper, we investigate the use of StyleGAN in generating de-identified faces through style mixing, where the styles or features of the target face and an auxiliary face get mixed to generate a de-identified face that carries the utilities of the target face. We examined this de-identification method with respect to preserving utility and privacy, by implementing several face detection, verification, and identification attacks. Through extensive experiments and also comparing with two state-of-the-art face de-identification methods, we show that StyleGAN preserves the quality and utility of the faces much better than the other approaches and also by choosing the style mixing levels correctly, it can preserve the privacy of the faces much better than other methods.
translated by 谷歌翻译
在本文中,我们解决了神经面部重演的问题,鉴于一对源和目标面部图像,我们需要通过将目标的姿势(定义为头部姿势及其面部表情定义)通过同时保留源的身份特征(例如面部形状,发型等),即使在源头和目标面属于不同身份的挑战性情况下也是如此。在此过程中,我们解决了最先进作品的一些局限在推理期间标记的数据以及c)它们不保留大型头部姿势变化中的身份。更具体地说,我们提出了一个框架,该框架使用未配对的随机生成的面部图像学会通过合并最近引入的样式空间$ \ Mathcal $ \ Mathcal {S} $ of Stylegan2的姿势,以将面部的身份特征从其姿势中解脱出来表现出显着的分解特性。通过利用这一点,我们学会使用3D模型的监督成功地混合了一对源和目标样式代码。随后用于重新制定的最终潜在代码由仅与源的面部姿势相对应的潜在单位和仅与源身份相对应的单位组成,从而显着改善了与最近的状态性能相比的重新制定性能。艺术方法。与艺术的状态相比,我们定量和定性地表明,即使在极端的姿势变化下,提出的方法也会产生更高的质量结果。最后,我们通过首先将它们嵌入预告片发电机的潜在空间来报告实际图像。我们在:https://github.com/stelabou/stylemask上公开提供代码和预估计的模型
translated by 谷歌翻译
生成模型的面部匿名化已经变得越来越普遍,因为它们通过生成虚拟面部图像来消毒私人信息,从而确保隐私和图像实用程序。在删除或保护原始身份后,通常无法识别此类虚拟面部图像。在本文中,我们将生成可识别的虚拟面部图像的问题形式化和解决。我们的虚拟脸部图像在视觉上与原始图像不同,以保护隐私保护。此外,它们具有新的虚拟身份,可直接用于面部识别。我们建议可识别的虚拟面部发电机(IVFG)生成虚拟面部图像。 IVFG根据用户特定的键将原始面部图像的潜在矢量投射到虚拟图像中,该键基于该图像生成虚拟面部图像。为了使虚拟面部图像可识别,我们提出了一个多任务学习目标以及一个三联生的培训策略,以学习IVFG。我们使用不同面部图像数据集上的不同面部识别器评估虚拟面部图像的性能,所有这些都证明了IVFG在生成可识别的虚拟面部图像中的有效性。
translated by 谷歌翻译
Although Generative Adversarial Networks (GANs) have made significant progress in face synthesis, there lacks enough understanding of what GANs have learned in the latent representation to map a random code to a photo-realistic image. In this work, we propose a framework called InterFaceGAN to interpret the disentangled face representation learned by the state-of-the-art GAN models and study the properties of the facial semantics encoded in the latent space. We first find that GANs learn various semantics in some linear subspaces of the latent space. After identifying these subspaces, we can realistically manipulate the corresponding facial attributes without retraining the model. We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection, resulting in more precise control of the attribute manipulation. Besides manipulating the gender, age, expression, and presence of eyeglasses, we can even alter the face pose and fix the artifacts accidentally made by GANs. Furthermore, we perform an in-depth face identity analysis and a layer-wise analysis to evaluate the editing results quantitatively. Finally, we apply our approach to real face editing by employing GAN inversion approaches and explicitly training feed-forward models based on the synthetic data established by InterFaceGAN. Extensive experimental results suggest that learning to synthesize faces spontaneously brings a disentangled and controllable face representation.
translated by 谷歌翻译
Recent 3D-aware GANs rely on volumetric rendering techniques to disentangle the pose and appearance of objects, de facto generating entire 3D volumes rather than single-view 2D images from a latent code. Complex image editing tasks can be performed in standard 2D-based GANs (e.g., StyleGAN models) as manipulation of latent dimensions. However, to the best of our knowledge, similar properties have only been partially explored for 3D-aware GAN models. This work aims to fill this gap by showing the limitations of existing methods and proposing LatentSwap3D, a model-agnostic approach designed to enable attribute editing in the latent space of pre-trained 3D-aware GANs. We first identify the most relevant dimensions in the latent space of the model controlling the targeted attribute by relying on the feature importance ranking of a random forest classifier. Then, to apply the transformation, we swap the top-K most relevant latent dimensions of the image being edited with an image exhibiting the desired attribute. Despite its simplicity, LatentSwap3D provides remarkable semantic edits in a disentangled manner and outperforms alternative approaches both qualitatively and quantitatively. We demonstrate our semantic edit approach on various 3D-aware generative models such as pi-GAN, GIRAFFE, StyleSDF, MVCGAN, EG3D and VolumeGAN, and on diverse datasets, such as FFHQ, AFHQ, Cats, MetFaces, and CompCars. The project page can be found: \url{https://enisimsar.github.io/latentswap3d/}.
translated by 谷歌翻译
现在,使用最近的生成对抗网络(GAN)可以使用高现实主义的不受约束图像产生。但是,用给定的一组属性生成图像非常具有挑战性。最近的方法使用基于样式的GAN模型来执行图像编辑,通过利用发电机层中存在的语义层次结构。我们提出了一些基于潜在的属性操纵和编辑(火焰),这是一个简单而有效的框架,可通过潜在空间操纵执行高度控制的图像编辑。具体而言,我们估计了控制生成图像中语义属性的潜在空间(预训练样式的)中的线性方向。与以前的方法相反,这些方法依赖于大规模属性标记的数据集或属性分类器,而火焰则使用一些策划的图像对的最小监督来估算删除的编辑指示。火焰可以在保留身份的同时,在各种图像集上同时进行高精度和顺序编辑。此外,我们提出了一项新颖的属性样式操纵任务,以生成各种样式的眼镜和头发等属性。我们首先编码相同身份的一组合成图像,但在潜在空间中具有不同的属性样式,以估计属性样式歧管。从该歧管中采样新的潜在将导致生成图像中的新属性样式。我们提出了一种新颖的抽样方法,以从歧管中采样潜在的样品,使我们能够生成各种属性样式,而不是训练集中存在的样式。火焰可以以分离的方式生成多种属性样式。我们通过广泛的定性和定量比较来说明火焰与先前的图像编辑方法相对于先前的图像编辑方法的卓越性能。火焰在多个数据集(例如汽车和教堂)上也很好地概括了。
translated by 谷歌翻译
通过大规模数据实现具有面部识别的高度安全的应用程序(如边境交叉路)需要广泛的生物识别性能测试。然而,使用真实面部图像引起了对隐私的担忧,因为法律不允许图像用于其他目的而不是最初的目的。使用代表和面部数据的子集还可以导致不需要的人口统计偏见并导致数据集不平衡。克服这些问题的一种可能解决方案是用综合生成的样本替换真实的面部图像。在生成合成图像的同时,从计算机视觉中的最新进步中受益,虽然有利于电脑视觉的最新进步,但在类似实际变化的同一合成标识的多个样本中仍然是不合适的,即交配样本。这项工作提出了一种通过利用样式牢固的潜在空间来生成配合的面部图像的非确定性方法。通过操纵潜伏的矢量来产生交配的样本,更精确地,我们利用主成分分析(PCA)来定义潜在空间中的语义有意义的方向,并使用预先训练的面部识别系统控制原始样本和配合样本之间的相似性。我们创建了由77,034个样本组成的合成面图像(Symface)的新数据集,包括25,919个合成ID。通过我们的分析,使用良好的面部图像质量指标,我们展示了模仿真实生物识别数据的特征的合成样本的生物识别质量的差异。其分析和结果表明使用使用所提出的方法创建的合成样本作为更换真实生物识别数据的可行替代品。
translated by 谷歌翻译
使用社交媒体网站和应用程序已经变得非常受欢迎,人们在这些网络上分享他们的照片。在这些网络上自动识别和标记人们的照片已经提出了隐私保存问题,用户寻求隐藏这些算法的方法。生成的对抗网络(GANS)被证明是非常强大的在高多样性中产生面部图像以及编辑面部图像。在本文中,我们提出了一种基于GAN的生成掩模引导的面部图像操纵(GMFIM)模型,以将无法察觉的编辑应用于输入面部图像以保护图像中的人的隐私。我们的模型由三个主要组件组成:a)面罩模块将面积从输入图像中切断并省略背景,b)用于操纵面部图像并隐藏身份的GaN的优化模块,并覆盖身份和c)用于组合输入图像的背景和操纵的去识别的面部图像的合并模块。在优化步骤的丢失功能中考虑了不同的标准,以产生与输入图像一样类似的高质量图像,同时不能通过AFR系统识别。不同数据集的实验结果表明,与最先进的方法相比,我们的模型可以实现对自动面部识别系统的更好的性能,并且它在大多数实验中捕获更高的攻击成功率。此外,我们提出的模型的产生图像具有最高的质量,更令人愉悦。
translated by 谷歌翻译
深度神经网络在人类分析中已经普遍存在,增强了应用的性能,例如生物识别识别,动作识别以及人重新识别。但是,此类网络的性能通过可用的培训数据缩放。在人类分析中,对大规模数据集的需求构成了严重的挑战,因为数据收集乏味,廉价,昂贵,并且必须遵守数据保护法。当前的研究研究了\ textit {合成数据}的生成,作为在现场收集真实数据的有效且具有隐私性的替代方案。这项调查介绍了基本定义和方法,在生成和采用合成数据进行人类分析时必不可少。我们进行了一项调查,总结了当前的最新方法以及使用合成数据的主要好处。我们还提供了公开可用的合成数据集和生成模型的概述。最后,我们讨论了该领域的局限性以及开放研究问题。这项调查旨在为人类分析领域的研究人员和从业人员提供。
translated by 谷歌翻译
面部合成的进步已经提出了关于合成面的欺骗性使用的警报。合成综合性可以有效地用于欺骗人类观察者吗?在本文中,我们介绍了使用不同策略产生的合成面的人类感知的研究,包括基于最先进的深学的GaN模型。这是第一次严格研究从心理学的实验技术接地的合成面代发电技术的有效性研究。我们回答了重要的问题,如GaN的频率和更传统的图像处理的技术混淆人类观察者,并且在综合性脸部图像中有细微的线索,导致人类将其视为假冒,而无需寻找明显的线索还为了回答这些问题,我们进行了一系列大规模众群行为实验,具有不同的面膜。结果表明,人类无法在几个不同的情况下区分真实面的合成面。这一发现对面部图像呈现给人类用户的许多不同应用具有严重影响。
translated by 谷歌翻译
Stone" "Mohawk hairstyle" "Without makeup" "Cute cat" "Lion" "Gothic church" * Equal contribution, ordered alphabetically. Code and video are available on https://github.com/orpatashnik/StyleCLIP
translated by 谷歌翻译
由于其语义上的理解和用户友好的可控性,通过三维引导,通过三维引导的面部图像操纵已广泛应用于各种交互式场景。然而,现有的基于3D形式模型的操作方法不可直接适用于域名面,例如非黑色素化绘画,卡通肖像,甚至是动物,主要是由于构建每个模型的强大困难具体面部域。为了克服这一挑战,据我们所知,我们建议使用人为3DMM操纵任意域名的第一种方法。这是通过两个主要步骤实现的:1)从3DMM参数解开映射到潜在的STYLEGO2的潜在空间嵌入,可确保每个语义属性的解除响应和精确的控制; 2)通过实施一致的潜空间嵌入,桥接域差异并使人类3DMM适用于域外面的人类3DMM。实验和比较展示了我们高质量的语义操作方法在各种面部域中的优越性,所有主要3D面部属性可控姿势,表达,形状,反照镜和照明。此外,我们开发了直观的编辑界面,以支持用户友好的控制和即时反馈。我们的项目页面是https://cassiepython.github.io/cddfm3d/index.html
translated by 谷歌翻译
鉴于其广泛的应用,已经对人面部交换的任务进行了许多尝试。尽管现有的方法主要依赖于乏味的网络和损失设计,但它们仍然在源和目标面之间的信息平衡中挣扎,并倾向于产生可见的人工制品。在这项工作中,我们引入了一个名为StylesWap的简洁有效的框架。我们的核心想法是利用基于样式的生成器来增强高保真性和稳健的面部交换,因此可以采用发电机的优势来优化身份相似性。我们仅通过最小的修改来确定,StyleGAN2体系结构可以成功地处理来自源和目标的所需信息。此外,受到TORGB层的启发,进一步设计了交换驱动的面具分支以改善信息的融合。此外,可以采用stylegan倒置的优势。特别是,提出了交换引导的ID反转策略来优化身份相似性。广泛的实验验证了我们的框架会产生高质量的面部交换结果,从而超过了最先进的方法,既有定性和定量。
translated by 谷歌翻译
Figure 1: Manipulating various facial attributes through varying the latent codes of a well-trained GAN model. The first column shows the original synthesis from PGGAN [21], while each of the other columns shows the results of manipulating a specific attribute.
translated by 谷歌翻译
We explore and analyze the latent style space of Style-GAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and disentangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
translated by 谷歌翻译
We introduce a new method for diverse foreground generation with explicit control over various factors. Existing image inpainting based foreground generation methods often struggle to generate diverse results and rarely allow users to explicitly control specific factors of variation (e.g., varying the facial identity or expression for face inpainting results). We leverage contrastive learning with latent codes to generate diverse foreground results for the same masked input. Specifically, we define two sets of latent codes, where one controls a pre-defined factor (``known''), and the other controls the remaining factors (``unknown''). The sampled latent codes from the two sets jointly bi-modulate the convolution kernels to guide the generator to synthesize diverse results. Experiments demonstrate the superiority of our method over state-of-the-arts in result diversity and generation controllability.
translated by 谷歌翻译
The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here, are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and recent solutions to mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is also provided. Finally, to help advance the field further, a discussion on the main open challenges and future research direction is given.
translated by 谷歌翻译