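Learning powerful feature representations from large-scale noisy faces is one of the key challenges in high-performance face recognition. Recent work has addressed this challenge by alleviating intra-class conflict and inter-class conflict. However, the unconstrained noise type within each conflict still makes it difficult for these algorithms to perform well. To better understand this, we reformulate the noise type of each class in a more fine-grained manner as N-identities|K^C-clusters. Different types of noisy faces can be generated by adjusting the values of N, K, and C. Based on this unified formulation, we find that the main barrier to noise-robust representation learning is the flexibility of the algorithm under different N, K, and C. To address this underlying problem, we propose a new method, named Evolving Sub-centers Learning (ESL), which finds optimal hyperplanes that accurately describe the latent space of massive noisy faces. More specifically, we initialize M sub-centers for each class, and ESL encourages them to align automatically with the N-identities|K^C-clusters faces through producing, merging, and dropping operations. Images in the noisy faces that belong to the same identity can effectively converge to the same sub-center, and samples with different identities are pushed apart. We examine its effectiveness through an elaborate ablation study on synthetic noisy datasets with different N, K, and C.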
translated by 谷歌翻译
Recently, a popular line of research in face recognition is adopting margins in the well-established softmax loss function to maximize class separability. In this paper, we first introduce an Additive Angular Margin Loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances the discriminative power. Since ArcFace is susceptible to the massive label noise, we further propose sub-center ArcFace, in which each class contains K sub-centers and training samples only need to be close to any of the K positive sub-centers. Sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Based on this self-propelled isolation, we boost the performance through automatically purifying raw web faces under massive real-world noise. Besides discriminative feature embedding, we also explore the inverse problem, mapping feature vectors to face images. Without training any additional generator or discriminator, the pre-trained ArcFace model can generate identity-preserved face images for both subjects inside and outside the training data only by using the network gradient and Batch Normalization (BN) priors. Extensive experiments demonstrate that ArcFace can enhance the discriminative feature embedding as well as strengthen the generative face synthesis.
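The margin and sub-center ideas in the abstract above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names, the default `margin` and `scale` values, and the toy data layout are assumptions, and the handling of angles where θ + m exceeds π is omitted.

```python
import math

def l2_normalize(v):
    # Project a vector onto the unit hypersphere.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))

def subcenter_arcface_logits(feature, class_subcenters, label, margin=0.5, scale=64.0):
    """class_subcenters[c] holds the K sub-center vectors of class c (hypothetical layout)."""
    logits = []
    for c, subcenters in enumerate(class_subcenters):
        # Sub-center pooling: the sample only needs to be close to its nearest sub-center.
        cos_theta = max(cosine(feature, w) for w in subcenters)
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if c == label:
            # Additive angular margin on the ground-truth class: cos(theta + m).
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits

def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for one sample.
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits)) - logits[label]
```

With `margin=0.0` and a single sub-center per class, the sketch reduces to a normalized softmax, which makes it easy to see that the margin only lowers the ground-truth logit and therefore raises the loss for the same embedding.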
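Softmax-based loss functions and their variants (e.g., CosFace, SphereFace, and ArcFace) significantly improve face recognition performance in unconstrained, in-the-wild scenarios. A common practice in these algorithms is to optimize the product of the embedding features and a linear transformation matrix. In most cases, however, the dimension of the embedding features is chosen from conventional design experience, and improving performance by exploiting the feature itself at a fixed dimension has received little study. To address this challenge, this paper proposes a softmax approximation method called SubFace, which employs subspace features to improve face recognition performance. Specifically, we dynamically select non-overlapping subspace features in each batch during training and then use these subspace features to approximate the full feature within softmax-based losses, so that the discriminability of the deep model is significantly enhanced for face recognition. Comprehensive experiments on benchmark datasets show that our method significantly improves the performance of a vanilla CNN baseline, which strongly supports the effectiveness of the subspace strategy with margin-based losses.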
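Data cleaning, architecture, and loss function design are important factors in achieving high-performance face recognition. Previously, the research community tried to improve each of these aspects individually but did not present a unified solution that jointly searches for the optimal design of all three. In this paper, we identify for the first time that these aspects are tightly coupled. Optimizing the design of each aspect in isolation in fact greatly limits performance and biases the algorithm design. Specifically, we find that the optimal model architecture or loss function is closely coupled with data cleaning. To eliminate the bias of single-aspect research and provide an overall understanding of face recognition model design, we first carefully design a search space for each aspect and then introduce a comprehensive search method that jointly searches for the optimal data cleaning, architecture, and loss function design. In our framework, we make the proposed comprehensive search as flexible as possible by using an innovative reinforcement-learning-based approach. Extensive experiments on million-scale face recognition benchmarks demonstrate the effectiveness of our newly designed search space for each aspect and of the comprehensive search. We outperform the expert algorithms developed for each individual research track. More importantly, we analyze the differences between our searched optimal design and the independent designs of the individual factors. We point out that strong models tend to be optimized with more difficult training datasets and loss functions. Our empirical study can provide guidance for future research toward more robust face recognition systems.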
Recent years have witnessed breakthroughs in face recognition (FR) with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them have been applied in industry and play an important role in everyday life, for example in device unlocking and mobile payment. This paper provides an introduction to face recognition, including its history, pipeline, algorithms based on conventional manually designed features or deep learning, mainstream training and evaluation datasets, and related applications. We have analyzed and compared as many state-of-the-art works as possible, and have also carefully designed a set of experiments to study the effect of backbone size and data distribution. This survey serves as material for the tutorial The Practical Face Recognition Technology in the Industrial World at FG2023.
Face recognition has made extraordinary progress owing to the advancement of deep convolutional neural networks (CNNs). The central task of face recognition, including face verification and identification, involves face feature discrimination. However, the traditional softmax loss of deep CNNs usually lacks the power of discrimination. To address this problem, recently several loss functions such as center loss, large margin softmax loss, and angular softmax loss have been proposed. All these improved losses share the same idea: maximizing inter-class variance and minimizing intra-class variance. In this paper, we propose a novel loss function, namely large margin cosine loss (LMCL), to realize this idea from a different perspective. More specifically, we reformulate the softmax loss as a cosine loss by L2-normalizing both features and weight vectors to remove radial variations, based on which a cosine margin term is introduced to further maximize the decision margin in the angular space. As a result, minimum intra-class variance and maximum inter-class variance are achieved by virtue of normalization and cosine decision margin maximization. We refer to our model trained with LMCL as CosFace. Extensive experimental evaluations are conducted on the most popular public-domain face recognition datasets such as MegaFace Challenge, Youtube Faces (YTF) and Labeled Face in the Wild (LFW). We achieve the state-of-the-art performance on these benchmarks, which confirms the effectiveness of our proposed approach.
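As a rough illustration of the cosine-margin idea described above, the following sketch computes a per-sample loss in which features and class weights are L2-normalized and a margin is subtracted from the ground-truth cosine. The function name `lmcl_loss` and the default `margin` and `scale` values are assumptions chosen for the example, not values taken from the abstract.

```python
import math

def lmcl_loss(feature, weights, label, margin=0.35, scale=30.0):
    """Cosine-margin softmax loss for one sample (illustrative sketch)."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    f = unit(feature)  # normalizing removes the radial (norm) variation
    logits = []
    for j, w in enumerate(weights):
        cos_j = sum(a * b for a, b in zip(f, unit(w)))
        if j == label:
            cos_j -= margin  # cosine margin applied to the true class only
        logits.append(scale * cos_j)
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits)) - logits[label]
```

Because both vectors are normalized, rescaling the input feature leaves the loss unchanged; only the angle between the feature and each class weight matters.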
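Learning discriminative face features plays a major role in building high-performing face recognition models. Recent state-of-the-art face recognition solutions incorporate a fixed penalty margin into the commonly used classification loss function, the softmax loss, to increase the discriminative power of face recognition models by minimizing intra-class variation and maximizing inter-class variation. Margin-penalty softmax losses, such as ArcFace and CosFace, assume that the geodesic distance between and within different identities can be learned equally well with a fixed penalty margin. However, such a learning objective is not realistic for real data with inconsistent inter-class and intra-class variation, which may limit the discriminability and generalizability of the face recognition model. In this paper, we relax the fixed penalty margin constraint by proposing the elastic penalty margin loss (ElasticFace), which allows flexibility in the push for class separability. The main idea is to use random margin values drawn from a normal distribution in each training iteration. This gives the decision boundary the chance to extend and retract, leaving room for flexible class-separability learning. We demonstrate the superiority of our ElasticFace loss over the ArcFace and CosFace losses, using the same geometric transformation, on a large set of mainstream benchmarks. From a wider perspective, our ElasticFace advances state-of-the-art face recognition performance among nine mainstream benchmarks.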
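The random-margin idea can be illustrated with a short sketch in which the margin for the ground-truth logit is sampled from a normal distribution instead of being fixed. Applying the sampled margin in the additive angular form cos(θ + m) is an assumption made for this example, as are the function name and the default `mean`, `std`, and `scale` values.

```python
import math
import random

def elastic_target_logit(cos_theta, rng, mean=0.5, std=0.05, scale=64.0):
    """Return the scaled ground-truth logit and the margin that was sampled for it."""
    margin = rng.gauss(mean, std)  # a fresh margin is drawn on every call
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return scale * math.cos(theta + margin), margin

# Usage: each training sample (or iteration) sees a slightly different margin.
rng = random.Random(0)
logit_a, margin_a = elastic_target_logit(0.8, rng)
logit_b, margin_b = elastic_target_logit(0.8, rng)
```

Setting `std=0.0` recovers a fixed-margin loss, which is a convenient way to compare the two behaviours on the same data.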
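With the recent success of deep neural networks, remarkable progress has been made in face recognition. However, collecting large-scale real-world training data for face recognition has proven challenging, especially because of label noise and privacy issues. Meanwhile, existing face recognition datasets are usually collected from web images and lack detailed annotations of attributes (e.g., pose and expression), so the influence of different attributes on face recognition has been poorly investigated. In this paper, we address these issues in face recognition using synthetic face images, i.e., SynFace. Specifically, we first explore the performance gap between recent state-of-the-art face recognition models trained with synthetic and with real face images. We then analyze the underlying causes of the performance gap, e.g., poor intra-class variation and the domain gap between synthetic and real face images. Inspired by this, we design SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap, demonstrating the great potential of synthetic data for face recognition. Furthermore, with a controllable face synthesis model, we can easily manage different factors of synthetic face generation, including pose, expression, illumination, the number of identities, and the number of samples per identity. We therefore also perform a systematic empirical analysis of synthetic face images to provide insights into how to use synthetic data effectively for face recognition.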
Although significant progress has been made in face recognition, demographic bias still exists in face recognition systems. For instance, face recognition performance for a certain demographic group is often lower than for the others. In this paper, we propose the MixFairFace framework to improve fairness in face recognition models. First of all, we argue that the commonly used attribute-based fairness metric is not appropriate for face recognition. A face recognition system can only be considered fair when every person receives similar performance. Hence, we propose a new evaluation protocol to fairly evaluate the fairness performance of different approaches. Unlike previous approaches, which require sensitive attribute labels such as race and gender to reduce demographic bias, we aim to address the identity bias in face representation, i.e., the performance inconsistency between different identities, without the need for sensitive attribute labels. To this end, we propose the MixFair Adapter to determine and reduce the identity bias of training samples. Our extensive experiments demonstrate that our MixFairFace approach achieves state-of-the-art fairness performance on all benchmark datasets.
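Unsupervised person re-identification (Re-ID) has attracted increasing research interest because of its scalability and its potential for real-world applications. State-of-the-art unsupervised Re-ID methods usually follow a clustering-based strategy, which generates pseudo labels by clustering and maintains a memory that stores instance features and represents the cluster centroids for contrastive learning. This approach suffers from two problems. First, the centroid produced by unsupervised learning may not be a perfect prototype. Forcing images closer to the centroid reinforces the clustering result, which can accumulate clustering errors over iterations. Second, previous methods use features obtained at different training iterations to represent one centroid, which is inconsistent with the current training sample because these features are not directly comparable. To this end, we propose an unsupervised Re-ID approach with a stochastic learning strategy. Specifically, we adopt a stochastically updated memory, in which a random instance from a cluster is used to update the cluster-level memory for contrastive learning. In this way, the relationships between randomly selected image pairs are learned, which avoids the training bias caused by unreliable pseudo labels. The stochastic memory is also always up to date, which maintains consistency. In addition, to relieve the issue of camera variance, a unified distance matrix is proposed for the clustering process, in which the distance bias across camera domains is reduced and the variance between identities is emphasized.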
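With the recent advancement of deep convolutional neural networks, significant progress has been made in general face recognition. However, state-of-the-art general face recognition models do not generalize well to occluded face images, which are a common case in real-world scenarios. The likely reasons are the absence of large-scale occluded face data for training and of specific designs for handling the corrupted features caused by occlusion. This paper presents a novel face recognition method that is robust to occlusion and is based on a single end-to-end deep neural network. Our approach, named FROM (Face Recognition with Occlusion Masks), learns to discover the corrupted features in deep convolutional neural networks and cleans them with dynamically learned masks. In addition, we construct a large set of occluded face images to train FROM effectively and efficiently. Compared with existing methods that either rely on external detectors to discover occlusions or employ shallow, less discriminative models, FROM is simple yet powerful. Experimental results on LFW, MegaFace Challenge 1, RMF2, the AR dataset, and other simulated occluded/masked datasets confirm that FROM substantially improves accuracy under occlusion and generalizes well to general face recognition.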
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, it has attracted significantly increased interest in the computer vision community. By dissecting the components involved in developing a person Re-ID system, we categorize it into closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis of closed-world person Re-ID from three different perspectives: deep feature representation learning, deep metric learning, and ranking optimization. With performance saturating under the closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, which faces more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for FOUR different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost of finding all the correct matches, which provides an additional criterion for evaluating Re-ID systems for real applications. Finally, some important yet under-investigated open issues are discussed.
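To make the mINP metric mentioned above more concrete, the sketch below assumes the usual reading of the metric: for each query, the inverse negative penalty is the number of correct gallery matches divided by the rank position of the hardest (last retrieved) correct match, and mINP is the mean over queries. The abstract itself only states that the metric reflects the cost of finding all correct matches, so the exact formula and the function names here should be treated as assumptions.

```python
def inverse_negative_penalty(ranked_matches):
    """ranked_matches: booleans in retrieval order, True where the gallery item is correct."""
    positions = [rank for rank, hit in enumerate(ranked_matches, start=1) if hit]
    if not positions:
        raise ValueError("query has no correct match in the gallery")
    # Number of correct matches divided by the rank of the hardest (last) one.
    return len(positions) / positions[-1]

def mean_inp(all_ranked_matches):
    scores = [inverse_negative_penalty(r) for r in all_ranked_matches]
    return sum(scores) / len(scores)
```

A query whose correct matches all appear at the top of the ranking scores 1.0, while a single correct match buried deep in the list pulls the score toward 0, which is what distinguishes this metric from rank-1 accuracy.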
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e. additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and inter-class difference is large is of great importance in order to achieve good performance. Recently, Large-margin Softmax [10] and Angular Softmax [9] have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization in the paper. Most importantly, our experiments on LFW and MegaFace show that our additive margin softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available.
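Most existing methods for coping with noisy labels assume that the class distribution is well balanced, and therefore lack the capacity to handle practical scenarios in which training samples are distributed unevenly. To this end, this paper makes an early effort to tackle the image classification task under both a long-tailed distribution and label noise. In this setting, existing noise-robust learning methods cannot work properly, because it is challenging to distinguish noisy samples from clean samples of the tail classes. To address this problem, we propose a new learning paradigm that screens out noisy samples based on matching inferences on weak and strong data augmentations, and we introduce a leave-noise-out regularization to eliminate the effect of the samples identified as noisy. Furthermore, we incorporate a novel prediction penalty based on an online prior distribution to avoid bias toward the head classes. Compared with existing long-tailed classification methods, this mechanism is better at capturing the degree of class fitting in real time. Exhaustive experiments show that the proposed method outperforms state-of-the-art algorithms that address the distribution imbalance problem in long-tailed classification under noisy labels.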
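Fine-grained visual classification can be addressed by deep representation learning under the supervision of manually pre-defined targets (e.g., one-hot or Hadamard codes). Such target coding schemes are less flexible in modeling inter-class correlation and are also sensitive to sparse and imbalanced data distributions. In light of this, this paper introduces a novel target coding scheme, dynamic target relation graphs (DTRG), which serves as an auxiliary feature regularization and is a self-generated structural output mapped from the input images. Specifically, online computation of class-level feature centers is designed to generate cross-category distances in the representation space, which can then be depicted by a dynamic graph in a non-parametric manner. Explicitly minimizing the intra-class feature variations anchored on these class-level centers encourages the learning of discriminative features. Moreover, by exploiting inter-class dependency, the proposed target graphs can alleviate data sparsity and imbalance in representation learning. Inspired by the recent success of mixup-style data augmentation, this paper introduces randomness into the soft construction of dynamic target relation graphs to further explore the relation diversity of target classes. Experimental results demonstrate the effectiveness of our method on a number of diverse benchmarks across multiple visual classification tasks, in particular achieving state-of-the-art performance on popular fine-grained object benchmarks and superior robustness against sparse and imbalanced data. Source code is publicly available at https://github.com/akonlau/dtrg.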
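3D Morphable Models (3DMMs) are generative models of face shape and appearance. However, the shape parameters of traditional 3DMMs follow a multivariate Gaussian distribution, whereas identity embeddings follow a hypersphere distribution, and this conflict makes it challenging for face reconstruction models to preserve fidelity and shape consistency at the same time. To address this issue, we propose the Sphere Face Model (SFM), a novel 3DMM for monocular face reconstruction that preserves both shape fidelity and identity consistency. The core of our SFM is a basis matrix that can be used to reconstruct 3D face shapes; the basis matrix is learned with a two-stage training approach in which 3D and 2D training data are used in the first and second stages, respectively. To resolve the distribution mismatch, we design a novel loss that gives the shape parameters a hyperspherical latent space. Extensive experiments show that SFM has high representation ability and good clustering performance in the shape parameter space. Moreover, it produces high-fidelity face shapes, and the shapes remain consistent under challenging conditions in monocular face reconstruction.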
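State-of-the-art face recognition methods typically adopt a multi-classification pipeline and use a softmax-based loss for optimization. Although these methods have achieved great success, the softmax-based loss has a limitation from the perspective of open-set classification: the multi-classification objective in the training phase does not strictly match the objective of open-set classification testing. In this paper, we derive a new loss named global boundary CosFace (GB-CosFace). Our GB-CosFace introduces an adaptive global boundary to determine whether two face samples belong to the same identity, so that the optimization objective is aligned with the testing process from the perspective of open-set classification. Meanwhile, because the loss formulation is derived from the softmax-based loss, our GB-CosFace retains the excellent properties of the softmax-based loss, and CosFace is proved to be a special case of the proposed loss. We analyze and explain the proposed GB-CosFace geometrically. Comprehensive experiments on multiple face recognition benchmarks show that the proposed GB-CosFace outperforms current state-of-the-art face recognition losses on mainstream face recognition tasks. Compared with CosFace, our GB-CosFace improves TAR@FAR=1e-6, 1e-5, and 1e-4 on the IJB-C benchmark by 1.58%, 0.57%, and 0.28%, respectively.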
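The TAR@FAR figures quoted above come from a verification protocol in which a similarity threshold is fixed by the allowed false accept rate and the true accept rate is then measured at that threshold. The sketch below shows one simple way to compute such a number; the function name is hypothetical, and the thresholding convention (strictly greater than the cut-off impostor score) is an assumption, since benchmark toolkits differ in these details.

```python
def tar_at_far(genuine_scores, impostor_scores, far):
    """True accept rate at the threshold that admits at most `far` of impostor pairs."""
    ranked = sorted(impostor_scores, reverse=True)
    allowed = int(far * len(ranked))  # number of impostor pairs that may be accepted
    if allowed >= len(ranked):
        threshold = float("-inf")     # every impostor may pass, so accept everything
    else:
        threshold = ranked[allowed]   # scores strictly above this value are accepted
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return accepted / len(genuine_scores)
```

At very small FAR values such as 1e-6, the threshold is determined by a handful of the highest-scoring impostor pairs, which is why the metric requires a very large number of impostor comparisons to be meaningful.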
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations. Recent advances accomplish this task by leveraging clustering-based pseudo labels, but these pseudo labels are inevitably noisy, which degrades model performance. In this paper, we propose a Neighbour Consistency guided Pseudo Label Refinement (NCPLR) framework, which can be regarded as a transductive form of label propagation under the assumption that the prediction of each example should be similar to those of its nearest neighbours. Specifically, the refined label for each training instance is obtained from the original clustering result and a weighted ensemble of its neighbours' predictions, with weights determined by their similarities in the feature space. In addition, we treat clustering-based unsupervised person ReID as a label-noise learning problem. We then propose an explicit neighbour consistency regularization to reduce the model's susceptibility to over-fitting while improving training stability. The NCPLR method is simple yet effective, and can be seamlessly integrated into existing clustering-based unsupervised algorithms. Extensive experimental results on five ReID datasets demonstrate the effectiveness of the proposed method and show superior performance to state-of-the-art methods by a large margin.
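The label refinement step described above can be sketched as a convex combination of the original pseudo label and a similarity-weighted average of the neighbours' predictions. The mixing coefficient `alpha`, the choice to normalize raw similarities into weights, and the function name are all assumptions made for illustration; the abstract does not specify these details.

```python
def refine_label(one_hot, neighbour_preds, similarities, alpha=0.5):
    """Blend a clustering pseudo label with its neighbours' predicted distributions.

    one_hot:         pseudo label from clustering, as a probability vector
    neighbour_preds: predicted class distributions of the nearest neighbours
    similarities:    positive feature-space similarities to those neighbours
    """
    total = sum(similarities)
    weights = [s / total for s in similarities]  # more similar neighbours count more
    num_classes = len(one_hot)
    ensemble = [
        sum(w * pred[c] for w, pred in zip(weights, neighbour_preds))
        for c in range(num_classes)
    ]
    return [alpha * y + (1.0 - alpha) * e for y, e in zip(one_hot, ensemble)]
```

When the neighbours disagree with the clustering result, the refined label becomes softer, so a wrongly clustered instance contributes a weaker and less misleading training signal.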
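Learning modality-invariant features is central to the problem of visible-thermal cross-modal person re-identification (VT-ReID), in which query and gallery images come from different modalities. Existing works implicitly align the modalities in pixel and feature spaces by using adversarial learning or carefully designed feature extraction modules. We propose a simple but effective framework, MMD-ReID, which reduces the modality gap through an explicit discrepancy reduction constraint. MMD-ReID takes inspiration from Maximum Mean Discrepancy (MMD), a widely used statistical tool for determining the distance between two distributions. MMD-ReID uses a novel margin-based formulation to match the class-conditional feature distributions of visible and thermal samples, minimizing intra-class distances while maintaining feature discriminability. MMD-ReID is a simple framework in terms of both architecture and loss formulation. We conduct extensive experiments to demonstrate the effectiveness of MMD-ReID in aligning the marginal and class-conditional distributions, thereby learning features that are both modality-independent and identity-consistent. The proposed framework significantly outperforms state-of-the-art methods on the SYSU-MM01 and RegDB datasets. Code will be released at https://github.com/vcl-iisc/mmd-reid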
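For readers unfamiliar with Maximum Mean Discrepancy, the following sketch computes the standard (biased) squared-MMD estimate between two small sets of feature vectors with a Gaussian kernel. It illustrates only the underlying statistic; the margin-based, class-conditional formulation of MMD-ReID is not reproduced here, and the kernel choice, `bandwidth` value, and function names are assumptions.

```python
import math

def gaussian_kernel(a, b, bandwidth=1.0):
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    # E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

In a cross-modal setting, `xs` and `ys` could hold the visible and thermal features of one identity, so that minimizing the estimate pulls the two feature distributions together.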
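The continual appearance of new objects in the visual world poses considerable challenges for current deep learning methods in real-world deployments. The challenge of learning new tasks is often exacerbated by the scarcity of data for the new categories, owing to rarity or cost. Here we explore the important task of few-shot class-incremental learning (FSCIL) and its extreme data scarcity conditions. An ideal FSCIL model needs to perform well on all classes, regardless of their presentation order or paucity of data. It also needs to be robust to open-set real-world conditions and easily adaptable to the new tasks that constantly arise in the field. In this paper, we first re-evaluate the current task setting and propose a more comprehensive and practical setting for the FSCIL task. Then, inspired by the similarity between the goals of FSCIL and those of modern face recognition systems, we propose our method, Augmented Angular Loss Incremental Classification, or ALICE. In ALICE, we propose using an angular penalty loss to obtain well-clustered features. Because the obtained features need not only to be compact but also to be diverse enough to maintain generalization to future incremental classes, we further discuss how class augmentation, data augmentation, and data balancing affect classification performance. Experiments on benchmark datasets, including CIFAR100, miniImageNet, and CUB200, demonstrate the improved performance of ALICE over state-of-the-art FSCIL methods.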