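In this work, we propose a new loss to improve feature discriminability and classification performance. Motivated by the adaptive cosine/coherence estimator (ACE), our proposed method incorporates angular information that is inherently learned by artificial neural networks. Our learnable ACE (LACE) transforms the data into a new "whitened" space that improves inter-class separability and intra-class compactness. We compare LACE to alternative state-of-the-art softmax-based and feature regularization approaches. Our results show that the proposed method can serve as a viable alternative to cross-entropy and angular softmax approaches. Our code is publicly available: https://github.com/gatorsense/lace.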
translated by 谷歌翻译
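Recent work has argued that classification losses based on softmax cross-entropy are superior not only for fixed-set classification tasks, but also outperform losses developed specifically for open-set tasks, including few-shot learning and retrieval. Softmax classifiers have been studied using different embedding geometries (Euclidean, hyperbolic, and spherical), and claims have been made about the superiority of one or another, but they have not been systematically compared under careful controls. We conduct an empirical investigation of embedding geometry for softmax losses on a variety of fixed-set classification and image retrieval tasks. An interesting property observed for the spherical losses leads us to propose a probabilistic classifier based on the von Mises-Fisher distribution, and we show that it is competitive with state-of-the-art methods while producing improved out-of-the-box calibration. We provide guidance regarding the trade-offs between the losses and how to choose among them.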
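Previous work has proposed many new loss functions and regularizers that improve test accuracy on image classification tasks. However, it is not clear whether these loss functions learn better representations for downstream tasks. This paper studies how the choice of training objective affects the transferability of the hidden representations of convolutional neural networks trained on ImageNet. We show that many objectives lead to statistically significant improvements in ImageNet accuracy over vanilla softmax cross-entropy, but the resulting fixed feature extractors transfer substantially worse to downstream tasks, and the choice of loss has little effect when networks are fully fine-tuned on the new tasks. Using centered kernel alignment to measure similarity between the hidden representations of networks, we find that differences among loss functions are apparent only in the last few layers of the network. We delve deeper into the representations of the penultimate layer and find that different objectives and hyperparameter combinations lead to dramatically different levels of class separation. Representations with higher class separation obtain higher accuracy on the original task, but their features are less useful for downstream tasks. Our results suggest that there is a trade-off between learning invariant features for the original task and features relevant for transfer tasks.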
The classification loss functions used in deep neural network classifiers can be grouped into two categories according to whether they maximize the margin in Euclidean or in angular space. Methods that maximize the margin in Euclidean space use Euclidean distances between sample vectors during classification, whereas methods that maximize the margin in angular space use the cosine distance at test time. This paper introduces a novel classification loss that maximizes the margin in both the Euclidean and angular spaces at the same time. As a result, the Euclidean and cosine distances produce similar, consistent results and complement each other, which in turn improves accuracy. The proposed loss function forces the samples of each class to cluster around the center that represents it. The class centers are chosen on the boundary of a hypersphere, and the pairwise distances between class centers are all equal. This restriction corresponds to choosing the centers from the vertices of a regular simplex. The proposed loss function has no hyperparameter that must be set by the user, so the method is easy to apply to classical classification problems. Moreover, since the class samples are compactly clustered around their corresponding centers, the proposed classifier is also well suited to open set recognition problems, where test samples can come from unknown classes that are not seen in the training phase. Experimental studies show that the proposed method achieves state-of-the-art accuracies on open set recognition despite its simplicity.
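The geometric constraint described in this abstract (class centers on a hypersphere with equal pairwise distances, i.e. the vertices of a regular simplex) can be illustrated with a minimal sketch. The construction below is one standard way to obtain such vertices and is an assumption for illustration; the function names `simplex_centers` and `distance` are hypothetical and are not taken from the paper.

```python
import math


def simplex_centers(num_classes, radius=1.0):
    """Return num_classes points on a sphere of the given radius whose
    pairwise distances are all equal (vertices of a regular simplex).

    Assumed construction: take the standard basis vectors, subtract their
    centroid, and rescale each result to the requested radius.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    centers = []
    for i in range(num_classes):
        # Basis vector e_i minus the centroid (1/num_classes in every coordinate).
        v = [(1.0 if j == i else 0.0) - 1.0 / num_classes
             for j in range(num_classes)]
        norm = math.sqrt(sum(x * x for x in v))
        centers.append([radius * x / norm for x in v])
    return centers


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


centers = simplex_centers(4)
# Every pair of centers is the same distance apart.
print([round(distance(centers[0], c), 6) for c in centers[1:]])
```

With unit radius, the cosine between any two centers is -1/(C-1) for C classes, so the centers are as far apart from each other as possible on the sphere.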
Recently, a popular line of research in face recognition is adopting margins in the well-established softmax loss function to maximize class separability. In this paper, we first introduce an Additive Angular Margin Loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances the discriminative power. Since ArcFace is susceptible to the massive label noise, we further propose sub-center ArcFace, in which each class contains K sub-centers and training samples only need to be close to any of the K positive sub-centers. Sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Based on this self-propelled isolation, we boost the performance through automatically purifying raw web faces under massive real-world noise. Besides discriminative feature embedding, we also explore the inverse problem, mapping feature vectors to face images. Without training any additional generator or discriminator, the pre-trained ArcFace model can generate identity-preserved face images for both subjects inside and outside the training data only by using the network gradient and Batch Normalization (BN) priors. Extensive experiments demonstrate that ArcFace can enhance the discriminative feature embedding as well as strengthen the generative face synthesis.
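The additive angular margin named in this abstract can be sketched as follows: the angle between a feature and its target class weight is increased by a margin before the scaled cosine is passed to softmax cross-entropy. This is a minimal sketch, not the authors' implementation; the function names and the default `scale` and `margin` values are assumptions for illustration, and the sketch ignores the case where the angle plus the margin exceeds pi, which practical implementations handle separately.

```python
import math


def arcface_logits(cosines, target, scale=64.0, margin=0.5):
    """Apply an additive angular margin to the target-class cosine.

    cosines: cosine similarity between the normalized feature and each
             normalized class weight (one value per class).
    target:  index of the ground-truth class.
    scale, margin: assumed illustrative values, not prescribed here.
    """
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if k == target:
            theta = math.acos(c)
            c = math.cos(theta + margin)  # penalize the target angle
        logits.append(scale * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


cos = [0.8, 0.1, -0.3]
print(cross_entropy(arcface_logits(cos, target=0, margin=0.0), 0))
print(cross_entropy(arcface_logits(cos, target=0, margin=0.5), 0))
```

The margin lowers only the target logit, so the same sample incurs a larger loss and the network is pushed to reduce the target angle further.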
Face recognition has made extraordinary progress owing to the advancement of deep convolutional neural networks (CNNs). The central task of face recognition, including face verification and identification, involves face feature discrimination. However, the traditional softmax loss of deep CNNs usually lacks the power of discrimination. To address this problem, several loss functions such as center loss, large margin softmax loss, and angular softmax loss have recently been proposed. All these improved losses share the same idea: maximizing inter-class variance and minimizing intra-class variance. In this paper, we propose a novel loss function, namely large margin cosine loss (LMCL), to realize this idea from a different perspective. More specifically, we reformulate the softmax loss as a cosine loss by L2-normalizing both features and weight vectors to remove radial variations, based on which a cosine margin term is introduced to further maximize the decision margin in the angular space. As a result, minimum intra-class variance and maximum inter-class variance are achieved by virtue of normalization and cosine decision margin maximization. We refer to our model trained with LMCL as CosFace. Extensive experimental evaluations are conducted on the most popular public-domain face recognition datasets such as MegaFace Challenge, YouTube Faces (YTF) and Labeled Faces in the Wild (LFW). We achieve state-of-the-art performance on these benchmarks, which confirms the effectiveness of our proposed approach.
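The reformulation described in this abstract (L2-normalize both features and class weights, then subtract a cosine margin from the target class) can be sketched in a few lines. This is a hedged illustration rather than the authors' code; the function names and the default `scale` and `margin` values are assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosface_logits(feature, weights, target, scale=30.0, margin=0.35):
    """Large-margin cosine logits for one sample.

    feature: raw feature vector.
    weights: one raw weight vector per class.
    target:  index of the ground-truth class.
    scale, margin: assumed illustrative values.
    """
    f = l2_normalize(feature)
    logits = []
    for k, w in enumerate(weights):
        w = l2_normalize(w)
        cos = sum(a * b for a, b in zip(f, w))
        if k == target:
            cos -= margin  # cosine margin applied to the target class only
        logits.append(scale * cos)
    return logits


weights = [[1.0, 0.0], [0.0, 2.0]]
# Rescaling the feature leaves the logits unchanged: radial variation is removed.
print(cosface_logits([3.0, 4.0], weights, target=0))
print(cosface_logits([30.0, 40.0], weights, target=0))
```

The logits would then be fed to an ordinary softmax cross-entropy; only the angle between feature and weight influences the loss.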
This paper addresses the deep face recognition (FR) problem under the open-set protocol, where ideal face features are expected to have a smaller maximal intra-class distance than minimal inter-class distance under a suitably chosen metric space. However, few existing algorithms can effectively achieve this criterion. To this end, we propose the angular softmax (A-Softmax) loss that enables convolutional neural networks (CNNs) to learn angularly discriminative features. Geometrically, A-Softmax loss can be viewed as imposing discriminative constraints on a hypersphere manifold, which intrinsically matches the prior that faces also lie on a manifold. Moreover, the size of the angular margin can be quantitatively adjusted by a parameter m. We further derive specific m to approximate the ideal feature criterion. Extensive analysis and experiments on Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and MegaFace Challenge show the superiority of A-Softmax loss in FR tasks. The code has also been made publicly available.
Person re-identification is a challenging task because of the high intra-class variance induced by the unrestricted nuisance factors of variations such as pose, illumination, viewpoint, background, and sensor noise. Recent approaches postulate that powerful architectures have the capacity to learn feature representations invariant to nuisance factors, by training them with losses that minimize intra-class variance and maximize inter-class separation, without modeling nuisance factors explicitly. The dominant approaches use either a discriminative loss with margin, like the softmax loss with the additive angular margin, or a metric learning loss, like the triplet loss with batch hard mining of triplets. Since the softmax imposes feature normalization, it limits the gradient flow supervising the feature embedding. We address this by joining the losses and leveraging the triplet loss as a proxy for the missing gradients. We further improve invariance to nuisance factors by adding the discriminative task of predicting attributes. Our extensive evaluation highlights that when only a holistic representation is learned, we consistently outperform the state-of-the-art on the three most challenging datasets. Such representations are easier to deploy in practical systems. Finally, we found that joining the losses removes the requirement for having a margin in the softmax loss while increasing performance.
Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax not only can adjust the desired margin but also can avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply-learned features with L-Softmax loss become more discriminative, hence significantly boosting the performance on a variety of visual classification and verification tasks.
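Convolutional neural networks (CNNs) have been tremendously effective in supervised settings. Representations learned by CNNs that operate on hyperspherical manifolds have led to insightful results in face recognition, face identification, and other supervised tasks. A broad range of activation functions developed with hyperspherical intuition perform better than softmax in Euclidean space. The main motivation of this research is to provide insights. First, stereographic projection is used to transform data from Euclidean space ($\mathbb{R}^{n}$) to a hyperspherical manifold ($\mathbb{S}^{n}$) in order to analyze the performance of angular margin losses. Second, it is shown theoretically that decision boundaries constructed on the hypersphere using stereographic projection support the learning of neural networks. Experiments demonstrate that applying stereographic projection to existing state-of-the-art angular margin objective functions improves performance on standard image classification datasets (CIFAR-10 and CIFAR-100). In addition, we ran our experiments on malaria thin blood smear images, with effective results. The code is publicly available at: https://github.com/barulalithb/stereo-angular-margin.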
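The mapping from Euclidean space $\mathbb{R}^{n}$ to the hypersphere $\mathbb{S}^{n}$ mentioned in this abstract can be illustrated with the textbook inverse stereographic projection. The sketch below is an assumption about the general technique, not a reproduction of the paper's pipeline; the function names `to_sphere` and `from_sphere` are hypothetical.

```python
def to_sphere(x):
    """Map a point of R^n onto the unit sphere S^n in R^(n+1).

    Uses the standard inverse stereographic projection from the north pole:
    x -> (2x / (|x|^2 + 1), (|x|^2 - 1) / (|x|^2 + 1)).
    """
    sq = sum(v * v for v in x)
    return [2.0 * v / (sq + 1.0) for v in x] + [(sq - 1.0) / (sq + 1.0)]


def from_sphere(p):
    """Inverse map: project a sphere point (not the north pole) back to R^n."""
    last = p[-1]
    return [v / (1.0 - last) for v in p[:-1]]


point = [3.0, -4.0]
on_sphere = to_sphere(point)
# The image has unit norm, and the round trip recovers the input
# up to floating-point error.
print(sum(v * v for v in on_sphere))
print(from_sphere(on_sphere))
```

Because every projected point has unit norm, angular margin losses can then be applied to the projected features without a separate normalization step.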
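Curriculum learning requires example difficulty in order to proceed from easy to hard. However, the credibility of image difficulty is rarely investigated, and this can seriously affect the effectiveness of curricula. In this work, we propose Angular Gap, a measure of difficulty based on the difference in angular distance between feature embeddings and class-weight embeddings built by hyperspherical learning. To ascertain difficulty estimation, we introduce class-wise model calibration, as a post-training technique, to the learned hyperbolic space. This bridges the gap between probabilistic model calibration and the angular distance estimation of hyperspherical learning. We show the superiority of our calibrated Angular Gap over recent difficulty metrics on CIFAR10-H and ImageNetV2. We further propose Angular Gap based curriculum learning for unsupervised domain adaptation, which can transition from learning easy samples to mining hard samples. We combine this curriculum with a state-of-the-art self-training method (CST). The proposed Curricular CST learns robust representations and outperforms recent baselines on Office31 and VisDA 2017.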
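Radial basis function neural networks (RBFs) are prime candidates for pattern classification and regression and have been used extensively in classical machine learning applications. However, RBFs have not been integrated into contemporary deep learning research and computer vision using conventional convolutional neural networks (CNNs), owing to their lack of adaptability to modern architectures. In this paper, we adapt RBF networks as a classifier on top of CNNs by modifying the training process and introducing a new activation function, so that modern vision architectures can be trained end to end for image classification. The specific architecture of RBFs enables the learning of a similarity distance metric for comparing and finding similar and dissimilar images. Furthermore, we demonstrate that using an RBF classifier on top of any CNN architecture provides new human-interpretable insights into the decision-making process of the models. Finally, we successfully apply RBFs to a range of CNN architectures and evaluate the results on benchmark computer vision datasets.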
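In real-world applications of multiclass classification models, misclassification in an important class (e.g., stop sign) can be significantly more harmful than in other classes (e.g., speed limit). In this paper, we propose a loss function that can improve the recall of an important class while maintaining the same level of accuracy as training with cross-entropy loss. For this purpose, the important class needs to be separated better than the other classes. However, existing methods that apply a class-sensitive penalty to the cross-entropy loss do not improve the separation. On the other hand, methods that add a margin to the angle between the feature vectors and the corresponding weight vectors of the last fully connected layer can improve the separation. We therefore propose a loss function that improves the separation of the important class by setting the margin only for the important class, called Class-sensitive Additive Angular Margin Loss (CAMRI Loss). CAMRI loss is expected to reduce the variance of the angles between the features and the weights of the important class relative to other classes, because adding a penalty to the angle creates a margin around the important class in the feature space. In addition, concentrating the penalty only on the important class hardly sacrifices the separation of the other classes. Experiments on CIFAR-10, GTSRB, and AwA2 show that the proposed method can improve recall by up to 9% over cross-entropy loss without sacrificing accuracy.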
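Deep Bregman divergence measures the divergence of data points using neural networks; it goes beyond Euclidean distance and is capable of capturing divergence over distributions. In this paper, we propose deep Bregman divergences for contrastive learning of visual representations, and we aim to enhance the contrastive loss used in self-supervised learning by training additional networks based on functional Bregman divergence. In contrast to conventional contrastive learning methods, which are based solely on divergences between single points, our framework can capture the divergence between distributions, which improves the quality of the learned representation. We show that the combination of the conventional contrastive loss and our proposed divergence loss outperforms the baseline and most previous methods for self-supervised and semi-supervised learning on multiple classification and object detection tasks and datasets. Moreover, the learned representations generalize well when transferred to other datasets and tasks. The source code and our models are available in the supplementary material and will be released with the paper.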
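Deep metric learning algorithms aim to learn an efficient embedding space that preserves the similarity relationships among the input data. Although these algorithms have achieved significant performance gains across a wide range of tasks, they have failed to consider and strengthen comprehensive similarity constraints, and thus learn a sub-optimal metric in the embedding space. Moreover, there have so far been few studies of their performance in the presence of noisy labels. Here, we address the concern of learning a discriminative deep embedding space by designing a novel yet effective Deep Class-wise Discrepancy Loss (DCDL) function that takes each and every class into account. Our empirical results on three standard image classification datasets and two fine-grained image recognition datasets, in the presence and absence of noise, clearly demonstrate the need to incorporate such class-wise similarity relationships alongside traditional algorithms when learning a discriminative embedding space.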
Model bias triggered by long-tailed data has been widely studied. However, measures based on the number of samples cannot explain three phenomena simultaneously: (1) Given enough data, the classification performance gain from additional samples is marginal. (2) Classification performance decays precipitously as the number of training samples decreases when there is insufficient data. (3) Models trained on sample-balanced datasets still have different biases for different classes. In this work, we define and quantify the semantic scale of classes, which is used to measure the feature diversity of classes. We find experimentally that there is a marginal effect of semantic scale, which describes the first two phenomena well. Further, a quantitative measurement of semantic scale imbalance is proposed, which can accurately reflect model bias on multiple datasets, even on sample-balanced data, revealing a novel perspective for the study of class imbalance. Due to the prevalence of semantic scale imbalance, we propose semantic-scale-balanced learning, including a general loss improvement scheme and a dynamic re-weighting training framework that overcomes the challenge of calculating semantic scales in real time during iterations. Comprehensive experiments show that dynamic semantic-scale-balanced learning consistently enables the model to achieve superior performance on large-scale long-tailed and non-long-tailed natural and medical datasets, which is a good starting point for mitigating the prevalent but unnoticed model bias.
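Maximizing the separation between classes constitutes a well-known inductive bias in machine learning and a pillar of many traditional algorithms. By default, deep networks are not equipped with this inductive bias, and therefore many alternative solutions have been proposed through differential optimization. Current approaches tend to optimize classification and separation jointly: aligning inputs with class vectors and separating the class vectors angularly. This paper proposes a simple alternative: encoding maximum separation as an inductive bias in the network by adding one fixed matrix multiplication before computing the softmax activations. The main observation behind our approach is that separation does not require optimization; it can be solved in closed form prior to training and plugged into a network. We outline a recursive approach to obtain the matrix consisting of maximally separable vectors for any number of classes, which can be added with negligible engineering effort and computational overhead. Despite its simple nature, this one matrix multiplication provides real impact. We show that our proposal directly boosts classification, long-tailed recognition, out-of-distribution detection, and open-set recognition, from CIFAR to ImageNet. We find empirically that maximum separation works best as a fixed bias; making the matrix learnable adds nothing to the performance. The closed-form implementation and code are available on GitHub.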
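The recursive, closed-form construction of maximally separated class vectors mentioned in this abstract can be sketched as follows. The recursion below is one construction with the stated property (unit-norm columns whose pairwise cosine is -1/(C-1) for C classes); it is an assumption for illustration, and whether it matches the authors' released implementation line for line is not confirmed here. The function name `max_separation_matrix` is hypothetical.

```python
import math


def max_separation_matrix(num_classes):
    """Return a (num_classes - 1) x num_classes matrix, as a list of rows,
    whose columns are unit vectors with equal pairwise cosine -1/(num_classes - 1).

    Assumed recursion: P_1 = [1, -1], and P_k stacks the row
    [1, -1/k, ..., -1/k] on top of [0, sqrt(1 - 1/k^2) * P_(k-1)].
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    matrix = [[1.0, -1.0]]  # P_1: two classes on a line
    for k in range(2, num_classes):
        shrink = math.sqrt(1.0 - 1.0 / k ** 2)
        top = [1.0] + [-1.0 / k] * k
        rest = [[0.0] + [shrink * v for v in row] for row in matrix]
        matrix = [top] + rest
    return matrix


P = max_separation_matrix(3)
# Three unit vectors in the plane, 120 degrees apart.
for row in P:
    print([round(v, 4) for v in row])
```

In a network, such a matrix would stay fixed and multiply the (C-1)-dimensional output of the last learned layer to produce C logits before softmax, consistent with the abstract's description of one fixed matrix multiplication.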
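Open set recognition enables deep neural networks (DNNs) to identify samples of unknown classes while maintaining high classification accuracy on samples of known classes. Existing methods based on auto-encoders (AEs) and prototype learning show great potential in handling this challenging task. In this study, we propose a novel method, called Class-Specific Semantic Reconstruction (CSSR), that integrates the power of AEs and prototype learning. Specifically, CSSR replaces prototype points with manifolds represented by class-specific AEs. Unlike conventional prototype-based methods, CSSR models each known class on an individual AE manifold and measures class belongingness through the AE's reconstruction error. The class-specific AEs are plugged into the top of the DNN backbone and reconstruct the semantic representations learned by the DNN instead of the raw image. Through end-to-end learning, the DNN and the AEs boost each other to learn both discriminative and representative information. Experiments conducted on multiple datasets show that the proposed method achieves outstanding performance in both closed-set and open-set recognition, and that it is simple and flexible enough to be incorporated into existing frameworks.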
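Fine-grained visual classification can be addressed by deep representation learning under the supervision of manually pre-defined targets (e.g., one-hot or Hadamard codes). Such target coding schemes are less flexible for modeling inter-class correlation and are also sensitive to sparse and imbalanced data distributions. In light of this, this paper introduces a novel target coding scheme, dynamic target relation graphs (DTRG), which, as an auxiliary feature regularization, is a self-generated structural output mapped from input images. Specifically, online computation of class-level feature centers is designed to generate cross-category distances in the representation space, which can thus be depicted by a dynamic graph in a non-parametric manner. Explicitly minimizing the intra-class feature variations anchored on those class-level centers encourages the learning of discriminative features. Moreover, owing to its exploitation of inter-class dependency, the proposed target graph can alleviate data sparsity and imbalance in representation learning. Inspired by the recent success of mixup-style data augmentation, this paper introduces randomness into the soft construction of dynamic target relation graphs to further explore the relation diversity of target classes. Experimental results demonstrate the effectiveness of our method on a number of diverse benchmarks across multiple visual classification tasks, in particular achieving state-of-the-art performance on popular fine-grained object benchmarks and superior robustness against sparse and imbalanced data. The source code is publicly available at https://github.com/akonlau/dtrg.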
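The continual appearance of new objects in the visual world poses considerable challenges for current deep learning methods in real-world deployments. The challenge of learning new tasks is often exacerbated by the scarcity of data for the new categories due to rarity or cost. Here we explore the important task of few-shot class-incremental learning (FSCIL) and its extreme data-scarcity condition. An ideal FSCIL model needs to perform well on all classes, regardless of their presentation order or paucity of data. It also needs to be robust to open-set real-world conditions and be easily adapted to the new tasks that always arise in the field. In this paper, we first reevaluate the current task setting and propose a more comprehensive and practical setting for the FSCIL task. Then, inspired by the similarity between the goals of FSCIL and of modern face recognition systems, we propose our method, Augmented Angular Loss Incremental Classification, or ALICE. In ALICE, we propose to use an angular penalty loss to obtain well-clustered features. Since the obtained features need to be not only compactly clustered but also diverse enough to maintain generalization to future incremental classes, we further discuss how class augmentation, data augmentation, and data balancing affect classification performance. Experiments on benchmark datasets, including CIFAR100, miniImageNet, and CUB200, demonstrate the improved performance of ALICE over state-of-the-art FSCIL methods.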