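Deep metric learning aims to learn an embedding space in which the distance between data reflects their class equivalence, even when their classes are unseen during training. However, the limited number of classes available during training hinders the generalization of the learned embedding space. Motivated by this, we introduce a new data augmentation approach that synthesizes novel classes and their embedding vectors. Our approach can provide rich semantic information to an embedding model and improve its generalization by augmenting the training data with novel classes that are not present in the original data. We implement this idea by learning and exploiting a conditional generative model which, given a class label and noise, produces a random embedding vector of that class. Our proposed generator lets the loss exploit richer class relations by augmenting realistic and diverse classes, which leads to better generalization to unseen samples. Experimental results on public benchmark datasets show that our method clearly improves the performance of proxy-based losses.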
Deep metric learning aims to learn an embedding space in which semantically similar samples lie close together and dissimilar ones are pushed apart. To obtain harder and more informative training signals for augmentation and generalization, recent methods generate synthetic samples to boost metric learning losses. However, these methods rely on deterministic, class-independent generation (e.g., simple linear interpolation), which covers only a limited part of the distribution space around the original samples. They overlook the wide variation in characteristics across classes and cannot model the abundant intra-class variations. As a result, the generated samples not only lack rich semantics within a given class but may also act as noisy signals that disturb training. In this paper, we propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning. We estimate the intra-class variation of every class and generate adaptive synthetic samples to support hard sample mining and boost metric learning losses. Further, because most datasets have only a few samples per class, we propose neighbor correction to revise inaccurate estimates, based on our finding that similar classes generally have similar variation distributions. Extensive experiments on five benchmarks show that our method significantly improves on and outperforms state-of-the-art methods in retrieval performance by 3%-6%. Our code is available at https://github.com/darkpromise98/IAA
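The idea of estimating per-class variation and sampling class-adaptive synthetic embeddings can be illustrated with a minimal sketch. It assumes a diagonal Gaussian per class, which is a simplification chosen here for illustration and not necessarily the estimator used by IAA; the function names `class_statistics` and `synthesize` are hypothetical, and neighbor correction is not shown.

```python
import random
import statistics

def class_statistics(embeddings):
    """Return the per-dimension mean and standard deviation of one class."""
    dims = list(zip(*embeddings))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds

def synthesize(means, stds, n, rng):
    """Draw n synthetic embeddings from a diagonal Gaussian (an assumption)."""
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n)]

rng = random.Random(0)
# Two toy classes: one tightly clustered, one widely spread.
tight_class = [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1]]
wide_class = [[0.0, 1.0], [1.0, 2.0], [-1.0, 0.0]]

tight_stats = class_statistics(tight_class)
wide_stats = class_statistics(wide_class)
tight_synth = synthesize(*tight_stats, n=200, rng=rng)
wide_synth = synthesize(*wide_stats, n=200, rng=rng)
```

Because the spread is estimated per class, samples synthesized for the wide class scatter more than those for the tight class, unlike class-independent interpolation.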
Recent methods for deep metric learning have focused on designing different contrastive loss functions between positive and negative pairs of samples, so that the learned feature embedding pulls positive samples of the same class closer together and pushes negative samples from different classes apart. In this work, we recognize that there is a significant semantic gap between features at the intermediate feature layer and class labels at the final output layer. To bridge this gap, we develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity in a contrastive learning setting. This contrastive Bayesian analysis leads to a new loss function for deep metric learning. To improve the generalization capability of the proposed method to new classes, we further extend the contrastive Bayesian loss with a metric variance constraint. Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning in both supervised and pseudo-supervised scenarios, outperforming existing methods by a large margin.
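Deep metric learning (DML) models often require strong local and global representations; however, effectively integrating local and global features during DML model training is a challenge. DML models are usually trained with specific loss functions, including pairwise-based and proxy-based losses. Pairwise-based loss functions exploit the rich semantic relations among data points, but they often suffer from slow convergence during training. Proxy-based loss functions, on the other hand, usually speed up convergence considerably, but they often do not fully explore the rich relations among data points. In this paper, we propose a novel DML approach to address these challenges. The proposed approach uses a hybrid loss that integrates pairwise-based and proxy-based loss functions in order to exploit rich data-to-data relations as well as fast convergence. In addition, it uses both global and local features to obtain rich representations during DML model training. Finally, we use second-order attention for feature enhancement to improve accurate and efficient retrieval. In our experiments, we extensively evaluate the proposed DML approach on four public benchmarks, and the results show that it achieves state-of-the-art performance on all of them.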
Supervision for metric learning has long been given in the form of equivalence between human-labeled classes. Although this type of supervision has been a basis of metric learning for decades, we argue that it hinders further advances in the field. In this regard, we propose a new regularization method, dubbed HIER, to discover the latent semantic hierarchy of training data and to deploy the hierarchy to provide richer and more fine-grained supervision than the inter-class separability induced by common metric learning losses. HIER achieves this goal with no annotation for the semantic hierarchy, by learning hierarchical proxies in hyperbolic space. The hierarchical proxies are learnable parameters, and each of them is trained to serve as an ancestor of a group of data or other proxies, so as to approximate the semantic hierarchy among them. HIER handles the proxies together with the data in hyperbolic space, since the geometric properties of this space are well suited to representing their hierarchical structure. The efficacy of HIER was evaluated on four standard benchmarks, where it consistently improved the performance of conventional methods when integrated with them and consequently achieved the best records, surpassing even the existing hyperbolic metric learning technique, in almost all settings.
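Deep metric learning (DML) learns an embedding function that projects semantically similar data into nearby regions of the embedding space, and it plays a vital role in many applications such as image retrieval and face recognition. However, the performance of DML methods often depends heavily on the sampling method used to choose effective data from the embedding space during training. In practice, the embeddings are obtained from deep models, and the embedding space often contains barren areas due to the absence of training points, which results in the so-called "missing embedding" issue. This issue may impair sample quality and thus degrade DML performance. In this work, we investigate how to alleviate the "missing embedding" issue in order to improve sampling quality and achieve effective DML. To this end, we propose a Densely-Anchored Sampling (DAS) scheme that treats each embedding with a corresponding data point as an "anchor" and exploits the embedding space around the anchor to densely produce embeddings without data points. Specifically, we propose to exploit the embedding space around a single anchor with Discriminative Feature Scaling (DFS) and around multiple anchors with Memorized Transformation Shifting (MTS). In this way, by combining embeddings with and without data points, we are able to provide more embeddings to facilitate the sampling process and thereby boost the performance of DML. Our method integrates effortlessly into existing DML frameworks and improves them without bells and whistles. Extensive experiments on three benchmark datasets demonstrate the superiority of our method.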
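In this paper, we propose a robust sample generation scheme for constructing informative triplets. The proposed hard sample generation is a two-stage synthesis framework that produces hard samples through effective positive and negative sample generators in its two stages. The first stage applies a piecewise linear manipulation to anchor-positive pairs and improves the quality of the generated samples with a carefully designed conditional generative adversarial network that lowers the risk of mode collapse. The second stage uses an adaptive reverse metric constraint to generate the final hard samples. Extensive experiments on several benchmark datasets verify that our method outperforms existing hard sample generation algorithms. In addition, we find that combining our hard sample generation method with existing triplet mining strategies can further improve deep metric learning performance.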
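Deep metric learning (DML) aims to minimize the empirical expected loss of pairwise intra-/inter-class proximity violations in the embedding image. We relate DML to the feasibility problem of finite chance constraints. We show that the minimizer of proxy-based DML satisfies certain chance constraints, and that the worst-case behavior of proxy-based methods can be characterized by the radius of the smallest ball around a class proxy that covers the entire domain of the corresponding class samples, which suggests that multiple proxies per class help performance. To provide a scalable algorithm and to exploit more proxies, we consider the chance constraints implied by the minimizers of proxy-based DML instances and reformulate DML as finding a feasible point in the intersection of such constraints, which yields a problem that can be solved approximately by iterative projections. In short, we repeatedly train a proxy-based loss and re-initialize the proxies with the embeddings of deliberately selected new samples. We apply our method to well-accepted losses and evaluate it on four popular benchmark datasets for image retrieval. Outperforming the state of the art, our method consistently improves the performance of the losses it is applied to. Code is available at: https://github.com/yetigurbuz/ccp-dml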
Suffering from the extreme training data imbalance between seen and unseen classes, most existing state-of-the-art approaches fail to achieve satisfactory results on the challenging generalized zero-shot learning task. To circumvent the need for labeled examples of unseen classes, we propose a novel generative adversarial network (GAN) that synthesizes CNN features conditioned on class-level semantic information, offering a shortcut directly from a semantic descriptor of a class to a class-conditional feature distribution. Our proposed approach, pairing a Wasserstein GAN with a classification loss, is able to generate sufficiently discriminative CNN features to train softmax classifiers or any multimodal embedding method. Our experimental results demonstrate a significant boost in accuracy over the state of the art on five challenging datasets (CUB, FLO, SUN, AWA and ImageNet) in both the zero-shot learning and generalized zero-shot learning settings.
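During the training of networks for distance metric learning, the minimizers of typical loss functions can be regarded as "feasible points" that satisfy a set of constraints imposed by the training data. To this end, we reformulate the distance metric learning problem as finding a feasible point of a constraint set in which the embedding vectors of the training data satisfy the desired intra-class and inter-class proximity. The feasible set induced by the constraint set is expressed as an intersection of relaxed feasible sets that enforce the proximity constraints only for particular samples of the training data (one sample from each class). The feasible point problem is then solved approximately by performing alternating projections onto those feasible sets. This approach introduces a regularization term and leads to minimizing a typical loss function with a systematic batch set construction, in which the batches are constrained to contain the same sample from each class for a certain number of iterations. Moreover, these particular samples can be regarded as class representatives, which allows hard class mining to be used efficiently during batch construction. The proposed technique is applied to well-accepted losses and evaluated on the Stanford Online Products, CAR196 and CUB200-2011 datasets for image retrieval and clustering. Outperforming the state of the art, the proposed approach consistently improves the performance of the integrated loss functions at no additional computational cost, and hard negative class mining improves performance further.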
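The core numerical idea, alternating projections onto several sets to find a point in their intersection, can be shown on a toy problem. The sketch below uses two simple convex sets in the plane (a unit disk and a half-plane) as stand-ins; these sets and the function names are assumptions made for illustration and are unrelated to the actual constraint sets over embedding vectors described in the abstract.

```python
import math

def project_onto_disk(p, radius=1.0):
    """Closest point to p inside the disk of the given radius centered at 0."""
    norm = math.hypot(p[0], p[1])
    if norm <= radius:
        return p
    return (p[0] * radius / norm, p[1] * radius / norm)

def project_onto_half_plane(p, x_min=0.5):
    """Closest point to p inside the half-plane x >= x_min."""
    return (max(p[0], x_min), p[1])

def alternating_projections(p, steps=50):
    """Project onto each set in turn; for intersecting convex sets the
    iterates approach a point that lies in both."""
    for _ in range(steps):
        p = project_onto_disk(p)
        p = project_onto_half_plane(p)
    return p

point = alternating_projections((-3.0, 4.0))
other = alternating_projections((0.0, 5.0))
```

Each projection only needs to know about one set at a time, which is what makes the scheme attractive when the full intersection is hard to handle directly.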
Deep Metric Learning (DML) learns a non-linear semantic embedding from input data that brings similar pairs together while keeping dissimilar data away from each other. To this end, many different methods have been proposed over the last decade, with promising results in various applications. The success of a DML algorithm depends greatly on its loss function. However, no loss function is perfect, and each deals only with some aspects of an optimal similarity embedding. Besides, the generalizability of DML to unseen categories at test time is an important matter that is not considered by existing loss functions. To address these challenges, we propose novel approaches to combine different losses built on top of a shared deep feature extractor. The proposed ensemble of losses forces the deep model to extract features that are consistent with all losses. Since the selected losses are diverse and each emphasizes different aspects of an optimal semantic embedding, our combining methods yield a considerable improvement over any individual loss and generalize well to unseen categories. There is no limitation on the choice of loss functions, and our methods can work with any set of existing ones. Besides, they can optimize each loss function as well as its weight in an end-to-end paradigm, with no need to adjust any hyper-parameter. We evaluate our methods on some popular datasets from the machine vision domain in conventional Zero-Shot-Learning (ZSL) settings. The results are very encouraging and show that our methods outperform all baseline losses by a large margin on all datasets.
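Most deep metric learning (DML) methods adopt a strategy that forces all positive samples to be close together in the embedding space while keeping them away from negative samples. However, this strategy ignores the internal relationships among positive (negative) samples and often leads to overfitting, especially in the presence of hard samples and label errors. In this work, we propose a simple yet effective regularization, namely Listwise Self-Distillation (LSD), which progressively distills the model's own knowledge to adaptively assign a more appropriate distance target to each sample pair in a batch. LSD encourages smoother embeddings and information mining within positive (negative) samples, which mitigates overfitting and thus improves generalization. LSD can be integrated directly into general DML frameworks. Extensive experiments show that LSD consistently improves the performance of various metric learning methods on multiple datasets.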
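Zero-shot learning (ZSL) aims to recognize classes for which no visual sample is available at training time. To address this problem, one can rely on a semantic description of each class. A typical ZSL model learns a mapping between the visual samples of seen classes and the corresponding semantic descriptions, in order to apply it to unseen classes at test time. State-of-the-art approaches rely on generative models that synthesize visual features from the prototype of a class, so that a classifier can then be learned in a supervised manner. However, these approaches are usually biased towards the seen classes, whose visual instances are the only ones that can be matched to a given class prototype. We propose a regularization method that can be applied to any conditional generative ZSL method and that leverages only the semantic class prototypes. It learns to synthesize discriminative features for possible semantic descriptions that are not available at training time, that is, the unseen ones. The approach is evaluated on four datasets commonly used in the literature, in both inductive and transductive settings, with results on par with or above those of existing approaches.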
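Deep metric learning (DML) learns a mapping into an embedding space in which similar data are close together and dissimilar data are far apart. However, conventional proxy-based losses for DML have two problems: a gradient problem, and difficulty in handling real-world datasets that have multiple local centers. In addition, the performance metrics used for DML have issues with stability and flexibility. This paper proposes the multi-proxies anchor (MPA) loss and the normalized discounted cumulative gain (nDCG@k) metric. This study makes three contributions: (1) MPA loss can learn real-world datasets by using multiple proxies. (2) MPA loss improves the training capacity of a neural network and solves the gradient problem. (3) The nDCG@k metric encourages comprehensive evaluation on a variety of datasets. Finally, we demonstrate the effectiveness of MPA loss, which achieves the highest accuracy on two datasets of fine-grained images.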
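For readers unfamiliar with nDCG@k, a minimal sketch of the standard definition follows. It assumes a linear gain (each relevance value is divided by log2 of its rank plus one) and that the input list holds the relevance of every candidate in ranked order, so the ideal ranking can be obtained by sorting that same list; the exact variant used in the paper may differ.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )

def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# 1 marks a retrieved item of the query's class, 0 any other item.
perfect = ndcg_at_k([1, 1, 0, 0], k=4)
late_hits = ndcg_at_k([0, 0, 1, 1], k=4)
```

Unlike Recall@1, the score depends on where every relevant item appears within the top k, so a ranking with late hits scores lower than one with the same hits at the top.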
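Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to semantically related unseen classes, which are absent during training. Promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards seen classes. While existing meta generative approaches pursue a common model shared across task distributions, we aim to construct a generative network that adapts to task characteristics. To this end, we propose an attribute-modulated generative meta-model for zero-shot learning (Amaz). Our model consists of an attribute-aware modulation network, an attribute-augmented generative network, and an attribute-weighted classifier. Given unseen classes, the modulation network adaptively modulates the generator by applying task-specific transformations, so that the generative network can adapt to highly diverse tasks. The weighted classifier uses data quality to enhance the training procedure, further improving model performance. Our empirical evaluation on four widely used benchmarks shows that Amaz outperforms state-of-the-art methods in both ZSL and generalized ZSL settings, demonstrating the superiority of our method. Our experiments on a zero-shot image retrieval task show Amaz's ability to synthesize instances that portray real visual characteristics.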
Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller. Previous approaches can be expressed as a form of training the student to mimic the teacher's output activations for individual data examples. We introduce a novel approach, dubbed relational knowledge distillation (RKD), that transfers mutual relations of data examples instead. For concrete realizations of RKD, we propose distance-wise and angle-wise distillation losses that penalize structural differences in relations. Experiments conducted on different tasks show that the proposed method improves educated student models by a significant margin. In particular for metric learning, it allows students to outperform their teachers, achieving state-of-the-art results on standard benchmark datasets.
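A distance-wise relational loss of this kind can be sketched as follows. The sketch assumes that pairwise Euclidean distances are normalized by their mean within the batch and compared with a Huber penalty, and that the result is averaged over pairs; these choices and the function names are illustrative assumptions, not a statement of the authors' exact implementation.

```python
import math
from itertools import combinations

def normalized_pairwise_distances(embeddings):
    """Pairwise Euclidean distances divided by their mean (scale-free)."""
    dists = [math.dist(a, b) for a, b in combinations(embeddings, 2)]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists] if mean > 0 else dists

def huber(x, y, delta=1.0):
    """Huber penalty: quadratic for small differences, linear for large."""
    diff = abs(x - y)
    return 0.5 * diff * diff if diff <= delta else delta * (diff - 0.5 * delta)

def distance_wise_loss(teacher, student):
    """Penalize differences between teacher and student distance structure."""
    t = normalized_pairwise_distances(teacher)
    s = normalized_pairwise_distances(student)
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)

teacher = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
same_structure = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]   # rescaled copy
other_structure = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]  # collinear points
```

Because only relations between examples are compared, the teacher and student embeddings may have different dimensionality and scale, as in the toy data above.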
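Regularization-based methods help alleviate the catastrophic forgetting problem in class-incremental learning. In the absence of old task images, they usually assume that old knowledge is well preserved if the classifier produces similar outputs on new images. In this paper, we find that their effectiveness depends largely on the nature of the old classes: they work well on classes that are easily distinguishable from each other but may fail on more fine-grained ones, e.g., boy and girl. In spirit, such methods project new data onto the feature space spanned by the weight vectors in the fully connected layer that correspond to the old classes. The resulting projections are similar for fine-grained old classes, and as a consequence the new classifier gradually loses its discriminative ability on these classes. To address this issue, we propose a memory-free generative replay strategy that preserves the characteristics of fine-grained old classes by generating representative old images directly from the old classifier and combining them with new data for training the new classifier. To address the homogenization of the generated samples, we also propose a diversity loss that maximizes the Kullback-Leibler (KL) divergence between generated samples. Our method is best complemented by prior regularization-based methods, which have proved effective for easily distinguishable old classes. We validate the above design and insights on CUB-200-2011, CALTECH-101, CIFAR-100 and Tiny ImageNet, and show that our strategy outperforms existing memory-free methods by a clear margin. Code is available at https://github.com/xmengxin/mfgr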
A family of loss functions built on pair-based computation have been proposed in the literature which provide a myriad of solutions for deep metric learning. In this paper, we provide a general weighting framework for understanding recent pair-based loss functions. Our contributions are three-fold: (1) we establish a General Pair Weighting (GPW) framework, which casts the sampling problem of deep metric learning into a unified view of pair weighting through gradient analysis, providing a powerful tool for understanding recent pair-based loss functions; (2) we show that with GPW, various existing pair-based methods can be compared and discussed comprehensively, with clear differences and key limitations identified; (3) we propose a new loss called multi-similarity loss (MS loss) under the GPW, which is implemented in two iterative steps (i.e., mining and weighting). This allows it to fully consider three similarities for pair weighting, providing a more principled approach for collecting and weighting informative pairs. Finally, the proposed MS loss obtains new state-of-the-art performance on four image retrieval benchmarks, where it outperforms the most recent approaches, such as ABE [14] and HTL [4], by a large margin, e.g., , and 80.9% → 88.0% on In-Shop Clothes Retrieval dataset
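The weighting step of a multi-similarity style loss can be sketched as below. The sketch assumes a commonly used form in which positive and negative pair similarities enter soft, log-sum-exp style terms with scales `alpha` and `beta` around a threshold `lam`; the hyperparameter values are illustrative assumptions, and the mining step mentioned in the abstract is omitted, so every pair is used.

```python
import math

def ms_loss(sim, labels, alpha=2.0, beta=50.0, lam=0.5):
    """Soft-weighted pair loss over a similarity matrix (no pair mining)."""
    n = len(labels)
    total = 0.0
    for i in range(n):
        # Similarities of anchor i to same-class and different-class samples.
        pos = [sim[i][k] for k in range(n) if k != i and labels[k] == labels[i]]
        neg = [sim[i][k] for k in range(n) if labels[k] != labels[i]]
        # Low-similarity positives and high-similarity negatives dominate.
        pos_term = math.log1p(sum(math.exp(-alpha * (s - lam)) for s in pos)) / alpha
        neg_term = math.log1p(sum(math.exp(beta * (s - lam)) for s in neg)) / beta
        total += pos_term + neg_term
    return total / n

labels = [0, 0, 1, 1]
separated = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]
confused = [
    [1.0, 0.1, 0.9, 0.8],
    [0.1, 1.0, 0.8, 0.9],
    [0.9, 0.8, 1.0, 0.1],
    [0.8, 0.9, 0.1, 1.0],
]
```

With these toy matrices, the embedding whose same-class pairs are similar and cross-class pairs are dissimilar receives the smaller loss.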
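Deep metric learning algorithms aim to learn an effective embedding space that preserves the similarity relationships among the input data. Although these algorithms have achieved significant performance gains across a wide range of tasks, they fail to consider and incorporate comprehensive similarity constraints, and therefore learn a sub-optimal metric in the embedding space. Moreover, there have so far been few studies of their performance in the presence of noisy labels. Here, we address the concern of learning a discriminative deep embedding space by designing a novel and effective deep class-wise discrepancy loss (DCDL) function that considers each and every class. Our empirical results on three standard image classification datasets and two fine-grained image recognition datasets, with and without noise, clearly demonstrate the need to incorporate such class-wise similarity relationships alongside traditional algorithms when learning a discriminative embedding space.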
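Noisy labels are commonly found in real-world data and degrade the performance of deep neural networks. Cleaning data manually is labor-intensive and time-consuming. Previous research has mostly focused on making classification models robust to noisy labels, while the robustness of deep metric learning (DML) to noisy labels remains less well explored. In this paper, we bridge this important gap by proposing the Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML. PRISM calculates the probability that a label is clean and filters out potentially noisy samples. Specifically, we propose a new method, von Mises-Fisher Distribution Similarity (VMF-SIM), which calculates this probability by estimating a von Mises-Fisher (VMF) distribution for each data class. Compared with the existing average similarity method (AVGSIM), VMF-SIM considers the variance of each class in addition to the average similarity. With this design, the proposed approach can cope with challenging DML situations in which the majority of the samples are noisy. Extensive experiments on both synthetic and real-world noisy datasets show that the proposed approach achieves up to 8.37% higher Precision@1 within a reasonable training time.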