Deep embeddings answer one simple question: How similar are two images? Learning these embeddings is the bedrock of verification, zero-shot learning, and visual search. The most prominent approaches optimize a deep convolutional network with a suitable loss function, such as contrastive loss or triplet loss. While a rich line of work focuses solely on the loss functions, we show in this paper that selecting training examples plays an equally important role. We propose distance weighted sampling, which selects more informative and stable examples than traditional approaches. In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions. We evaluate our approach on the Stanford Online Products, CAR196, and the CUB200-2011 datasets for image retrieval and clustering, and on the LFW dataset for face verification. Our method achieves state-of-the-art performance on all of them.