State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models. Yet, ImageNet is now nearly ten years old and is by modern standards "small". Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger. The reasons are obvious: such datasets are difficult to collect and annotate. In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. Our experiments demonstrate that training for large-scale hashtag prediction leads to excellent results. We show improvements on several image classification and object detection tasks, and report the highest ImageNet-1k single-crop, top-1 accuracy to date: 85.4% (97.6% top-5). We also perform extensive experiments that provide novel empirical data on the relationship between large-scale pretraining and transfer learning performance.

| Name template | Description |
| --- | --- |
| train-IG-I-1.5k | Instagram training set of I images and ∼1.5k hashtags from ImageNet-1k. |
| train-IG-I-8.5k | Instagram training set of I images and ∼8.5k hashtags from WordNet. |
| train-IG-I-17k | Instagram training set of I images and ∼17k hashtags from WordNet. |
| train-IN-1M-1k | The standard ImageNet-1k ILSVRC training set with 1.28M images. |
| val-IN-50k-1k | The standard ImageNet-1k ILSVRC validation set with 50k images. |
| train-IN-I-L | Extended ImageNet training set of I images and L ∈ {5k, 9k} labels. |
| val-IN-I-L | Extended ImageNet validation set of I images and L ∈ {5k, 9k} labels. |
| train-CUB-6k-200 | The Caltech-UCSD Birds-200-2011 training set. |
| val-CUB-6k-200 | The Caltech-UCSD Birds-200-2011 validation set. |
| train-Places-1.8M-365 | The Places365-Standard training set (high-resolution version). |
| val-Places-37k-365 | The Places365-Standard validation set (high-resolution version). |
| train-COCO-135k-80 | The standard COCO detection training set (2017 version). |
| val-COCO-5k-80 | The standard COCO detection validation set (2017 version). |
| test-COCO-20k-80 | The standard COCO detection test-dev set (2017 version). |

Table 1: Summary of image classification datasets. Each dataset is named with a template, role-source-I-L, that indicates its role (training, validation, testing), source, number of images I, and number of labels L.
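To make the role-source-I-L naming template concrete, the following is a minimal Python sketch of how such a name could be split into its four fields. It is an illustration only and not part of the paper: the names `DatasetName`, `parse_count`, and `parse_dataset_name` are hypothetical, the `B` (billions) suffix is an assumption that does not appear in Table 1, and the parsed counts are the nominal, rounded values encoded in the name (for example, `1M` for the 1.28M-image ImageNet-1k training set).

```python
from dataclasses import dataclass
from typing import Optional

# Magnitude suffixes used in the names. "k" and "M" appear in Table 1;
# "B" is an assumption added for billion-scale image counts.
_SUFFIXES = {"k": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(token: str) -> Optional[int]:
    """Convert '1.5k', '1M' or '365' to an int.

    The literal placeholders 'I' and 'L' (an unspecified number of
    images or labels) are returned as None.
    """
    if token in ("I", "L"):
        return None
    multiplier = _SUFFIXES.get(token[-1])
    if multiplier is not None:
        return int(round(float(token[:-1]) * multiplier))
    return int(token)


@dataclass(frozen=True)
class DatasetName:
    role: str               # "train", "val" or "test"
    source: str             # e.g. "IG", "IN", "CUB", "Places", "COCO"
    images: Optional[int]   # nominal image count, None if left as "I"
    labels: Optional[int]   # nominal label count, None if left as "L"


def parse_dataset_name(name: str) -> DatasetName:
    """Split a role-source-I-L name into its four fields."""
    parts = name.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected role-source-I-L, got {name!r}")
    role, source, images, labels = parts
    if role not in {"train", "val", "test"}:
        raise ValueError(f"unknown role {role!r} in {name!r}")
    return DatasetName(role, source, parse_count(images), parse_count(labels))
```

For example, `parse_dataset_name("val-Places-37k-365")` yields the role `val`, the source `Places`, 37000 images, and 365 labels, while `parse_dataset_name("train-IG-I-17k")` leaves the image count as `None` because the template keeps `I` unspecified.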