CV · LG · ML · Jul 19, 2016

Information-theoretical label embeddings for large-scale image classification

arXiv:1607.05691v1 · 17 citations
Originality: Incremental advance
AI Analysis

The paper targets efficient, accurate large-scale image classification for applications such as content tagging and search. It appears incremental, since it builds on existing embedding and regression techniques.

The paper trains multi-label, multi-class image classification models by embedding high-dimensional sparse labels onto a lower-dimensional dense sphere and treating classification as cosine proximity regression on that sphere. On a dataset of 300 million images with 17,000 labels, this yields faster convergence and a 7% higher mean average precision than logistic regression.

We present a method for training multi-label, massively multi-class image classification models that is faster and more accurate than supervision via a sigmoid cross-entropy loss (logistic regression). Our method consists in embedding high-dimensional sparse labels onto a lower-dimensional dense sphere of unit-normed vectors, and treating the classification problem as a cosine proximity regression problem on this sphere. We test our method on a dataset of 300 million high-resolution images with 17,000 labels, where it yields considerably faster convergence, as well as a 7% higher mean average precision compared to logistic regression.
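The two steps named in the abstract (mapping sparse labels to unit-norm vectors, then regressing with a cosine proximity objective) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the embedding is built, so a random Gaussian projection `P` stands in for it here, and the helper names `embed_labels` and `cosine_proximity_loss`, the toy sizes, and the loss form `1 - cos` are all assumptions made for the example.

```python
import numpy as np

def embed_labels(Y, P):
    """Project multi-hot labels Y (n, num_labels) through P (num_labels, dim)
    and L2-normalize each row so every target lies on the unit sphere."""
    Z = Y @ P
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.maximum(norms, 1e-12)

def cosine_proximity_loss(pred, target):
    """Mean of (1 - cosine similarity) between predictions and unit-norm targets."""
    pred_norms = np.linalg.norm(pred, axis=1, keepdims=True)
    pred = pred / np.maximum(pred_norms, 1e-12)
    return float(np.mean(1.0 - np.sum(pred * target, axis=1)))

rng = np.random.default_rng(0)
num_labels, dim, n = 1000, 64, 8  # toy sizes, far smaller than the paper's setting

# Sparse multi-hot labels: three active labels per example (hypothetical data).
Y = np.zeros((n, num_labels))
for i in range(n):
    Y[i, rng.choice(num_labels, size=3, replace=False)] = 1.0

# ASSUMPTION: random projection as a stand-in for the paper's label embedding.
P = rng.standard_normal((num_labels, dim))
targets = embed_labels(Y, P)

# A model would output `dim`-dimensional vectors; here we only compare
# a perfect prediction with a random one.
loss_perfect = cosine_proximity_loss(targets, targets)
loss_random = cosine_proximity_loss(rng.standard_normal((n, dim)), targets)
```

In a setup like this the network predicts a dense `dim`-dimensional vector instead of one sigmoid output per label, which is one plausible reading of why the regression target is cheaper to supervise than 17,000 independent logits.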

