CVAIIRLGSep 26, 2018

Hierarchy-based Image Embeddings for Semantic Image Retrieval

arXiv:1809.09924v4120 citations
Originality Incremental advance
AI Analysis

This work addresses the need for more semantically meaningful image representations in retrieval tasks, offering a deterministic method that could benefit domains like novelty detection or few-shot learning, though it is incremental as it builds on existing class hierarchy knowledge.

The paper tackles the problem of visual similarity not implying semantic similarity in image retrieval by mapping images onto class embeddings derived from a hierarchy like WordNet, resulting in large-margin improvements in semantic consistency across CIFAR-100, NABirds, and ImageNet datasets.

Deep neural networks trained for classification have been found to learn powerful image representations, which are also often used for other tasks such as comparing images w.r.t. their visual similarity. However, visual similarity does not imply semantic similarity. In order to learn semantically discriminative features, we propose to map images onto class embeddings whose pair-wise dot products correspond to a measure of semantic similarity between classes. Such an embedding does not only improve image retrieval results, but could also facilitate integrating semantics for other tasks, e.g., novelty detection or few-shot learning. We introduce a deterministic algorithm for computing the class centroids directly based on prior world-knowledge encoded in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds, and ImageNet show that our learned semantic image embeddings improve the semantic consistency of image retrieval results by a large margin.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes