CV, Nov 20, 2015

Semantic Diversity versus Visual Diversity in Visual Dictionaries

arXiv:1511.06704v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of creating efficient visual dictionaries for large image collections. It suggests that the cost of dictionary generation can be reduced by learning from a visually diverse subset of images instead of the full collection.

The paper investigated whether semantic diversity or visual diversity is more important for building visual dictionaries in bag-of-visual-words models, finding that visual diversity is more critical for image classification performance.

Visual dictionaries are a critical component of image classification/retrieval systems based on the bag-of-visual-words (BoVW) model. Dictionaries are usually learned without supervision from a training set of images sampled from the collection of interest. However, for large, general-purpose, dynamic image collections (e.g., the Web), obtaining a sample that is representative in terms of semantic concepts is not straightforward. In this paper, we evaluate the impact of semantics on dictionary quality, aiming to verify the importance of semantic diversity in relation to visual diversity for visual dictionaries. In the experiments, we vary the number of classes used for creating the dictionary and then compute different BoVW descriptors, using multiple codebook sizes and different coding and pooling methods (standard BoVW and Fisher Vectors). Results for image classification show that, because visual dictionaries are based on low-level visual appearance, visual diversity is more important than semantic diversity. Our conclusions open the opportunity to alleviate the burden of generating visual dictionaries, since we need only a visually diverse set of images, rather than the whole collection, to create a good dictionary.
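The experimental protocol described in the abstract (learn a dictionary from images of a chosen subset of classes, then encode images against it) can be illustrated with a minimal sketch. This is not the authors' code. It assumes plain k-means for dictionary learning, hard assignment with sum pooling for coding, and random toy vectors in place of real local descriptors. The function names `learn_dictionary`, `bovw_histogram`, and `dictionary_from_classes` are hypothetical, and the Fisher Vector variant is not covered.

```python
import numpy as np


def learn_dictionary(descriptors, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns k codewords (assumed learner)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct descriptors picked at random.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every descriptor to every center.
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster goes empty
                centers[j] = members.mean(axis=0)
    return centers


def bovw_histogram(descriptors, centers):
    """Hard assignment plus sum pooling, L1-normalised."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d2.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()


def dictionary_from_classes(descs_by_class, classes, k, seed=0):
    """Learn a dictionary using descriptors from the selected classes only."""
    pooled = np.vstack([descs_by_class[c] for c in classes])
    return learn_dictionary(pooled, k, seed=seed)


# Toy stand-in for local descriptors: 5 "classes", 200 vectors of dimension 8 each.
rng = np.random.default_rng(42)
descs_by_class = {c: rng.normal(loc=c, scale=1.0, size=(200, 8)) for c in range(5)}

# Vary how many classes contribute to the dictionary, then encode class 4.
for n_classes in (1, 3, 5):
    centers = dictionary_from_classes(descs_by_class, range(n_classes), k=16)
    hist = bovw_histogram(descs_by_class[4], centers)
    print(n_classes, centers.shape, hist.shape)
```

In an actual study, the histograms produced under each dictionary would be fed to a classifier and the accuracies compared across the number of contributing classes and across codebook sizes `k`.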

