CV · Jul 8, 2014

Orientation covariant aggregation of local descriptors with embeddings

arXiv:1407.2170v2 · 38 citations
AI Analysis

This work addresses a specific issue in computer vision for image retrieval, offering an incremental improvement over existing methods.

The paper tackles excessive orientation invariance in image search systems built on local descriptors. It introduces an orientation-covariant aggregation strategy that encodes each patch's angle jointly with its descriptor, and reports that the strategy is effective on the standard Holidays and Oxford buildings benchmarks.

Image search systems based on local descriptors typically achieve orientation invariance by aligning the patches on their dominant orientations. Albeit successful, this choice introduces too much invariance because it does not guarantee that the patches are rotated consistently. This paper introduces an aggregation strategy of local descriptors that achieves this covariance property by jointly encoding the angle in the aggregation stage in a continuous manner. It is combined with an efficient monomial embedding to provide a codebook-free method to aggregate local descriptors into a single vector representation. Our strategy is also compatible and employed with several popular encoding methods, in particular bag-of-words, VLAD and the Fisher vector. Our geometric-aware aggregation strategy is effective for image search, as shown by experiments performed on standard benchmarks for image and particular object retrieval, namely Holidays and Oxford buildings.
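The aggregation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how the angle is encoded, so the Fourier-style `angle_features` mapping, the degree-2 choice in `monomial_embedding`, the Kronecker-product combination, and all function and parameter names are assumptions made for illustration. The sketch only shows one way an angle can be encoded continuously and combined with a codebook-free embedding before summing over patches.

```python
import numpy as np

def angle_features(theta, n_freq=3):
    """Assumed continuous angle encoding: [1, cos(k*theta)..., sin(k*theta)...], k = 1..n_freq."""
    ks = np.arange(1, n_freq + 1)
    return np.concatenate(([1.0], np.cos(ks * theta), np.sin(ks * theta)))

def monomial_embedding(x):
    """Degree-2 monomial embedding (assumed degree): all products x_i * x_j, flattened."""
    return np.outer(x, x).ravel()

def aggregate(descriptors, angles, n_freq=3):
    """Sum, over patches, of kron(angle encoding, descriptor embedding), then L2-normalize.

    descriptors: array of shape (n_patches, d)
    angles:      array of shape (n_patches,), dominant orientations in radians
    """
    d = descriptors.shape[1]
    out = np.zeros((2 * n_freq + 1) * d * d)
    for x, theta in zip(descriptors, angles):
        out += np.kron(angle_features(theta, n_freq), monomial_embedding(x))
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out

# Toy usage with random data (no real descriptors involved).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                    # 5 patches, 4-dimensional descriptors
angles = rng.uniform(0.0, 2.0 * np.pi, size=5)
v = aggregate(X, angles)
```

With this encoding, the inner product of two angle encodings depends only on the angle difference, so a pair of patches contributes to the image similarity according to how their orientations differ rather than with orientation discarded entirely. That is the sense in which the sketch keeps angle information in the aggregated vector.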
