CV · Jul 4, 2025

On the rankability of visual embeddings

Ankit Sonthalia, Arnas Uselis, Seong Joon Oh

arXiv:2507.03683v111.84 citations · h-index: 5 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of efficiently ranking images by attribute, with applications in vector databases. The advance is incremental: it explores structure already present in existing embeddings.

The paper investigates whether visual embedding models inherently capture continuous attributes, such as age or aesthetics, along linear directions termed rank axes. It finds that many popular encoders are rankable across datasets, and that minimal supervision is enough to recover meaningful axes.

We study whether visual embedding models capture continuous, ordinal attributes along linear directions, which we term _rank axes_. We define a model as _rankable_ for an attribute if projecting embeddings onto such an axis preserves the attribute's order. Across 7 popular encoders and 9 datasets with attributes like age, crowd count, head pose, aesthetics, and recency, we find that many embeddings are inherently rankable. Surprisingly, a small number of samples, or even just two extreme examples, often suffice to recover meaningful rank axes, without full-scale supervision. These findings open up new use cases for image ranking in vector databases and motivate further study into the structure and learning of rankable embeddings. Our code is available at https://github.com/aktsonthalia/rankable-vision-embeddings.
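The idea of a rank axis recovered from two extreme examples can be illustrated with a minimal sketch. This is not the authors' code (see their repository for that); it assumes one simple reading of the abstract, namely that the axis is the normalized difference between the embeddings of the highest- and lowest-attribute samples, and that rankability is scored with Spearman rank correlation. The data is synthetic, and the helper names `rank_axis_from_extremes` and `spearman` are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_axis_from_extremes(low, high):
    """Unit vector pointing from a low-attribute embedding to a high-attribute one.

    Assumption: this difference-of-extremes construction is one plausible way
    to obtain a rank axis; the paper may define it differently.
    """
    diff = [h - l for l, h in zip(low, high)]
    norm = dot(diff, diff) ** 0.5
    return [d / norm for d in diff]


def ranks(values):
    """Rank positions (0-based) of each value; assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Synthetic stand-in for real embeddings: the attribute (e.g. age) varies
# along one hidden direction, plus small isotropic noise.
rng = random.Random(0)
dim, n_samples = 16, 200
hidden_direction = [rng.gauss(0, 1) for _ in range(dim)]
attribute = [rng.uniform(0, 100) for _ in range(n_samples)]
embeddings = [
    [a / 100 * h + rng.gauss(0, 0.05) for h in hidden_direction]
    for a in attribute
]

# Use only the two extreme samples to estimate the axis.
lowest = min(range(n_samples), key=attribute.__getitem__)
highest = max(range(n_samples), key=attribute.__getitem__)
axis = rank_axis_from_extremes(embeddings[lowest], embeddings[highest])

# Project every embedding onto the axis and check whether order is preserved.
scores = [dot(e, axis) for e in embeddings]
rho = spearman(scores, attribute)
print(f"Spearman correlation between projections and attribute: {rho:.3f}")
```

In this toy setup the attribute is linear in the embedding by construction, so a high correlation is expected. With real encoders, the same projection-and-correlate check would indicate how rankable an embedding is for a given attribute.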
