CV · Mar 17, 2025

All You Need to Know About Training Image Retrieval Models

arXiv:2503.13045v1 · 2 citations · h-index: 11 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work offers practical guidelines for computer vision researchers and practitioners who want to optimize image retrieval models. It is incremental, since it synthesizes existing knowledge rather than introducing new methods.

The authors conducted extensive experiments to analyze the impact of various training factors on image retrieval accuracy, identifying best practices that generalize across multiple datasets.

Image retrieval is the task of finding the images in a database that are most similar to a given query image. The performance of an image retrieval pipeline depends on many training-time factors, including the embedding model architecture, loss function, data sampler, mining function, learning rate(s), and batch size. In this work, we perform tens of thousands of training runs to understand the effect each of these factors has on retrieval accuracy. We also identify best practices that hold across multiple datasets. The code is available at https://github.com/gmberton/image-retrieval
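To make the task definition concrete, here is a minimal sketch of retrieval over precomputed embeddings. It is not taken from the paper's repository. It assumes cosine similarity as the similarity measure and Recall@k as the accuracy metric, and the function names, toy vectors, and labels are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve(query, database, k=1):
    """Return indices of the k database embeddings most similar to the query.

    Similarity is cosine similarity (an assumption for this sketch).
    Ties are broken by the lower database index.
    """
    q = l2_normalize(query)
    scored = []
    for idx, emb in enumerate(database):
        d = l2_normalize(emb)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


def recall_at_k(queries, query_labels, database, db_labels, k=1):
    """Fraction of queries with at least one same-label item in their top k."""
    hits = 0
    for query, label in zip(queries, query_labels):
        top = retrieve(query, database, k)
        if any(db_labels[i] == label for i in top):
            hits += 1
    return hits / len(queries)


# Toy 2-D embeddings; a real pipeline would obtain these from the trained model.
database = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
db_labels = ["cat", "dog", "fox"]
queries = [[0.9, 0.1], [0.1, 0.8]]
query_labels = ["cat", "dog"]

top2 = retrieve(queries[0], database, k=2)
r_at_1 = recall_at_k(queries, query_labels, database, db_labels, k=1)
print(top2, r_at_1)
```

In this framing, the training-time factors listed in the abstract (architecture, loss, sampler, mining function, learning rates, batch size) only change how the embeddings are produced. The retrieval and evaluation steps stay the same, which is what makes it possible to compare training runs.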

Code Implementations: 1 repo
