CV | Mar 10, 2025

Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization

arXiv:2503.07038v2 | 1 citation | h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the practical challenge of retrieving images that contain a specific small object in a cluttered scene. It is rated incremental because its framework, although new, builds on existing retrieval methods.

The paper tackles the problem of Small Object Image Retrieval (SoIR) by introducing Multi-object Attention Optimization (MaO), a framework that uses multi-object pre-training and attention-based feature extraction to create unified image descriptors, achieving significant improvements over existing methods in zero-shot and fine-tuning scenarios.

We address the challenge of Small Object Image Retrieval (SoIR), where the goal is to retrieve images containing a specific small object in a cluttered scene. The key challenge in this setting is constructing a single image descriptor, for scalable and efficient search, that effectively represents all objects in the image. In this paper, we first analyze the limitations of existing methods on this challenging task and then introduce new benchmarks to support SoIR evaluation. Next, we introduce Multi-object Attention Optimization (MaO), a novel retrieval framework which incorporates a dedicated multi-object pre-training phase. This is followed by a refinement process that leverages attention-based feature extraction with object masks, integrating them into a single unified image descriptor. Our MaO approach significantly outperforms existing retrieval methods and strong baselines, achieving notable improvements in both zero-shot and lightweight multi-object fine-tuning. We hope this work will lay the groundwork and inspire further research to enhance retrieval performance for this highly practical task. Code and data are available on our project page: https://pihash2k.github.io/findyourneedle.github.io
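To make the idea of a single image descriptor built from object masks more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' MaO implementation: the multi-object pre-training phase and the refinement step are not reproduced, and a simpler stand-in is used instead, namely mask-restricted, attention-weighted pooling of patch features followed by averaging over objects. The function names (`masked_descriptor`, `rank_gallery`) and the input shapes are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def masked_descriptor(patch_feats, attn, masks):
    """Pool patch features into one descriptor per image (illustrative only).

    patch_feats: (P, D) array, one feature vector per image patch.
    attn:        (P,) array of non-negative attention weights per patch.
    masks:       (K, P) binary array, masks[k, p] = 1 if patch p belongs to object k.
    """
    # Restrict the attention weights to each object's patches.
    weights = masks * attn[None, :]                                   # (K, P)
    weights = weights / (weights.sum(axis=1, keepdims=True) + 1e-12)  # rows sum to ~1
    # One attention-weighted feature per object, normalized so that
    # small objects contribute as much as large ones.
    object_feats = l2_normalize(weights @ patch_feats)                # (K, D)
    # Merge the per-object features into a single unit-length descriptor.
    return l2_normalize(object_feats.mean(axis=0))                    # (D,)


def rank_gallery(query_desc, gallery_descs):
    """Rank gallery images by cosine similarity to a unit-length query descriptor."""
    scores = gallery_descs @ query_desc          # (N,)
    return np.argsort(-scores), scores
```

Normalizing each object feature before averaging is a design choice of this sketch: without it, an object covering many patches would dominate the descriptor and the small object would be hard to retrieve, which is the failure mode the abstract describes.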

