CV · Sep 8, 2025

Back To The Drawing Board: Rethinking Scene-Level Sketch-Based Image Retrieval

arXiv:2509.06566v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work improves cross-modal retrieval from noisy free-hand sketches, but the advance is incremental: it refines training design rather than introducing a new paradigm.

The paper tackles scene-level sketch-based image retrieval by addressing the inherent ambiguity and noise in real-world sketches, and it reports state-of-the-art performance on the FS-COCO and SketchyCOCO datasets.

The goal of Scene-level Sketch-Based Image Retrieval (SBIR) is to retrieve natural images matching the overall semantics and spatial layout of a free-hand sketch. Unlike prior work focused on architectural augmentations of retrieval models, we emphasize the inherent ambiguity and noise present in real-world sketches. This insight motivates a training objective that is explicitly designed to be robust to sketch variability. We show that with an appropriate combination of pre-training, encoder architecture, and loss formulation, it is possible to achieve state-of-the-art performance without introducing additional complexity. Extensive experiments on the challenging FS-COCO and widely used SketchyCOCO datasets confirm the effectiveness of our approach and underline the critical role of training design in cross-modal retrieval tasks, as well as the need to improve the evaluation scenarios of scene-level SBIR.

