CV · Jun 15, 2021

Compositional Sketch Search

arXiv:2106.08009v1
Originality: Highly original
AI Analysis

This work addresses a limitation of existing sketch-based image retrieval methods, which typically handle only a single object, by enabling compositional queries in image search applications.

The paper tackles the problem of searching image collections using free-hand sketches that describe multiple objects and their spatial relationships. Its method encodes sketched objects and their compositions into a metric search embedding for efficient visual search.

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects. Sketch-based image retrieval (SBIR) methods predominantly match queries containing a single, dominant object, invariant to its position within an image. Our work exploits drawings as a concise and intuitive representation for specifying entire scene compositions. We train a convolutional neural network (CNN) to encode masked visual features from sketched objects, pooling these into a spatial descriptor encoding the spatial relationships and appearances of objects in the composition. Training the CNN backbone as a Siamese network under triplet loss yields a metric search embedding for measuring compositional similarity, which may be efficiently leveraged for visual search by applying product quantization.
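To make the training objective mentioned in the abstract more concrete, here is a minimal sketch of a triplet loss over embedding vectors, written in plain NumPy. The squared Euclidean distance, the margin value of 0.2, and the function names `l2_normalize` and `triplet_loss` are assumptions chosen for illustration; the abstract does not specify the paper's distance, margin, or triplet sampling strategy.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit length (eps guards against division by zero)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge triplet loss for embeddings of shape [batch, dim].

    anchor:   embeddings of the query (e.g. a sketched composition)
    positive: embeddings of items that should match the anchor
    negative: embeddings of items that should not match the anchor
    margin:   hypothetical value; the required gap between the two distances
    """
    # Squared Euclidean distances (an assumed choice of metric).
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    # Penalize triplets where the negative is not at least `margin` farther away.
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = l2_normalize(rng.normal(size=(4, 8)))
    p = l2_normalize(a + 0.05 * rng.normal(size=(4, 8)))  # near the anchors
    n = l2_normalize(rng.normal(size=(4, 8)))             # unrelated vectors
    print("triplet loss:", triplet_loss(a, p, n))
```

The loss is zero once every negative lies at least `margin` farther from its anchor than the positive does, which is what makes distances in the learned space usable as a ranking score at search time.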

Code Implementations: 1 repo
