CV · Jan 17, 2023

Distribution Aligned Feature Clustering for Zero-Shot Sketch-Based Image Retrieval

arXiv:2301.06685v1 · 2 citations · h-index: 14
Originality: Highly original
AI Analysis

This work addresses the problem of cross-modal retrieval for sketch-based image search in zero-shot settings, offering a novel approach with substantial performance gains.

The paper tackles zero-shot sketch-based image retrieval by clustering the gallery images and using the cluster centroids as proxies for retrieval, together with a distribution alignment loss that reduces the domain gap between sketches and images. It reports relative gains in mAP@all of up to 31% and 39% on the Sketchy and TU-Berlin datasets, respectively.

Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a challenging cross-modal retrieval task. In prior work, retrieval is performed by ranking gallery images by their distance to the query sketch. However, the domain gap and the zero-shot setting make it hard for neural networks to generalize. This paper tackles the challenges from a new perspective: utilizing gallery image features. We propose a Cluster-then-Retrieve (ClusterRetri) method that performs clustering on the gallery images and uses the cluster centroids as proxies for retrieval. Furthermore, a distribution alignment loss is proposed to align the image and sketch features with a common Gaussian distribution, reducing the domain gap. Despite its simplicity, our proposed method outperforms state-of-the-art methods by a large margin on popular datasets, e.g., up to 31% and 39% relative improvement in mAP@all on the Sketchy and TU-Berlin datasets.
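
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the function names (`kmeans`, `cluster_then_retrieve`, `gaussian_alignment_loss`), plain Lloyd's k-means as the clustering step, the tie-break by direct distance inside a cluster, and a KL divergence to a standard normal as one possible form of a loss that pulls features toward a common Gaussian. The abstract does not specify any of these details, and the toy 2-D vectors stand in for learned features.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


def cluster_then_retrieve(query, gallery, k, seed=0):
    """Rank gallery indices using each image's cluster centroid as its proxy.

    Assumption for this sketch: images are ordered by query-to-centroid
    distance first, then by direct query-to-image distance within a cluster.
    """
    centroids, assign = kmeans(gallery, k, seed=seed)

    def key(i):
        return (sq_dist(query, centroids[assign[i]]), sq_dist(query, gallery[i]))

    return sorted(range(len(gallery)), key=key)


def gaussian_alignment_loss(features, eps=1e-8):
    """KL( N(mu, diag(var)) || N(0, I) ), with mu and var fitted per dimension.

    Assumption: one illustrative way to pull a batch toward a common Gaussian.
    """
    n = len(features)
    loss = 0.0
    for col in zip(*features):
        mu = sum(col) / n
        var = sum((x - mu) ** 2 for x in col) / n + eps
        loss += 0.5 * (var + mu * mu - 1.0 - math.log(var))
    return loss


# Toy gallery: two well-separated groups of 2-D "image features".
gallery = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11]]
query = [1, 1]  # a "sketch feature" lying near the first group
ranking = cluster_then_retrieve(query, gallery, k=2)
print(ranking)
```

In a training setup along these lines, `gaussian_alignment_loss` would be evaluated on an image batch and on a sketch batch separately and the two values summed, so that both modalities are pulled toward the same target distribution. At test time only `cluster_then_retrieve` is needed.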
