CV · May 22, 2025

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching

arXiv:2505.16778v1 · 5 citations · h-index: 19 · CVPR
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of generalizing few-shot counting to unseen domains, an incremental advance in the field.

The paper tackles the problem of domain shift in few-shot counting by proposing a single domain generalization model that uses universal vision-language representations to improve robustness. The model achieves state-of-the-art performance in both in-domain and domain generalization settings.

Few-shot counting estimates the number of target objects in an image using only a few annotated exemplars. However, domain shift severely hinders existing methods from generalizing to unseen scenarios. This problem falls into the realm of single domain generalization, which remains unexplored in few-shot counting. To solve it, we begin by analyzing the main limitations of current methods, which typically follow a standard pipeline that extracts object prototypes from the exemplars and then matches them with image features to construct a correlation map. We argue that existing methods overlook the significance of learning highly generalized prototypes. Building on this insight, we propose the first single domain generalization few-shot counting model, Universal Representation Matching, termed URM. Our primary contribution is the discovery that incorporating universal vision-language representations distilled from a large-scale pretrained vision-language model into the correlation construction process substantially improves robustness to domain shifts without compromising in-domain performance. As a result, URM achieves state-of-the-art performance in both the in-domain setting and the newly introduced domain generalization setting.
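
The standard pipeline the abstract describes (pool a prototype from each exemplar, then match it against the image features to build a correlation map) can be illustrated with a minimal NumPy sketch. This is a generic illustration, not URM's implementation: the function names `extract_prototype` and `correlation_map`, the average pooling, and the cosine-similarity matching are all assumptions made for the example, and the distilled vision-language representations that URM adds are not modeled here.

```python
import numpy as np


def extract_prototype(feat, box):
    """Average-pool the features inside one exemplar box.

    feat: (C, H, W) feature map; box: (x0, y0, x1, y1) in feature-map
    coordinates, end-exclusive. Hypothetical helper for illustration.
    """
    x0, y0, x1, y1 = box
    region = feat[:, y0:y1, x0:x1]                      # (C, h, w)
    return region.reshape(feat.shape[0], -1).mean(axis=1)  # (C,)


def correlation_map(feat, prototypes, eps=1e-8):
    """Cosine similarity between each prototype and every location.

    feat: (C, H, W); prototypes: (K, C). Returns (K, H, W).
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return (protos @ flat).reshape(-1, h, w)


# Toy feature map: background lives on channel 0, objects on channel 1.
feat = np.zeros((4, 8, 8))
feat[0] = 1.0
for (y, x) in [(2, 2), (5, 6)]:
    feat[:, y, x] = [0.0, 1.0, 0.0, 0.0]

# One exemplar box around the first object.
proto = extract_prototype(feat, (2, 2, 3, 3))
corr = correlation_map(feat, proto[None, :])
```

In this toy example the correlation map responds strongly at both object locations, including the one that was never annotated, and stays near zero on the background. A counting head would then regress a density map from such correlation maps; the abstract's argument is that the quality of this matching under domain shift depends on how well the prototypes generalize.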

Code Implementations: 1 repo