CV | AI | LG | Jan 3, 2024

Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition

arXiv:2401.01482v2 | 6 citations | h-index: 18 | CVPR
AI Analysis

This work addresses geographical domain shift in object recognition, which matters for applications deployed globally. It is an incremental improvement over prompting baselines, with gains of a few accuracy points on unseen geographies.

The paper addresses the lack of robustness of object recognition models across geographical regions by incorporating geographically diverse knowledge into CLIP prompting. Trained only on Europe data from DollarStreet, the approach improves accuracy over prompting baselines by up to +2.8/1.2/1.6 on target data from Africa/Asia/Americas and by +4.6 overall on the hardest classes.

Existing object recognition models have been shown to lack robustness in diverse geographical scenarios due to domain shifts in design and context. Class representations need to be adapted to more accurately reflect an object concept under these shifts. In the absence of training data from target geographies, we hypothesize that geographically diverse descriptive knowledge of categories can enhance robustness. For this purpose, we explore the feasibility of probing a large language model for geography-based object knowledge, and we examine the effects of integrating knowledge into zero-shot and learnable soft prompting with CLIP. Within this exploration, we propose geography knowledge regularization to ensure that soft prompts trained on a source set of geographies generalize to an unseen target set. Accuracy gains over prompting baselines on DollarStreet while training only on Europe data are up to +2.8/1.2/1.6 on target data from Africa/Asia/Americas, and +4.6 overall on the hardest classes. Competitive performance is shown vs. few-shot target training, and analysis is provided to direct future study of geographical robustness.
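The zero-shot idea in the abstract can be illustrated with a minimal sketch. It assumes each class has several text descriptors gathered for different geographies and that a text encoder has already embedded them; the tiny 3-dimensional vectors below are made-up stand-ins for CLIP embeddings, and the function names are hypothetical. The abstract does not give the form of the geography knowledge regularizer, so the cosine-distance penalty shown here is an assumption for illustration, not the authors' formulation.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def class_prototype(descriptor_embeddings):
    """Average the unit-normalized descriptor embeddings of one class, then renormalize."""
    unit = [normalize(e) for e in descriptor_embeddings]
    dim = len(unit[0])
    mean = [sum(e[i] for e in unit) / len(unit) for i in range(dim)]
    return normalize(mean)

def cosine(a, b):
    """Cosine similarity of two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def classify(image_embedding, prototypes):
    """Return the class whose prototype is most similar to the image, plus all scores."""
    scores = {name: cosine(image_embedding, p) for name, p in prototypes.items()}
    return max(scores, key=scores.get), scores

def knowledge_regularizer(soft_prompt_embedding, knowledge_prototype):
    """Assumed penalty: cosine distance between a learned prompt and the knowledge prototype."""
    return 1.0 - cosine(soft_prompt_embedding, knowledge_prototype)

# Toy stand-ins for text embeddings of geography-specific descriptors (not real CLIP output).
descriptors = {
    "stove": [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]],
    "toilet": [[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]],
}
prototypes = {name: class_prototype(embs) for name, embs in descriptors.items()}

image = [0.5, 0.9, 0.1]  # toy image embedding
predicted, scores = classify(image, prototypes)
print(predicted, scores)
```

In a soft-prompt setting, a term like `knowledge_regularizer` would be added to the classification loss so that prompts trained on source geographies stay close to the geo-diverse class description; how the paper weights or defines that term is not stated in the abstract.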

