CV · Apr 1, 2022

Bridging the Gap between Classification and Localization for Weakly Supervised Object Localization

arXiv:2204.00220v1 · 52 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses the challenge of localizing entire objects using only image-level labels, which matters for applications such as image analysis and annotation. It is an incremental improvement over prior CAM-based methods.

The paper tackles weakly supervised object localization, where existing methods based on class activation maps (CAM) identify only the most discriminative parts of an object. It proposes aligning feature directions with class-specific weights and reports state-of-the-art performance on the CUB-200-2011 and ImageNet-1K benchmarks.

Weakly supervised object localization aims to find a target object region in a given image with only weak supervision, such as image-level labels. Most existing methods use a class activation map (CAM) to generate a localization map; however, a CAM identifies only the most discriminative parts of a target object rather than the entire object region. In this work, we find the gap between classification and localization in terms of the misalignment of the directions between an input feature and a class-specific weight. We demonstrate that the misalignment suppresses the activation of CAM in areas that are less discriminative but belong to the target object. To bridge the gap, we propose a method to align feature directions with a class-specific weight. The proposed method achieves a state-of-the-art localization performance on the CUB-200-2011 and ImageNet-1K benchmarks.
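
The role of direction misalignment can be illustrated with a small numerical sketch. It assumes the standard CAM setup (a linear classifier applied to a convolutional feature map), in which the activation at each location is the dot product of the local feature and the class weight, and so factors into the two norms times their cosine similarity. The function name `cam_decomposition`, the array shapes, and the random inputs are hypothetical, chosen only for illustration; this is not the paper's alignment method, only the decomposition that motivates it.

```python
import numpy as np

def cam_decomposition(features, class_weight, eps=1e-8):
    """Split a CAM into feature norm and cosine similarity.

    features:     (H, W, C) feature map from the last conv layer (assumed layout)
    class_weight: (C,) classifier weight for the target class
    """
    # CAM value at each location: dot product of feature and class weight.
    cam = features @ class_weight                      # (H, W)
    # Magnitude of the feature vector at each location.
    feat_norm = np.linalg.norm(features, axis=-1)      # (H, W)
    w_norm = np.linalg.norm(class_weight)
    # Direction agreement between each feature and the class weight.
    cos_sim = cam / (feat_norm * w_norm + eps)         # (H, W)
    return cam, feat_norm, cos_sim

# Hypothetical random inputs standing in for real network activations.
rng = np.random.default_rng(0)
features = rng.normal(size=(7, 7, 16))
class_weight = rng.normal(size=16)

cam, feat_norm, cos_sim = cam_decomposition(features, class_weight)
```

Under this reading, a location inside the object can have a large feature norm and still receive a low CAM value if its cosine similarity with the class weight is small, which is the suppression effect the abstract describes.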

