Weakly Supervised Object Localization as Domain Adaption
This work addresses the challenge of localizing objects with only image-level labels, which is important for reducing annotation costs in computer vision, and it is incremental by applying domain adaptation to an existing WSOL framework.
The paper tackles the problem of weakly supervised object localization (WSOL) by modeling it as a domain adaptation task, resulting in a pipeline that outperforms state-of-the-art methods on multiple benchmarks.
Weakly supervised object localization (WSOL) focuses on localizing objects only with the supervision of image-level classification masks. Most previous WSOL methods follow the classification activation map (CAM) that localizes objects based on the classification structure with the multi-instance learning (MIL) mechanism. However, the MIL mechanism makes CAM only activate discriminative object parts rather than the whole object, weakening its performance for localizing objects. To avoid this problem, this work provides a novel perspective that models WSOL as a domain adaption (DA) task, where the score estimator trained on the source/image domain is tested on the target/pixel domain to locate objects. Under this perspective, a DA-WSOL pipeline is designed to better engage DA approaches into WSOL to enhance localization performance. It utilizes a proposed target sampling strategy to select different types of target samples. Based on these types of target samples, domain adaption localization (DAL) loss is elaborated. It aligns the feature distribution between the two domains by DA and makes the estimator perceive target domain cues by Universum regularization. Experiments show that our pipeline outperforms SOTA methods on multi benchmarks. Code are released at \url{https://github.com/zh460045050/DA-WSOL_CVPR2022}.