CV · May 30, 2025

Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors

arXiv:2505.24103v1 · 9 citations · h-index: 3 · Has Code · ICLR
Originality: Incremental advance
AI Analysis

This paper addresses the problem of locating actions and functions on objects, a task relevant to robotics and human-computer interaction. It is an incremental advance that enhances a baseline model with part-level semantic priors.

The paper tackles weakly supervised affordance grounding: a model is trained to identify affordance regions on objects using human-object interaction images and egocentric object images, without dense labels. The authors report a breakthrough improvement over existing methods.

In this work, we focus on the task of weakly supervised affordance grounding, where a model is trained to identify affordance regions on objects using human-object interaction images and egocentric object images without dense labels. Previous works are mostly built upon class activation maps, which are effective for semantic segmentation but may not be suitable for locating actions and functions. Leveraging recent advanced foundation models, we develop a supervised training pipeline based on pseudo labels. The pseudo labels are generated from an off-the-shelf part segmentation model, guided by a mapping from affordance to part names. Furthermore, we introduce three key enhancements to the baseline model: a label refining stage, a fine-grained feature alignment process, and a lightweight reasoning module. These techniques harness the semantic knowledge of static objects embedded in off-the-shelf foundation models to improve affordance learning, effectively bridging the gap between objects and actions. Extensive experiments demonstrate that the proposed model achieves a breakthrough improvement over existing methods. Our code is available at https://github.com/woyut/WSAG-PLSP.
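The pseudo-label step described in the abstract (part masks selected through an affordance-to-part-name mapping) can be illustrated with a minimal sketch. This is not the authors' implementation: the mapping entries, the `pseudo_label` helper, and the toy masks are all hypothetical, and a real pipeline would take its part masks from an off-the-shelf part segmentation model rather than hand-written lists.

```python
from typing import Dict, List

# A binary mask as nested lists: 1 marks a pixel that belongs to the part.
Mask = List[List[int]]

# Hypothetical affordance-to-part mapping, for illustration only.
# The mapping actually used in the paper is not reproduced here.
AFFORDANCE_TO_PARTS: Dict[str, List[str]] = {
    "hold": ["handle"],
    "cut": ["blade"],
}


def pseudo_label(affordance: str, part_masks: Dict[str, Mask]) -> Mask:
    """Union the masks of every part that the mapping ties to the affordance.

    `part_masks` stands in for the output of a part segmentation model
    (assumed: one equally sized binary mask per part name).
    """
    if not part_masks:
        raise ValueError("at least one part mask is required")
    first = next(iter(part_masks.values()))
    height, width = len(first), len(first[0])
    label = [[0] * width for _ in range(height)]
    # Parts that are unmapped or missing from the segmentation are skipped.
    for name in AFFORDANCE_TO_PARTS.get(affordance, []):
        mask = part_masks.get(name)
        if mask is None:
            continue
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    label[r][c] = 1
    return label


# Toy 2x4 "knife": handle on the left half, blade on the right half.
knife_parts = {
    "handle": [[1, 1, 0, 0], [1, 1, 0, 0]],
    "blade": [[0, 0, 1, 1], [0, 0, 1, 1]],
}
hold_label = pseudo_label("hold", knife_parts)
cut_label = pseudo_label("cut", knife_parts)
```

In this sketch an affordance with no mapped part yields an all-zero mask. How such cases are handled in the paper, and how the label refining stage later adjusts these masks, is not specified in the abstract.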

Code Implementations: 1 repo