GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild
This work addresses the problem of detailed object understanding for computer vision applications, but it appears incremental as it builds on existing datasets and tasks.
The paper tackles the challenging task of part semantic segmentation in the wild by proposing a framework that combines object-level context conditioning and part-level spatial relationships, achieving state-of-the-art results on the Pascal-Part dataset.
Semantic segmentation of object parts in the wild is a challenging task in which multiple object instances, and multiple parts within each of them, must be detected in the scene. The problem remains largely unexplored despite its fundamental importance for detailed object understanding. In this work, we propose a novel framework that combines higher object-level context conditioning with part-level spatial relationships to address the task. To tackle object-level ambiguity, a class-conditioning module is introduced to retain class-level semantics when learning part-level semantics, so that mid-level features also carry this information prior to the decoding stage. To tackle part-level ambiguity and localization, we propose a novel adjacency graph-based module that aims at matching the relative spatial relationships between ground-truth and predicted parts. Experimental evaluation on the Pascal-Part dataset shows that we achieve state-of-the-art results on this task.
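The idea of matching relative spatial relationships between ground-truth and predicted parts can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not specify how the adjacency graph is built or compared, so the construction below (dilate each part mask, count overlapping pixels, row-normalize, take a mean squared difference) is an assumption, and the names `dilate`, `adjacency_matrix`, and `graph_matching_loss` are hypothetical. It also works on hard label maps for readability, whereas a trainable version would need a differentiable formulation over soft predictions.

```python
import numpy as np


def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    # OR together the nine shifted copies of the mask.
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def adjacency_matrix(label_map, num_parts):
    """Symmetric matrix whose (i, j) entry counts pixels where the
    dilated masks of parts i and j overlap (assumed proximity measure)."""
    masks = [dilate(label_map == k) for k in range(num_parts)]
    adj = np.zeros((num_parts, num_parts), dtype=np.float64)
    for i in range(num_parts):
        for j in range(i + 1, num_parts):
            overlap = np.logical_and(masks[i], masks[j]).sum()
            adj[i, j] = adj[j, i] = overlap
    return adj


def row_normalize(adj):
    """Divide each row by its sum; rows that sum to zero stay zero."""
    sums = adj.sum(axis=1, keepdims=True)
    return np.divide(adj, sums, out=np.zeros_like(adj), where=sums > 0)


def graph_matching_loss(gt, pred, num_parts):
    """Mean squared difference between the row-normalized adjacency
    matrices of the ground-truth and predicted part label maps."""
    a_gt = row_normalize(adjacency_matrix(gt, num_parts))
    a_pred = row_normalize(adjacency_matrix(pred, num_parts))
    return float(np.mean((a_gt - a_pred) ** 2))


# Toy example: three parts laid out as vertical stripes 0 | 1 | 2.
gt = np.repeat(np.array([0, 0, 1, 1, 2, 2])[None, :], 4, axis=0)
# A prediction that swaps parts 1 and 2, changing which parts touch.
pred = np.repeat(np.array([0, 0, 2, 2, 1, 1])[None, :], 4, axis=0)
loss_same = graph_matching_loss(gt, gt, num_parts=3)
loss_swapped = graph_matching_loss(gt, pred, num_parts=3)
```

In this toy case the swapped prediction keeps the same number of pixels per part but changes which parts are neighbors, so a per-pixel count alone would not distinguish the two layouts while the adjacency comparison does.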