IPSeg: Image Posterior Mitigates Semantic Drift in Class-Incremental Segmentation
This work addresses semantic drift in class-incremental semantic segmentation, a problem relevant to researchers and practitioners who work with non-stationary data streams in computer vision, and presents an incremental improvement over existing methods.
The authors propose IPSeg, which outperforms state-of-the-art methods on the Pascal VOC 2012 and ADE20K datasets, particularly in long-term incremental scenarios.
Class-incremental learning aims to enable models to learn from sequential, non-stationary data streams across different tasks without catastrophic forgetting. In class-incremental semantic segmentation (CISS), the semantic content of image pixels evolves over incremental phases, a phenomenon known as semantic drift. In this work, we identify two critical challenges in CISS that contribute to semantic drift and degrade performance. First, we highlight the issue of separate optimization, where different parts of the model are optimized in distinct incremental stages, leading to misaligned probability scales. Second, we identify noisy semantics arising from inappropriate pseudo-labeling, which leads to sub-optimal results. To address these challenges, we propose a novel and effective approach, Image Posterior and Semantics Decoupling for Segmentation (IPSeg). IPSeg introduces two key mechanisms: (1) leveraging image posterior probabilities to align optimization across stages and mitigate the effects of separate optimization, and (2) employing semantics decoupling to handle noisy semantics and tailor learning strategies for different semantics. Extensive experiments on the Pascal VOC 2012 and ADE20K datasets demonstrate that IPSeg achieves superior performance compared to state-of-the-art methods, particularly in challenging long-term incremental scenarios.
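To make the idea of misaligned probability scales concrete, the following is a minimal illustrative sketch, not the IPSeg implementation. It assumes that each incremental stage contributes its own pixel-wise head, that an image-level classifier yields an independent posterior for every class, and that the two are combined by simple multiplication; the function name `fuse_with_image_posterior`, the per-class sigmoid, and the multiplicative fusion rule are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def fuse_with_image_posterior(pixel_logits_per_stage, image_logits):
    """Rescale per-stage pixel probabilities by an image-level posterior.

    Hypothetical helper, written for illustration only.

    pixel_logits_per_stage: list of arrays of shape (C_t, H, W), one per
        incremental stage; each head was trained separately, so the scales
        of its outputs are not comparable with those of other heads.
    image_logits: array of shape (sum of C_t,), image-level logits for
        every class seen so far (assumed to come from a shared classifier).

    Returns (labels, scores): labels has shape (H, W), scores (C, H, W).
    """
    # Stack all heads along the class axis and map to per-class probabilities.
    pixel_probs = sigmoid(np.concatenate(pixel_logits_per_stage, axis=0))
    # Image-level probability that each class is present in the image.
    image_post = sigmoid(np.asarray(image_logits, dtype=float))
    # Assumed fusion rule: weight each class map by its image-level posterior.
    scores = pixel_probs * image_post[:, None, None]
    return scores.argmax(axis=0), scores


# Stage 1 learned classes 0 and 1; stage 2 learned class 2 and is overconfident.
stage1 = np.stack([np.full((2, 2), 2.0), np.full((2, 2), -2.0)])
stage2 = np.full((1, 2, 2), 4.0)
# The image-level classifier believes only class 0 is present.
image_logits = np.array([3.0, -3.0, -3.0])

naive = np.concatenate([stage1, stage2], axis=0).argmax(axis=0)
fused, _ = fuse_with_image_posterior([stage1, stage2], image_logits)
print("naive labels:\n", naive)
print("fused labels:\n", fused)
```

In this toy setup the separately trained stage-2 head wins every pixel under a naive argmax simply because its logits are larger, while weighting by the image-level posterior suppresses a class that the image classifier considers absent. How IPSeg actually computes and applies the image posterior is specified in the paper, not here.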