CV | Nov 12, 2025

Composition-Incremental Learning for Compositional Generalization

arXiv:2511.09082v1 | h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses incremental learning for compositional generalization in computer vision. Its contribution is rated an incremental advance because it builds on existing CZSL tasks, adding new benchmarks and methods.

The paper tackles compositional generalization in incremental learning scenarios, where models must continually learn new compositions from emerging data. It proposes a pseudo-replay framework and reports that it is effective on the MIT-States-CompIL and C-GQA-CompIL benchmarks.

Compositional generalization has achieved substantial progress in computer vision on pre-collected training data. Nonetheless, real-world data continually emerges, with possible compositions being nearly infinite, long-tailed, and not entirely visible. Thus, an ideal model is supposed to gradually improve the capability of compositional generalization in an incremental manner. In this paper, we explore Composition-Incremental Learning for Compositional Generalization (CompIL) in the context of the compositional zero-shot learning (CZSL) task, where models need to continually learn new compositions, intending to improve their compositional generalization capability progressively. To quantitatively evaluate CompIL, we develop a benchmark construction pipeline leveraging existing datasets, yielding MIT-States-CompIL and C-GQA-CompIL. Furthermore, we propose a pseudo-replay framework utilizing a visual synthesizer to synthesize visual representations of learned compositions and a linguistic primitive distillation mechanism to maintain aligned primitive representations across the learning process. Extensive experiments demonstrate the effectiveness of the proposed framework.
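The two components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss or the synthesizer, so the squared-distance penalty, the function names `primitive_distillation_loss` and `pseudo_replay_batch`, and the toy synthesizer are all assumptions chosen for illustration only.

```python
import random
from typing import Callable, Dict, List, Tuple

Composition = Tuple[str, str]  # (attribute, object), e.g. ("wet", "dog")
Vector = List[float]


def primitive_distillation_loss(old: Dict[str, Vector],
                                new: Dict[str, Vector]) -> float:
    """Assumed form of distillation: mean squared distance between the
    previous-stage and current-stage embeddings of shared primitives."""
    shared = [p for p in old if p in new]
    if not shared:
        return 0.0
    total = 0.0
    for p in shared:
        total += sum((a - b) ** 2 for a, b in zip(old[p], new[p]))
    return total / len(shared)


def pseudo_replay_batch(synthesizer: Callable[[Composition], Vector],
                        learned: List[Composition],
                        batch_size: int,
                        rng: random.Random) -> List[Tuple[Composition, Vector]]:
    """Sample previously learned compositions and synthesize one visual
    feature for each, instead of storing real images from earlier stages."""
    picks = [rng.choice(learned) for _ in range(batch_size)]
    return [(c, synthesizer(c)) for c in picks]


def toy_synthesizer(comp: Composition) -> Vector:
    # Hypothetical stand-in for a learned visual synthesizer.
    attribute, obj = comp
    return [float(len(attribute)), float(len(obj))]


old_primitives = {"wet": [1.0, 0.0], "dog": [0.0, 1.0]}
new_primitives = {"wet": [1.0, 0.0], "dog": [0.0, 3.0], "cat": [5.0, 5.0]}
loss = primitive_distillation_loss(old_primitives, new_primitives)

learned_compositions = [("wet", "dog"), ("old", "car")]
replay = pseudo_replay_batch(toy_synthesizer, learned_compositions,
                             batch_size=4, rng=random.Random(0))
```

In this sketch, only primitives present in both stages contribute to the penalty, so a newly introduced primitive such as "cat" is free to move while "wet" and "dog" are pulled toward their earlier representations.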

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
