CV · Sep 19, 2025

Towards Robust Visual Continual Learning with Multi-Prototype Supervision

arXiv:2509.16011v1 · 1 citation · h-index: 39
Originality: Incremental advance
AI Analysis

This work addresses limitations in visual continual learning for AI systems, though it is incremental as it builds on existing language-guided supervision methods.

The paper tackles semantic ambiguity and intra-class visual diversity in language-guided visual continual learning. It proposes MuproCL, a framework that replaces the single semantic target with multiple context-aware prototypes, and reports improved performance and robustness across various continual learning baselines.

Language-guided supervision, which utilizes a frozen semantic target from a Pretrained Language Model (PLM), has emerged as a promising paradigm for visual Continual Learning (CL). However, relying on a single target introduces two critical limitations: 1) semantic ambiguity, where a polysemous category name results in conflicting visual representations, and 2) intra-class visual diversity, where a single prototype fails to capture the rich variety of visual appearances within a class. To this end, we propose MuproCL, a novel framework that replaces the single target with multiple, context-aware prototypes. Specifically, we employ a lightweight LLM agent to perform category disambiguation and visual-modal expansion to generate a robust set of semantic prototypes. A LogSumExp aggregation mechanism allows the vision model to adaptively align with the most relevant prototype for a given image. Extensive experiments across various CL baselines demonstrate that MuproCL consistently enhances performance and robustness, establishing a more effective path for language-guided continual learning.
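
The LogSumExp aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the use of cosine similarity, the temperature `tau`, the function names, and the toy 3-dimensional prototype vectors (real prototypes would be PLM embeddings). The point it shows is that a temperature-scaled LogSumExp acts as a smooth maximum, so an image feature is scored mainly by its best-matching prototype rather than by an average over all of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) using the max-shift trick."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def class_score(feature, prototypes, tau=0.1):
    """Smooth maximum of feature-prototype similarities (assumed form).

    As tau -> 0 this approaches max(similarities); larger tau blends
    the prototypes more evenly.
    """
    scaled = [cosine(feature, p) / tau for p in prototypes]
    return tau * logsumexp(scaled)


def predict(feature, class_prototypes, tau=0.1):
    """Return the best-scoring class name and all class scores."""
    scores = {
        name: class_score(feature, protos, tau)
        for name, protos in class_prototypes.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy prototypes: a polysemous class ("crane" as bird or
# as machine) gets one prototype per sense.
toy_prototypes = {
    "crane": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "mouse": [[0.0, 0.0, 1.0], [0.0, -1.0, 0.0]],
}
toy_feature = [0.9, 0.1, 0.0]  # close to the first "crane" prototype
label, scores = predict(toy_feature, toy_prototypes)
print(label, scores)
```

In this sketch, the score for a class with K prototypes always lies between the largest similarity and that value plus `tau * log(K)`, which is why a single well-matched prototype is enough to score a class highly even when the other prototypes of that class are far from the image.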
