Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams
This work addresses continual learning from streaming data, which matters for AI systems that must adapt over time without forgetting. The contribution appears incremental, however, since it builds on existing prototype-based and continual learning methods.
The paper tackles online prototype learning from non-stationary data streams, a setting in which prototypes quickly become outdated and previous knowledge is catastrophically forgotten. It introduces a system that evolves prototypes continually in a shared latent space, supported by an efficient memory scheme and a novel objective function, and achieves state-of-the-art performance on eight benchmarks, including three highly imbalanced streams.
Attaining prototypical features to represent class distributions is well established in representation learning. However, learning prototypes online from streaming data is challenging: the prototypes rapidly become outdated because the parameter space keeps changing during learning. Additionally, continual learning does not assume the data stream to be stationary, which typically results in catastrophic forgetting of previous knowledge. We introduce the first system to address both problems, in which prototypes evolve continually in a shared latent space, enabling learning and prediction at any point in time. In contrast to the major body of work in continual learning, data streams are processed in an online fashion, without additional task information, and an efficient memory scheme provides robustness to imbalanced data streams. Besides nearest-neighbor-based prediction, learning is facilitated by a novel objective function that encourages cluster density around the class prototype and increased inter-class variance. Furthermore, the latent space quality is improved by pseudo-prototypes in each batch, formed by replaying exemplars from memory. As an additional contribution, we generalize the existing paradigms in continual learning to incorporate data incremental learning from data streams by formalizing a two-agent learner-evaluator framework. We obtain state-of-the-art performance by a significant margin on eight benchmarks, including three highly imbalanced data streams.
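To make the two central mechanisms concrete, the following is a minimal sketch, not the authors' implementation. It assumes that embeddings arrive as plain lists of floats from some encoder, that prototypes are updated with an exponential moving average, and that memory capacity is split equally across the classes seen so far. The names `PrototypeStore`, `BalancedMemory`, `momentum`, and `capacity` are hypothetical, and the abstract does not specify the actual update rule, the memory policy, or the objective function.

```python
import math
import random
from collections import defaultdict


def normalize(v):
    """Scale a vector to unit L2 norm (returned as a copy if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class PrototypeStore:
    """One unit-norm prototype per class, evolved by a moving average.

    The moving-average rule is an assumption made for illustration.
    """

    def __init__(self, momentum=0.9):
        self.momentum = momentum  # hypothetical hyperparameter
        self.prototypes = {}      # class label -> prototype vector

    def update(self, label, embeddings):
        """Move the class prototype toward the mean of the new embeddings."""
        dim = len(embeddings[0])
        mean = normalize(
            [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
        )
        if label not in self.prototypes:
            self.prototypes[label] = mean  # first observation of this class
            return
        a = self.momentum
        old = self.prototypes[label]
        self.prototypes[label] = normalize(
            [a * o + (1 - a) * m for o, m in zip(old, mean)]
        )

    def predict(self, embedding):
        """Return the label of the most similar prototype (dot product)."""
        z = normalize(embedding)
        return max(
            self.prototypes,
            key=lambda c: sum(p * x for p, x in zip(self.prototypes[c], z)),
        )


class BalancedMemory:
    """Replay memory whose capacity is shared equally by all seen classes.

    Equal per-class quotas are an assumed policy for handling imbalance.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.slots = defaultdict(list)  # class label -> stored exemplars
        self.seen = defaultdict(int)    # class label -> samples observed
        self.rng = random.Random(seed)

    def add(self, label, sample):
        self.seen[label] += 1
        bucket = self.slots[label]  # registers the class if it is new
        quota = self.capacity // len(self.slots)
        # Shrink any class that now exceeds its share of the memory.
        for items in self.slots.values():
            while len(items) > quota:
                items.pop(self.rng.randrange(len(items)))
        if len(bucket) < quota:
            bucket.append(sample)
        else:
            # Reservoir-style replacement inside the class; it is only
            # approximately uniform once quotas have shrunk.
            j = self.rng.randrange(self.seen[label])
            if j < quota:
                bucket[j] = sample
```

In use, each incoming batch would be embedded, `update` called once per class present in the batch, and `add` called for each sample, with `predict` available at any point in the stream. Because quotas are per class, a rare class keeps the same number of exemplars as a frequent one, which is one simple way to obtain the robustness to imbalanced streams that the abstract describes.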