CV | ML | Sep 18, 2019

A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh, Tinne Tuytelaars

arXiv:1909.08383v3 50.9421 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of enabling neural networks to accumulate knowledge over sequential tasks without retraining from scratch, which is crucial for real-world AI applications. Its advance is incremental, as it builds on existing continual learning research.

The paper tackles catastrophic forgetting in neural networks by surveying continual learning methods for classification tasks. It introduces a novel framework for balancing stability and plasticity, and it empirically compares 11 state-of-the-art methods on benchmarks such as Tiny Imagenet and iNaturalist.

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny Imagenet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
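
To make the notion of forgetting in a task incremental sequence concrete, the following minimal sketch computes two summary numbers from a matrix of per-task accuracies. It assumes a commonly used definition (forgetting for a task is its accuracy right after it was learned minus its accuracy at the end of the sequence); the abstract above does not spell out the metric, and the function names and the accuracy values are hypothetical, chosen only for illustration.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks, measured after training on the last task.

    acc[i][j] is the accuracy on task j after training on tasks 0..i.
    """
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean accuracy drop per task between learning it and the end of the sequence.

    The last task is excluded because nothing is trained after it.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        return 0.0
    drops = [acc[j][j] - acc[-1][j] for j in range(num_tasks - 1)]
    return sum(drops) / len(drops)


# Hypothetical accuracies for a sequence of three tasks (illustrative only).
# Entries above the diagonal are placeholders: those tasks are not yet seen.
acc = [
    [0.90, 0.00, 0.00],  # after training on task 0
    [0.60, 0.85, 0.00],  # after training on task 1
    [0.50, 0.70, 0.80],  # after training on task 2
]

print(f"average accuracy:   {average_accuracy(acc):.3f}")
print(f"average forgetting: {average_forgetting(acc):.3f}")
```

In this sketch, a learner that favours plasticity would tend to show high accuracy on the latest task and large forgetting on earlier ones, while a learner that favours stability would show the opposite pattern.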
