Meta-Consolidation for Continual Learning
This work addresses the challenge of enabling deep learning systems to learn continuously without forgetting previous tasks. The contribution is incremental in that it builds on existing continual learning approaches.
The paper tackles catastrophic forgetting in continual learning by introducing MERLIN, a meta-consolidation method that learns a meta-distribution over neural network weights conditioned on the task. It reports consistent improvements over five baselines, including a state-of-the-art method, on the MNIST, CIFAR-10, CIFAR-100, and Mini-ImageNet benchmarks.
The ability to continuously learn and adapt to new tasks without losing grasp of already acquired knowledge is a hallmark of biological learning systems, and one that current deep learning systems fall short of. In this work, we present a novel methodology for continual learning called MERLIN: Meta-Consolidation for Continual Learning. We assume that the weights $\boldsymbol{ψ}$ of a neural network for solving task $\boldsymbol{t}$ come from a meta-distribution $p(\boldsymbol{ψ} \mid \boldsymbol{t})$. This meta-distribution is learned and consolidated incrementally. We operate in the challenging online continual learning setting, where the model sees each data point only once. Our experiments on the continual learning benchmarks MNIST, CIFAR-10, CIFAR-100 and Mini-ImageNet show consistent improvement over five baselines, including a recent state-of-the-art method, corroborating the promise of MERLIN.
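To make the idea of a task-conditioned meta-distribution over weights concrete, the following is a minimal sketch, not MERLIN's actual model. The abstract does not say how $p(\boldsymbol{ψ} \mid \boldsymbol{t})$ is parameterized, so the sketch assumes a diagonal Gaussian per task as a stand-in. The class name `GaussianMetaDistribution` and its methods `consolidate` and `sample` are hypothetical, and the "trained weights" are synthetic vectors rather than weights of real networks.

```python
import numpy as np


class GaussianMetaDistribution:
    """Toy stand-in for p(psi | t): one diagonal Gaussian per task.

    Assumption for illustration only; the source does not specify
    how the meta-distribution is modelled.
    """

    def __init__(self):
        # task id -> (count, mean, sum of squared deviations from the mean)
        self.stats = {}

    def consolidate(self, task, weight_vectors):
        """Fold a new batch of flattened weight vectors into the task's statistics."""
        batch = np.atleast_2d(np.asarray(weight_vectors, dtype=float))
        n_b = batch.shape[0]
        mean_b = batch.mean(axis=0)
        m2_b = ((batch - mean_b) ** 2).sum(axis=0)
        if task not in self.stats:
            self.stats[task] = (n_b, mean_b, m2_b)
            return
        # Merge running moments so earlier weight vectors need not be stored.
        n_a, mean_a, m2_a = self.stats[task]
        n = n_a + n_b
        delta = mean_b - mean_a
        mean = mean_a + delta * n_b / n
        m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
        self.stats[task] = (n, mean, m2)

    def sample(self, task, num_samples, rng):
        """Draw weight vectors psi ~ p(psi | task)."""
        n, mean, m2 = self.stats[task]
        std = np.sqrt(m2 / max(n - 1, 1))
        return mean + std * rng.standard_normal((num_samples, mean.size))


rng = np.random.default_rng(0)
meta = GaussianMetaDistribution()

# Synthetic "trained" weight vectors for two tasks (4 parameters each).
task0_weights = rng.normal(loc=1.0, scale=0.1, size=(10, 4))
task1_weights = rng.normal(loc=-1.0, scale=0.1, size=(10, 4))

# Task 0 arrives in two chunks, then task 1 arrives.
meta.consolidate(0, task0_weights[:5])
meta.consolidate(0, task0_weights[5:])
meta.consolidate(1, task1_weights)

# Weights for task 0 can still be sampled after task 1 was consolidated.
samples = meta.sample(0, 3, rng)
```

In this toy version, consolidating task 1 leaves the stored statistics for task 0 untouched, which is the property a continual learner needs. A learned model of the meta-distribution would share parameters across tasks and would have to guard against interference explicitly.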