Bayesian continual learning and forgetting in neural networks
This work addresses continual learning in AI with a biologically inspired method that balances memory retention and flexibility. The contribution is incremental, since it builds on existing Bayesian and regularization techniques.
The paper tackles catastrophic forgetting and catastrophic remembering in neural networks by introducing MESU, a Bayesian framework that updates parameters according to their uncertainty, enabling principled learning and forgetting. On 200 sequential permuted MNIST tasks, MESU outperforms established continual-learning techniques in accuracy, capacity to learn additional tasks, and out-of-distribution detection. On incremental CIFAR-100 training, it outperforms conventional learning techniques.
Biological synapses effortlessly balance memory retention and flexibility, yet artificial neural networks still struggle with the extremes of catastrophic forgetting and catastrophic remembering. Here, we introduce Metaplasticity from Synaptic Uncertainty (MESU), a Bayesian framework that updates network parameters according to their uncertainty. This approach allows a principled combination of learning and forgetting, ensuring that critical knowledge is preserved while unused or outdated information is gradually released. Unlike standard Bayesian approaches, which risk becoming overly constrained, and popular continual-learning methods, which rely on explicit task boundaries, MESU adapts seamlessly to streaming data. It further provides reliable epistemic uncertainty estimates, allowing out-of-distribution detection, at the sole computational cost of sampling the weights multiple times to obtain proper output statistics. Experiments on image-classification benchmarks demonstrate that MESU mitigates catastrophic forgetting while maintaining plasticity for new tasks. When trained on 200 sequential permuted MNIST tasks, MESU outperforms established continual-learning techniques in terms of accuracy, capability to learn additional tasks, and out-of-distribution data detection. Additionally, because it does not rely on task boundaries, MESU consistently outperforms conventional learning techniques on the incremental training of CIFAR-100 tasks across a wide range of scenarios. Our results unify ideas from metaplasticity, Bayesian inference, and Hessian-based regularization, offering a biologically inspired pathway to robust, perpetual learning.
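To make the weight-sampling step concrete, here is a minimal sketch of how epistemic uncertainty can be estimated by drawing the weights several times. It is a generic illustration, not the MESU procedure itself. The abstract does not specify the form of the posterior or the uncertainty measure, so the sketch assumes an independent Gaussian per weight (mean `mu`, standard deviation `sigma`), a single linear layer, and the common decomposition of predictive entropy into an expected-entropy (aleatoric) term and a mutual-information (epistemic) term. All function and variable names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predictive_uncertainty(x, mu, sigma, n_samples=100, rng=None):
    """Monte Carlo estimate of predictive statistics for one input x.

    ASSUMPTION: a linear classifier whose weights follow independent
    Gaussians N(mu[c][i], sigma[c][i]**2). Returns the mean class
    probabilities, the total predictive entropy, the expected entropy
    (aleatoric part), and their difference (epistemic part).
    """
    rng = rng or random.Random(0)
    n_classes = len(mu)
    sampled_probs = []
    for _ in range(n_samples):
        logits = []
        for c in range(n_classes):
            # Draw one weight vector for class c from its Gaussian.
            w = [rng.gauss(m, s) for m, s in zip(mu[c], sigma[c])]
            logits.append(sum(wi * xi for wi, xi in zip(w, x)))
        sampled_probs.append(softmax(logits))

    mean_probs = [
        sum(p[c] for p in sampled_probs) / n_samples for c in range(n_classes)
    ]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(p) for p in sampled_probs) / n_samples
    epistemic = total - aleatoric  # mutual information between output and weights
    return mean_probs, total, aleatoric, epistemic


# Usage: the same input scored under a confident and an uncertain posterior.
x = [1.0, -1.0]
mu = [[1.0, 0.0], [0.0, 1.0]]
confident = [[0.0, 0.0], [0.0, 0.0]]
uncertain = [[2.0, 2.0], [2.0, 2.0]]
_, _, _, epi_confident = predictive_uncertainty(x, mu, confident)
_, _, _, epi_uncertain = predictive_uncertainty(x, mu, uncertain)
```

Under these assumptions, an input whose epistemic term is large relative to in-distribution inputs can be flagged as out-of-distribution. The detection threshold is a design choice that the abstract does not fix.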