LG, ML, Mar 10, 2025

Sequential Function-Space Variational Inference via Gaussian Mixture Approximation

arXiv:2503.07114v2
h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses catastrophic forgetting in neural networks for continual learning, offering an incremental improvement over existing variational inference methods.

The paper tackles catastrophic forgetting in continual learning by proposing a sequential function-space variational inference method that uses a Gaussian mixture distribution to better approximate the multi-modal posterior of neural networks. The results show that the likelihood-focused Gaussian mixture approach outperforms other sequential variational inference methods in final average accuracy, particularly when continual learning is applied to all layers rather than just the final layer.

Continual learning in neural networks aims to learn new tasks without forgetting old ones. Sequential function-space variational inference (SFSVI) uses a Gaussian variational distribution to approximate the distribution of the neural network's outputs at a finite number of selected inducing points. Since the posterior distribution of a neural network is multi-modal, a single Gaussian can match only one mode of the posterior, whereas a Gaussian mixture can approximate it more closely. We propose an SFSVI method based on a Gaussian mixture variational distribution. We also compare different types of variational inference methods with a fixed pre-trained feature extractor (where continual learning is performed on the final layer only) and without one (where continual learning is performed on all layers). We find that, in terms of final average accuracy, likelihood-focused Gaussian mixture SFSVI outperforms other sequential variational inference methods, especially in the latter case.
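The claim that a single Gaussian can match only one mode of a multi-modal posterior can be illustrated with a toy one-dimensional example. The sketch below is not the paper's method: the bimodal target `p`, the grid, and the helper names (`normal_pdf`, `mixture_pdf`, `kl_on_grid`) are all assumptions made for illustration. It evaluates the variational-inference objective KL(q || p) numerically for a single Gaussian sitting on one mode and for a two-component Gaussian mixture.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian evaluated on the array x."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))

def mixture_pdf(x, weights, means, stds):
    """Density of a univariate Gaussian mixture evaluated on the array x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def kl_on_grid(q, p, dx):
    """Riemann-sum approximation of KL(q || p) for densities sampled on a grid."""
    mask = q > 0  # skip points where q underflows to zero
    return float(np.sum(q[mask] * (np.log(q[mask]) - np.log(p[mask]))) * dx)

# Hypothetical bimodal "posterior": two well-separated, equally weighted modes.
x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]
p = mixture_pdf(x, [0.5, 0.5], [-4.0, 4.0], [1.0, 1.0])

# A single Gaussian placed exactly on one mode covers only half the mass of p.
q_single = normal_pdf(x, 4.0, 1.0)

# A two-component mixture can represent both modes.
q_mix = mixture_pdf(x, [0.5, 0.5], [-4.0, 4.0], [1.0, 1.0])

kl_single = kl_on_grid(q_single, p, dx)
kl_mix = kl_on_grid(q_mix, p, dx)

print(f"KL(single Gaussian || p) = {kl_single:.4f}")
print(f"KL(Gaussian mixture || p) = {kl_mix:.4f}")
```

With well-separated modes, the single Gaussian is left with a gap of roughly log 2 because it ignores half of the target's mass, while the mixture can close the gap entirely. In the function-space setting described above, the same comparison would apply to the distribution of network outputs at the inducing points rather than to a scalar variable.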

Code Implementations: 1 repo