ML | LG | Dec 11, 2017

On Quadratic Penalties in Elastic Weight Consolidation

arXiv:1712.03847v1 | 127 citations
Originality: Synthesis-oriented
AI Analysis

This incremental analysis identifies a potential flaw in Elastic Weight Consolidation, a method for continual learning; it is relevant to machine learning researchers.

The paper addresses catastrophic forgetting in neural networks by analyzing Elastic Weight Consolidation (EWC). It shows that EWC's quadratic penalties are inconsistent with a derivation extended to more than two tasks and may double-count data from earlier tasks.

Elastic weight consolidation (EWC, Kirkpatrick et al., 2017) is a novel algorithm designed to safeguard against catastrophic forgetting in neural networks. EWC can be seen as an approximation to Laplace propagation (Eskin et al., 2004), and this view is consistent with the motivation given by Kirkpatrick et al. (2017). In this note, I present an extended derivation that covers the case when there are more than two tasks. I show that the quadratic penalties in EWC are inconsistent with this derivation and might lead to double-counting data from earlier tasks.
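The double-counting concern can be illustrated with a toy one-dimensional sketch. This is not code from the note: it assumes three tasks A, B, C whose negative log-likelihoods are exactly quadratic (so the Fisher information is a constant and the Laplace approximation is exact), a flat prior, and a penalty weight of 1. The helper `argmin_quadratics` and all numbers are hypothetical. It compares a single penalty anchored at the most recent optimum, carrying the accumulated precision, against one separate penalty per earlier task.

```python
# Toy 1-D illustration; every name and number here is an assumption.
# Task t has loss 0.5 * h[t] * (theta - m[t])**2, so its Fisher
# information is h[t] and its unconstrained optimum is m[t].

def argmin_quadratics(terms):
    """Minimizer of sum_k 0.5 * precision_k * (theta - center_k)**2."""
    total_precision = sum(p for p, _ in terms)
    return sum(p * c for p, c in terms) / total_precision

h = {"A": 1.0, "B": 1.0, "C": 1.0}   # per-task Fisher information
m = {"A": 0.0, "B": 3.0, "C": 6.0}   # per-task optima

# Reference: optimum when all three tasks are trained jointly.
theta_joint = argmin_quadratics([(h[t], m[t]) for t in "ABC"])

# Sequential training on A, then on B with a penalty anchored at theta_A.
theta_A = m["A"]
theta_AB = argmin_quadratics([(h["B"], m["B"]), (h["A"], theta_A)])

# Task C with a single penalty: accumulated precision h_A + h_B,
# anchored at the latest optimum theta_AB.
theta_single = argmin_quadratics(
    [(h["C"], m["C"]), (h["A"] + h["B"], theta_AB)]
)

# Task C with one penalty per earlier task: h_A around theta_A and
# h_B around theta_AB. theta_AB was already pulled toward task A,
# so task A's data enters the objective twice.
theta_multi = argmin_quadratics(
    [(h["C"], m["C"]), (h["A"], theta_A), (h["B"], theta_AB)]
)

print(theta_joint, theta_single, theta_multi)
```

Under these assumptions the single-penalty solution coincides with the joint optimum, while the per-task-penalty solution is shifted toward task A's optimum. In a real network the losses are not quadratic, so both variants are only approximations.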
