Gated Linear Networks
This work addresses catastrophic forgetting, offering machine learning practitioners a technique complementary to offline deep learning methods, though its approach is incremental.
The paper tackles catastrophic forgetting and slow online learning in neural networks by introducing Gated Linear Networks (GLNs), a backpropagation-free architecture whose resilience to forgetting is comparable to that of an MLP with dropout and Elastic Weight Consolidation on standard benchmarks.
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn feature representations in favor of rapid online learning. Individual neurons can model nonlinear functions via the use of data-dependent gating in conjunction with online convex optimization. We show that this architecture gives rise to universal learning capabilities in the limit, with effective model capacity increasing as a function of network size in a manner comparable with deep ReLU networks. Furthermore, we demonstrate that the GLN learning mechanism possesses extraordinary resilience to catastrophic forgetting, performing comparably to an MLP with dropout and Elastic Weight Consolidation on standard benchmarks. These desirable theoretical and empirical properties position GLNs as a complementary technique to contemporary offline deep learning methods.
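To make the idea of a neuron that "directly predicts the target" through data-dependent gating more concrete, here is a minimal sketch of one such neuron in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the halfspace gating rule, the mixing of input probabilities in the logit domain, the clipping constant, the learning rate, and every name (GatedNeuron, context, predict, update) are hypothetical choices made for this example. The gate selects one weight vector per input, and only that vector is updated with a local gradient step on the log loss, so no error signal is propagated from other neurons.

```python
import math
import random


def logit(p):
    """Inverse of the sigmoid; maps a probability to the real line."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def clip(p, eps=0.01):
    """Keep probabilities away from 0 and 1 so logit() stays finite."""
    return min(max(p, eps), 1.0 - eps)


class GatedNeuron:
    """One gated linear neuron (illustrative sketch, names are hypothetical).

    Assumption: gating uses fixed random hyperplanes over the side
    information; each of the 2**n_hyperplanes regions owns a weight vector.
    """

    def __init__(self, n_inputs, side_dim, n_hyperplanes=2, lr=0.1, seed=0):
        rng = random.Random(seed)
        # Fixed (never trained) random hyperplanes used only for gating.
        self.hyperplanes = [
            [rng.gauss(0.0, 1.0) for _ in range(side_dim)]
            for _ in range(n_hyperplanes)
        ]
        # One weight vector per gating context, initialised to an average.
        self.weights = [
            [1.0 / n_inputs] * n_inputs for _ in range(2 ** n_hyperplanes)
        ]
        self.lr = lr

    def context(self, side):
        """Data-dependent gate: index of the region containing `side`."""
        idx = 0
        for k, h in enumerate(self.hyperplanes):
            if sum(a * b for a, b in zip(h, side)) > 0.0:
                idx |= 1 << k
        return idx

    def predict(self, probs, side):
        """Mix the input probabilities using the weights the gate selects."""
        w = self.weights[self.context(side)]
        z = sum(wi * logit(clip(p)) for wi, p in zip(w, probs))
        return clip(sigmoid(z))

    def update(self, probs, side, target):
        """One online gradient step on the log loss, local to this neuron."""
        w = self.weights[self.context(side)]
        p = self.predict(probs, side)
        # Log-loss gradient w.r.t. the active weights (ignoring output
        # clipping) is (p - target) * logit(input probability).
        for i, q in enumerate(probs):
            w[i] -= self.lr * (p - target) * logit(clip(q))
        return p


if __name__ == "__main__":
    # Toy stream: the label depends only on the sign of the side information.
    neuron = GatedNeuron(n_inputs=1, side_dim=1, n_hyperplanes=1)
    for _ in range(300):
        neuron.update([0.7], [1.0], 1)
        neuron.update([0.7], [-1.0], 0)
    print(neuron.predict([0.7], [1.0]), neuron.predict([0.7], [-1.0]))
```

For a fixed gating context the prediction is a sigmoid of a linear function of the weights, so the log loss is convex in the active weight vector; this is what makes a simple online gradient step a reasonable stand-in for the "online convex optimization" the abstract refers to. The nonlinearity comes entirely from the gate choosing different weights for different inputs.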