LG | Oct 27, 2024

Plastic Learning with Deep Fourier Features

arXiv:2410.20634v1 | 15 citations | h-index: 22 | ICLR
Originality: Highly original
AI Analysis

The paper addresses the challenge of maintaining trainability in continual learning systems that operate in non-stationary environments, and it offers a novel method for this known bottleneck.

The paper tackles the problem of loss of plasticity in deep neural networks during continual learning by proposing deep Fourier features, which combine sine and cosine activations to balance linearity and nonlinearity, resulting in drastically improved performance across various scenarios and datasets like CIFAR10 and CIFAR100.

Deep neural networks can struggle to learn continually in the face of non-stationarity. This phenomenon is known as loss of plasticity. In this paper, we identify underlying principles that lead to plastic algorithms. In particular, we provide theoretical results showing that linear function approximation, as well as a special case of deep linear networks, do not suffer from loss of plasticity. We then propose deep Fourier features, which are the concatenation of a sine and cosine in every layer, and we show that this combination provides a dynamic balance between the trainability obtained through linearity and the effectiveness obtained through the nonlinearity of neural networks. Deep networks composed entirely of deep Fourier features are highly trainable and sustain their trainability over the course of learning. Our empirical results show that continual learning performance can be drastically improved by replacing ReLU activations with deep Fourier features. These results hold for different continual learning scenarios (e.g., label noise, class incremental learning, pixel permutations) on all major supervised learning datasets used for continual learning research, such as CIFAR10, CIFAR100, and tiny-ImageNet.
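The construction described in the abstract, a sine and a cosine concatenated in every layer, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `init_linear` and `fourier_layer`, the 1/sqrt(n_in) initialization scale, and the layer sizes are all assumptions chosen for illustration, and the code uses only the Python standard library to stay self-contained. Note that concatenation doubles the width, so a layer with 4 pre-activations emits 8 features.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Hypothetical helper: Gaussian weights scaled by 1/sqrt(n_in), zero biases."""
    scale = 1.0 / math.sqrt(n_in)  # assumed scale, not taken from the paper
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def fourier_layer(x, weights, biases):
    """Linear map followed by the concatenation [sin(z), cos(z)]."""
    # Pre-activations z = W x + b, one value per output row.
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    # Concatenating sine and cosine doubles the feature count.
    return [math.sin(v) for v in z] + [math.cos(v) for v in z]


rng = random.Random(0)
x = [0.5, -1.0, 2.0]

W1, b1 = init_linear(3, 4, rng)  # 4 pre-activations -> 8 features
W2, b2 = init_linear(8, 4, rng)  # the next layer consumes all 8 features

h1 = fourier_layer(x, W1, b1)
h2 = fourier_layer(h1, W2, b2)
```

Because sin²(z) + cos²(z) = 1, each sine/cosine pair in the output has unit norm regardless of the pre-activation, so the features cannot all saturate or go inactive at once in the way a ReLU unit can.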

