LG, Oct 27, 2024

Plastic Learning with Deep Fourier Features

Alex Lewandowski, Dale Schuurmans, Marlos C. Machado

arXiv:2410.20634v1 17.615 citations h-index: 22 ICLR

Originality: Highly original

AI Analysis

This paper addresses the challenge of keeping continual learning systems trainable in non-stationary environments, and it proposes a novel method for this known bottleneck.

The paper tackles loss of plasticity in deep neural networks during continual learning by proposing deep Fourier features, which concatenate sine and cosine activations in every layer to balance linearity and nonlinearity. Replacing ReLU activations with these features is reported to drastically improve continual learning performance across several scenarios and datasets, including CIFAR10 and CIFAR100.

Deep neural networks can struggle to learn continually in the face of non-stationarity. This phenomenon is known as loss of plasticity. In this paper, we identify underlying principles that lead to plastic algorithms. In particular, we provide theoretical results showing that linear function approximation, as well as a special case of deep linear networks, do not suffer from loss of plasticity. We then propose deep Fourier features, which are the concatenation of a sine and cosine in every layer, and we show that this combination provides a dynamic balance between the trainability obtained through linearity and the effectiveness obtained through the nonlinearity of neural networks. Deep networks composed entirely of deep Fourier features are highly trainable and sustain their trainability over the course of learning. Our empirical results show that continual learning performance can be drastically improved by replacing ReLU activations with deep Fourier features. These results hold for different continual learning scenarios (e.g., label noise, class incremental learning, pixel permutations) on all major supervised learning datasets used for continual learning research, such as CIFAR10, CIFAR100, and tiny-ImageNet.
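The abstract describes deep Fourier features as the concatenation of a sine and a cosine in every layer. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The function names (`fourier_features`, `init_mlp`, `forward`), the choice to halve each hidden layer's pre-activation width so the concatenated output keeps the nominal width, and the Gaussian initialization are all assumptions made for illustration.

```python
import numpy as np

def fourier_features(z):
    """Concatenate sin(z) and cos(z) along the feature axis.

    Input shape (..., k) becomes output shape (..., 2k).
    """
    return np.concatenate([np.sin(z), np.cos(z)], axis=-1)

def init_mlp(rng, in_dim, hidden_dim, out_dim, depth):
    """Hypothetical initializer (assumption, not from the paper).

    Each hidden layer maps to hidden_dim // 2 pre-activations so that the
    sine/cosine concatenation restores a width of hidden_dim.
    """
    assert hidden_dim % 2 == 0, "hidden_dim must be even"
    params = []
    d = in_dim
    for _ in range(depth):
        W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, hidden_dim // 2))
        b = np.zeros(hidden_dim // 2)
        params.append((W, b))
        d = hidden_dim
    # Final layer is a plain linear readout.
    W_out = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, out_dim))
    params.append((W_out, np.zeros(out_dim)))
    return params

def forward(params, x):
    """Apply sine/cosine features after every hidden linear map."""
    h = x
    for W, b in params[:-1]:
        h = fourier_features(h @ W + b)
    W, b = params[-1]
    return h @ W + b

rng = np.random.default_rng(0)
params = init_mlp(rng, in_dim=8, hidden_dim=16, out_dim=3, depth=2)
x = rng.normal(size=(4, 8))
logits = forward(params, x)  # shape (4, 3)
```

One property visible in this sketch: because sin² + cos² = 1 for every pre-activation, the squared norm of each layer's feature vector equals the number of pre-activations regardless of the input, so no unit can go permanently silent the way a ReLU unit can.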
