LG AI · Feb 7, 2024

Universal Neural Functionals

Allan Zhou, Chelsea Finn, James Harrison

arXiv:2402.05232v123.124 citations · h-index: 65 · Has Code · NIPS

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific challenge in machine learning: it extends weight-space modeling to more complex architectures. It is an incremental advance that builds on prior permutation-equivariant methods.

The paper tackles the problem of processing weight-space features for general neural network architectures by proposing universal neural functionals (UNFs) that automatically construct permutation equivariant models, demonstrating improvements in learned optimizers for small image classifiers and language models.

A challenging problem in many modern machine learning tasks is to process weight-space features, i.e., to transform or extract information from the weights and gradients of a neural network. Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks. However, they are not applicable to general architectures, since the permutation symmetries of a weight space can be complicated by recurrence or residual connections. This work proposes an algorithm that automatically constructs permutation equivariant models, which we refer to as universal neural functionals (UNFs), for any weight space. Among other applications, we demonstrate how UNFs can be substituted into existing learned optimizer designs, and find promising improvements over prior methods when optimizing small image classifiers and language models. Our results suggest that learned optimizers can benefit from considering the (symmetry) structure of the weight space they optimize. We open-source our library for constructing UNFs at https://github.com/AllanYangZhou/universal_neural_functional.
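As a rough illustration of the permutation symmetry the abstract refers to, the sketch below reorders the hidden units of a small two-layer feedforward network and checks that the output is unchanged. It is not the UNF algorithm and does not use the linked library; the helper names `mlp` and `permute_hidden`, the layer sizes, and the `tanh` activation are assumptions chosen only for this example.

```python
import math
import random


def mlp(x, W1, b1, W2, b2):
    """Two-layer MLP on plain lists: y = W2 @ tanh(W1 @ x + b1) + b2."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W2, b2)]


def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1, entries of b1, columns of W2."""
    W1_p = [W1[j] for j in perm]
    b1_p = [b1[j] for j in perm]
    W2_p = [[row[j] for j in perm] for row in W2]
    return W1_p, b1_p, W2_p


# Hypothetical sizes, chosen only for illustration.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 5, 2
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.gauss(0, 1) for _ in range(n_out)]
x = [rng.gauss(0, 1) for _ in range(n_in)]

# Draw a random reordering of the hidden units.
perm = list(range(n_hidden))
rng.shuffle(perm)
W1_p, b1_p, W2_p = permute_hidden(W1, b1, W2, perm)

# Both weight settings compute the same function of x.
y = mlp(x, W1, b1, W2, b2)
y_p = mlp(x, W1_p, b1_p, W2_p, b2)
max_diff = max(abs(a - b) for a, b in zip(y, y_p))
print("largest output difference:", max_diff)
```

The two weight settings differ as arrays but define the same network, which is why a model that reads weights as input is typically designed to respect such reorderings. With recurrence or residual connections, one permutation may have to act on several weight matrices at once, and this simple sketch does not cover that case.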
