LG · RT · Jul 6, 2021

Universal approximation and model compression for radial neural networks

arXiv:2107.02550v3 · 10 citations
AI Analysis

This paper provides a theoretical foundation and a practical compression method for a new class of neural networks. It could benefit machine learning researchers and practitioners, though it appears incremental, since it extends prior work on rotation equivariant networks.

The paper introduces radial neural networks, whose activation functions rescale each feature vector by a function of its norm instead of acting pointwise on its coordinates. It proves universal approximation theorems for these networks, including for bounded widths and unbounded domains. It also develops a lossless model compression algorithm that exploits orthogonal symmetries of the parameter space.

We introduce a class of fully-connected neural networks whose activation functions, rather than being pointwise, rescale feature vectors by a function depending only on their norm. We call such networks radial neural networks, extending previous work on rotation equivariant networks that considers rescaling activations in less generality. We prove universal approximation theorems for radial neural networks, including in the more difficult cases of bounded widths and unbounded domains. Our proof techniques are novel, distinct from those in the pointwise case. Additionally, radial neural networks exhibit a rich group of orthogonal change-of-basis symmetries on the vector space of trainable parameters. Factoring out these symmetries leads to a practical lossless model compression algorithm. Optimization of the compressed model by gradient descent is equivalent to projected gradient descent for the full model.
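
The two structural claims in the abstract, norm-based rescaling and orthogonal change-of-basis symmetry, can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the function names (`radial_activation`, `radial_net`), the one-hidden-layer architecture, and the scaling function `lam(r) = 1 / (1 + r)` are all illustrative assumptions. It shows that a radial activation commutes with any orthogonal matrix `Q`, and that the reparametrization `(W1, b1, W2) -> (Q W1, Q b1, W2 Q^T)` leaves the network function unchanged.

```python
import numpy as np


def radial_activation(x, lam=lambda r: 1.0 / (1.0 + r)):
    """Rescale the whole vector x by a factor depending only on its norm.

    `lam` is an illustrative choice of scaling function (an assumption,
    not taken from the paper).
    """
    return lam(np.linalg.norm(x)) * x


def radial_net(x, W1, b1, W2, b2):
    """One-hidden-layer fully-connected network with a radial activation."""
    hidden = radial_activation(W1 @ x + b1)
    return W2 @ hidden + b2


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 2
W1 = rng.standard_normal((n_hidden, n_in))
b1 = rng.standard_normal(n_hidden)
W2 = rng.standard_normal((n_out, n_hidden))
b2 = rng.standard_normal(n_out)

# Random orthogonal matrix acting on the hidden layer (Q factor of a QR decomposition).
Q, _ = np.linalg.qr(rng.standard_normal((n_hidden, n_hidden)))

x = rng.standard_normal(n_in)
v = rng.standard_normal(n_hidden)

# Equivariance: the activation commutes with orthogonal maps because |Qv| = |v|.
equivariant = np.allclose(radial_activation(Q @ v), Q @ radial_activation(v))

# Symmetry: an orthogonal change of basis in the hidden layer gives the same function.
original = radial_net(x, W1, b1, W2, b2)
transformed = radial_net(x, Q @ W1, Q @ b1, W2 @ Q.T, b2)
same_function = np.allclose(original, transformed)

print(equivariant, same_function)
```

A pointwise activation such as ReLU would fail the first check for a generic `Q`, which is why this family of symmetries is specific to radial networks. The sketch only demonstrates the symmetry itself; it does not implement the compression algorithm, which the abstract describes as factoring these symmetries out.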

Code Implementations: 1 repo
