LG NE BIO-PH · Sep 12, 2023

Neural Network Layer Matrix Decomposition reveals Latent Manifold Encoding and Memory Capacity

arXiv:2309.05968v12.0 · h-index: 35

Originality: Highly original

AI Analysis

This work provides foundational insights into how neural networks encode data and break the curse of dimensionality, with implications for understanding memory capacity and expressivity in models like Hopfield networks and Transformers.

The authors prove a neural network encoding theorem showing that weight matrices of converged networks encode continuous functions approximating training data within a finite error margin, and use singular value decomposition to reveal latent manifold structures and geometric operations in layers.

We prove the converse of the universal approximation theorem, i.e., a neural network (NN) encoding theorem, which shows that for every stably converged NN with continuous activation functions, its weight matrix encodes a continuous function that approximates its training dataset to within a finite margin of error over a bounded domain. We further show that by applying the Eckart-Young theorem for truncated singular value decomposition to the weight matrix of every NN layer, we can illuminate the nature of the latent space manifold of the training dataset encoded and represented by each layer, and the geometric nature of the mathematical operations each layer performs. Our results have implications for understanding how NNs break the curse of dimensionality by harnessing memory capacity for expressivity, and indicate that the two are complementary. This Layer Matrix Decomposition (LMD) further suggests a close relationship between the eigen-decomposition of NN layers and the latest advances in conceptualizations of Hopfield networks and Transformer NN models.
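The Eckart-Young step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' LMD procedure: the function name `truncated_svd`, the matrix shape, and the rank `k` are hypothetical, and a random matrix stands in for the weight matrix of a trained layer. The sketch only shows that the rank-k truncation of the SVD is the best rank-k approximation, with an error determined entirely by the discarded singular values.

```python
import numpy as np


def truncated_svd(W, k):
    """Return the rank-k truncation of W and all singular values of W."""
    # Thin SVD: W = U @ diag(s) @ Vt, with s sorted in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k leading singular triplets.
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return W_k, s


# Assumption: a random 64x32 matrix stands in for a trained layer's weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
k = 8  # hypothetical truncation rank

W_k, s = truncated_svd(W, k)

# Eckart-Young: the Frobenius error of the best rank-k approximation equals
# the root of the sum of the squared discarded singular values ...
frob_err = np.linalg.norm(W - W_k, "fro")
expected_frob = np.sqrt(np.sum(s[k:] ** 2))

# ... and the spectral-norm error equals the first discarded singular value.
spec_err = np.linalg.norm(W - W_k, 2)

print(f"Frobenius error: {frob_err:.6f} (predicted {expected_frob:.6f})")
print(f"Spectral error:  {spec_err:.6f} (predicted {s[k]:.6f})")
```

In this reading, the leading right singular vectors (rows of `Vt`) span the input directions a layer responds to most strongly, and the singular values give the scaling along each. How these relate to the latent manifold is the subject of the paper itself, not something this sketch establishes.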
