LG, Jan 27

Critical Organization of Deep Neural Networks, and p-Adic Statistical Field Theories

arXiv:2601.19070v1, 1 citation, h-index: 2
Originality: Incremental advance
AI Analysis

This work provides a theoretical framework for understanding critical behaviors in neural networks, which could impact researchers in machine learning theory and statistical physics, but it appears incremental as it builds on existing thermodynamic and p-adic concepts.

The authors tackled the thermodynamic limit of deep and recurrent neural networks with sigmoid activations, showing a unique state in a parameter region that bifurcates into infinitely many states outside it, and connected hierarchical topologies to p-adic structures, with a toy model exhibiting a strange attractor. They also analyzed random versions of these networks, deriving a power-type expansion for the output distribution in the infinite-width case, where the constant term is Gaussian.
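To make the link between hierarchies and p-adic numbers concrete, here is a minimal sketch of the standard way a finite p-ary tree can be labelled with truncated p-adic integers. It is not the paper's algorithm, and the function names (`padic_digits`, `padic_valuation`, `padic_distance`) are hypothetical. It assumes each leaf is an integer in the range 0 to p^depth - 1, and it shows that two leaves sharing their first k base-p digits (a common ancestor at level k) lie at p-adic distance at most p^(-k).

```python
from fractions import Fraction


def padic_digits(n: int, p: int, depth: int) -> list[int]:
    """First `depth` base-p digits of n, least significant first.

    Read as a path from the root of a p-ary tree: digit i selects
    the branch taken at level i.
    """
    digits = []
    for _ in range(depth):
        n, r = divmod(n, p)
        digits.append(r)
    return digits


def padic_valuation(n: int, p: int) -> int:
    """Largest v such that p**v divides n (n must be nonzero)."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def padic_distance(a: int, b: int, p: int) -> Fraction:
    """p-adic distance |a - b|_p = p**(-valuation(a - b)), with |0|_p = 0."""
    if a == b:
        return Fraction(0)
    return Fraction(1, p ** padic_valuation(a - b, p))


# Leaves 1 and 5 of a binary tree of depth 3 share their first two digits,
# so they split only at the third level of the hierarchy.
path_1 = padic_digits(1, 2, 3)
path_5 = padic_digits(5, 2, 3)
dist_1_5 = padic_distance(1, 5, 2)
```

In this encoding, closeness in the p-adic metric means a long shared path from the root, which is why the metric is a natural fit for tree-like topologies.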

We rigorously study the thermodynamic limit of deep neural networks (DNNs) and recurrent neural networks (RNNs), assuming that the activation functions are sigmoids. A thermodynamic limit is a continuous neural network, where the neurons form a continuous space with infinitely many points. We show that such a network admits a unique state in a certain region of the parameter space, and that this state depends continuously on the parameters. Outside that region, the state breaks into infinitely many states. The critical organization is thus a bifurcation in the parameter space, where a network transitions from a unique state to infinitely many states. We use p-adic integers to encode hierarchical structures. Specifically, we present an algorithm that recasts the hierarchical topologies used in DNNs and RNNs as p-adic tree-like structures. In this framework, the hierarchical and the critical organizations are connected. We rigorously study the critical organization of a toy model, a hierarchical edge detector for grayscale images based on p-adic cellular neural networks. The critical organization of such a network can be described as a strange attractor. In the second part, we study random versions of DNNs and RNNs. In this case, the network parameters are generalized Gaussian random variables in a space of square-integrable functions. We compute the probability distribution of the output given the input, in the infinite-width case. We show that it admits a power-type expansion, where the constant term is a Gaussian distribution.
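The transition from a unique state to several states can be illustrated with a one-neuron caricature. The sketch below is not the paper's continuous network: it assumes a single scalar state x obeying x = tanh(w x), with tanh standing in for a generic sigmoid, and the helper names `iterate_state` and `distinct_limits` are hypothetical. For |w| < 1 the map is a contraction, so every starting point reaches the same state. For w > 1 the iteration settles on different states depending on where it starts.

```python
import math


def iterate_state(w: float, x0: float, steps: int = 2000) -> float:
    """Iterate x <- tanh(w * x) from x0 and return the final value."""
    x = x0
    for _ in range(steps):
        x = math.tanh(w * x)
    return x


def distinct_limits(w: float,
                    starts: tuple[float, ...] = (-1.0, -0.1, 0.1, 1.0),
                    digits: int = 6) -> list[float]:
    """Distinct limits (rounded to `digits` decimals) reached from `starts`.

    Adding 0.0 normalizes -0.0 to 0.0 so both count as one state.
    """
    return sorted({round(iterate_state(w, x0), digits) + 0.0 for x0 in starts})


# Contractive regime: one state regardless of the starting point.
states_small_w = distinct_limits(0.5)

# Beyond the bifurcation at w = 1: the iteration finds two stable states.
# (x = 0 is still a fixed point, but it is unstable and is not reached.)
states_large_w = distinct_limits(2.0)
```

The scalar model yields only finitely many states. In the abstract, the infinitely many states arise because the neurons form a continuum, which this toy example does not attempt to capture.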

