LG · AI · NE · May 22

On the Infinite Width and Depth Limits of Predictive Coding Networks

arXiv:2602.07697 · 47.5 · h-index: 7
Predicted impact: top 53% in LG · last 90 days
Originality: Incremental advance
AI Analysis

This work provides theoretical grounding for scaling predictive coding networks, which are trained with a biologically plausible alternative to backpropagation (BP), by identifying parameterizations that ensure stability and convergence to BP in wide networks.

This paper studies the infinite width and depth limits of predictive coding (PC) networks (PCNs). For linear residual networks, it shows that the set of width- and depth-stable feature-learning parameterizations is identical to that of backpropagation (BP). Under these parameterizations, the PC energy at activity equilibrium converges to the quadratic BP loss when the width is much larger than the depth, so PC computes the same gradients as BP. Experiments indicate that this convergence also holds for nonlinear models such as convolutional networks and transformers, provided an activity equilibrium is reached.

Predictive coding (PC) is a biologically plausible alternative to standard backpropagation (BP) that minimises an energy function with respect to network activities before updating weights. Recent work has improved the training stability of deep PC networks (PCNs) by leveraging some BP-inspired reparameterisations. However, the full scalability and theoretical basis of these methods remain unclear. To address this gap, we study the infinite width and depth limits of PCNs. For linear residual networks, we show that the set of width- and depth-stable feature-learning parameterisations for PC is exactly the same as for BP. Moreover, under any of these parameterisations, the PC energy with equilibrated activities converges to the quadratic BP loss when the model width is much larger than the depth, resulting in PC computing the same gradients as BP. Experiments show that, as long as an activity equilibrium is reached, convergence to BP holds for nonlinear models including convolutional networks and transformers. Overall, this work constrains the types of parameterisation that are scalable with PC, while showing a way in which BP can be effectively implemented with only local updates in much wider than deep networks like the brain.
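The abstract's description of PC (minimise an energy over activities, then compare the equilibrated energy with the BP loss) can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: a plain (non-residual) linear network, unit-variance Gaussian errors, 1/sqrt(fan-in) initialisation, and the helper names `feedforward`, `pc_energy`, `equilibrate`, `closed_form_energy` and `demo`. The closed form used as a sanity check is the standard Gaussian marginalisation identity for this toy model, not a formula quoted from the source, and the sketch does not implement the width- and depth-stable parameterisations the paper analyses.

```python
import numpy as np


def feedforward(weights, x):
    """Forward pass of a plain linear network; returns [x, z_1, ..., z_L]."""
    acts = [x]
    for W in weights:
        acts.append(W @ acts[-1])
    return acts


def pc_energy(weights, acts):
    """PC energy: half the sum of squared local prediction errors."""
    return 0.5 * sum(
        float(np.sum((acts[l + 1] - W @ acts[l]) ** 2))
        for l, W in enumerate(weights)
    )


def equilibrate(weights, x, y, lr=0.05, steps=5000):
    """Gradient descent on hidden activities with input and output clamped."""
    acts = feedforward(weights, x)  # start from the feedforward state
    acts[-1] = y.copy()             # clamp the output layer to the target
    for _ in range(steps):
        # Local prediction errors e_l = z_l - W_l z_{l-1}, stored at index l-1.
        errs = [acts[l + 1] - W @ acts[l] for l, W in enumerate(weights)]
        for l in range(1, len(weights)):
            # dF/dz_l = e_l - W_{l+1}^T e_{l+1} for each hidden layer l.
            acts[l] = acts[l] - lr * (errs[l - 1] - weights[l].T @ errs[l])
    return acts


def closed_form_energy(weights, x, y):
    """Equilibrated energy 0.5 * r^T S^{-1} r for this toy linear model.

    Assumption: S = I + sum over hidden layers of P P^T, where P is the
    product of the weight matrices above that layer.
    """
    r = y - feedforward(weights, x)[-1]
    d_out = y.shape[0]
    S = np.eye(d_out)
    P = np.eye(d_out)
    for W in reversed(weights[1:]):
        P = P @ W
        S += P @ P.T
    return 0.5 * float(r @ np.linalg.solve(S, r))


def demo(seed=0, d_in=4, width=16, d_out=2):
    """Return (BP loss, equilibrated PC energy, closed-form energy)."""
    rng = np.random.default_rng(seed)
    dims = [d_in, width, width, d_out]
    weights = [
        rng.normal(0.0, 1.0 / np.sqrt(m), size=(n, m))
        for m, n in zip(dims[:-1], dims[1:])
    ]
    x = rng.normal(size=d_in)
    y = rng.normal(size=d_out)
    bp_loss = 0.5 * float(np.sum((y - feedforward(weights, x)[-1]) ** 2))
    acts = equilibrate(weights, x, y)
    return bp_loss, pc_energy(weights, acts), closed_form_energy(weights, x, y)
```

Calling `demo()` returns the three quantities for one random input-target pair. In this toy setting the equilibrated energy cannot exceed the BP loss, because activity descent starts from the feedforward state, where the two coincide. The gap between them is governed by how far the matrix `S` is from the identity, which gives one informal way to read the abstract's statement that the energy approaches the BP loss in suitably parameterised wide networks.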
