LG | AI | NE | ML | Aug 21, 2024

Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?

arXiv:2408.11979v2 | 8 citations | h-index: 6
AI Analysis

This paper provides theoretical insight into the learning dynamics of PC networks, which could benefit researchers in machine learning and neuroscience, though its advance in understanding PC's advantages and limitations is incremental.

The paper asks why predictive coding (PC) networks sometimes converge faster than backpropagation. It analyzes the geometry of the PC energy landscape and shows that many degenerate saddles of the loss become easier to escape in PC, which makes the landscape more benign and more robust to vanishing gradients.

Predictive coding (PC) is an energy-based learning algorithm that performs iterative inference over network activities before updating weights. Recent work suggests that PC can converge in fewer learning steps than backpropagation thanks to its inference procedure. However, these advantages are not always observed, and the impact of PC inference on learning is not theoretically well understood. Here, we study the geometry of the PC energy landscape at the inference equilibrium of the network activities. For deep linear networks, we first show that the equilibrated energy is simply a rescaled mean squared error loss with a weight-dependent rescaling. We then prove that many highly degenerate (non-strict) saddles of the loss including the origin become much easier to escape (strict) in the equilibrated energy. Our theory is validated by experiments on both linear and non-linear networks. Based on these and other results, we conjecture that all the saddles of the equilibrated energy are strict. Overall, this work suggests that PC inference makes the loss landscape more benign and robust to vanishing gradients, while also highlighting the fundamental challenge of scaling PC to deeper models.
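The "rescaled mean squared error" claim for deep linear networks can be checked numerically. The sketch below is not taken from the paper's code. It assumes the standard PC energy with unit-variance squared prediction errors, F = ½ Σ_l ‖z_l − W_l z_{l−1}‖², with z_0 = x and z_L = y clamped, and it replaces iterative inference with an exact least-squares solve over the hidden activities. The closed form it compares against, ½ rᵀ S⁻¹ r with r = y − W_L⋯W_1 x and S = I + Σ_{l=2..L} (W_L⋯W_l)(W_L⋯W_l)ᵀ, is derived here from a minimum-norm argument and is assumed to correspond to the weight-dependent rescaling the abstract describes. All function names are hypothetical.

```python
import numpy as np

def equilibrated_energy_numeric(Ws, x, y):
    """Minimise the PC energy over hidden activities z_1..z_{L-1}.

    Assumes a linear network with L >= 2 weight matrices and the energy
    F = 0.5 * sum_l ||z_l - W_l z_{l-1}||^2 (z_0 = x, z_L = y fixed).
    The residuals are linear in z, so the minimum is a least-squares solve.
    """
    L = len(Ws)
    hidden = [W.shape[0] for W in Ws[:-1]]            # widths of z_1..z_{L-1}
    offs = np.concatenate([[0], np.cumsum(hidden)]).astype(int)
    rows = sum(W.shape[0] for W in Ws)
    M = np.zeros((rows, offs[-1]))
    b = np.zeros(rows)
    r = 0
    for l, W in enumerate(Ws):                        # Ws[l] is W_{l+1}
        n = W.shape[0]
        if l == 0:                                    # z_1 - W_1 x
            M[r:r + n, offs[0]:offs[1]] = np.eye(n)
            b[r:r + n] = W @ x
        elif l < L - 1:                               # z_{l+1} - W_{l+1} z_l
            M[r:r + n, offs[l]:offs[l + 1]] = np.eye(n)
            M[r:r + n, offs[l - 1]:offs[l]] = -W
        else:                                         # W_L z_{L-1} - y
            M[r:r + n, offs[l - 1]:offs[l]] = W
            b[r:r + n] = y
        r += n
    z, *_ = np.linalg.lstsq(M, b, rcond=None)
    return 0.5 * np.sum((M @ z - b) ** 2)

def equilibrated_energy_closed_form(Ws, x, y):
    """Rescaled squared error 0.5 * r^T S^{-1} r (assumed closed form)."""
    d_out = Ws[-1].shape[0]
    S = np.eye(d_out)
    P = np.eye(d_out)                                 # running product W_L ... W_l
    for W in reversed(Ws[1:]):                        # W_L, W_{L-1}, ..., W_2
        P = P @ W
        S += P @ P.T
    resid = y - P @ Ws[0] @ x                         # y - W_L ... W_1 x
    return 0.5 * resid @ np.linalg.solve(S, resid)

# Small random example: 3 weight matrices, layer widths 5 -> 4 -> 3 -> 2.
rng = np.random.default_rng(0)
dims = [5, 4, 3, 2]
Ws = [0.5 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(3)]
x = rng.standard_normal(dims[0])
y = rng.standard_normal(dims[-1])

f_numeric = equilibrated_energy_numeric(Ws, x, y)
f_closed = equilibrated_energy_closed_form(Ws, x, y)
half_sq_err = 0.5 * np.sum((y - Ws[2] @ Ws[1] @ Ws[0] @ x) ** 2)
print(f_numeric, f_closed, half_sq_err)
```

Under these assumptions the two energies agree to numerical precision, and because S − I is positive semidefinite the equilibrated energy never exceeds the plain squared error for the same weights.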

Code Implementations: 1 repo