Early Inference in Energy-Based Models Approximates Back-Propagation
This work proposes a possible element of a theory of how credit assignment could be performed efficiently in deep biological neural hierarchies, potentially bridging machine learning and neuroscience.
The paper shows that the early steps of Langevin MCMC inference in an energy-based model with latent variables, starting from a stationary point, approximate back-propagation: error gradients propagate into internal layers, and the back-propagated errors correspond to temporal derivatives of hidden unit activations.
We show that Langevin MCMC inference in an energy-based model with latent variables has the property that the early steps of inference, starting from a stationary point, correspond to propagating error gradients into internal layers, similarly to back-propagation. The error that is back-propagated is with respect to visible units that have received an outside driving force pushing them away from the stationary point. Back-propagated error gradients correspond to temporal derivatives of the activation of hidden units. This observation could be an element of a theory for explaining how brains perform credit assignment in deep hierarchies as efficiently as back-propagation does. In this theory, the continuous-valued latent variables correspond to averaged voltage potential (across time, spikes, and possibly neurons in the same minicolumn), and neural computation corresponds to approximate inference and error back-propagation at the same time.
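The correspondence between early inference and back-propagation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a three-unit chain (clamped input x, hidden unit h, visible output y), a toy quadratic energy E = 0.5*(h - f1(x))^2 + 0.5*(y - f2(h))^2, tanh nonlinearities, a squared-error cost, and noise-free dynamics (the deterministic part of a Langevin update). The function names and parameter values are hypothetical.

```python
import math

# Toy feedforward predictions (assumed tanh units; not the paper's model).
def f1(x, w1):
    return math.tanh(w1 * x)

def f2(h, w2):
    return math.tanh(w2 * h)

def df2_dh(h, w2):
    # Derivative of f2 with respect to h.
    return w2 * (1.0 - math.tanh(w2 * h) ** 2)

def energy_grads(x, h, y, w1, w2):
    """Gradients of the assumed energy
    E = 0.5*(h - f1(x))**2 + 0.5*(y - f2(h))**2 with respect to h and y."""
    err_h = h - f1(x, w1)
    err_y = y - f2(h, w2)
    dE_dh = err_h - err_y * df2_dh(h, w2)
    dE_dy = err_y
    return dE_dh, dE_dy

def early_inference_demo(x=0.7, target=0.2, w1=1.3, w2=-0.8, eps=1e-3):
    # Stationary point of the energy with x clamped: both gradients vanish.
    h0 = f1(x, w1)
    y0 = f2(h0, w2)

    # Assumed cost C = 0.5*(y - target)**2 and its back-propagated gradient.
    dC_dy = y0 - target
    backprop_dC_dh = dC_dy * df2_dh(h0, w2)

    # An outside driving force pushes the visible unit away from the
    # stationary point, in the direction that lowers the cost.
    y1 = y0 - eps * dC_dy

    # Temporal derivative of the hidden unit at the first step of
    # noise-free inference (dh/dt = -dE/dh).
    dE_dh, _ = energy_grads(x, h0, y1, w1, w2)
    h_dot = -dE_dh
    return h_dot, -eps * backprop_dC_dh

h_dot, scaled_backprop = early_inference_demo()
print(h_dot, scaled_backprop)
```

In this toy setting the two printed values agree up to floating-point rounding: the initial rate of change of the hidden unit equals the back-propagated cost gradient scaled by -eps. With the more general energy functions and the noise term considered in the paper, the match is only approximate and holds for the early steps of inference.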