NC AI LG NE SP

Oct 27, 2021

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons

Paul Haider, Benjamin Ellenberger, Laura Kriener, Jakob Jordan, Walter Senn, Mihai A. Petrovici

arXiv:2110.14549v18.633 citations

Originality: Highly original

AI Analysis

The paper addresses a fundamental bottleneck in biologically plausible deep learning, relevant to both neuroscience and neuromorphic computing: the timing mismatches that arise when physical systems are built from slow components.

The paper tackles the problem of slow neuron response times causing delays and timing mismatches in hierarchical cortical networks, which hinder both inference and learning. It introduces the Latent Equilibrium framework, which enables quasi-instantaneous inference independent of network depth and achieves competitive performance on standard benchmark datasets using fully-connected and convolutional architectures.

The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also learning. We introduce Latent Equilibrium, a new framework for inference and learning in networks of slow components which avoids these issues by harnessing the ability of biological neurons to phase-advance their output with respect to their membrane potential. This principle enables quasi-instantaneous inference independent of network depth and avoids the need for phased plasticity or computationally expensive network relaxation phases. We jointly derive disentangled neuron and synapse dynamics from a prospective energy function that depends on a network's generalized position and momentum. The resulting model can be interpreted as a biologically plausible approximation of error backpropagation in deep cortical networks with continuous-time, leaky neuronal dynamics and continuously active, local plasticity. We demonstrate successful learning of standard benchmark datasets, achieving competitive performance using both fully-connected and convolutional architectures, and show how our principle can be applied to detailed models of cortical microcircuitry. Furthermore, we study the robustness of our model to spatio-temporal substrate imperfections to demonstrate its feasibility for physical realization, be it in vivo or in silico.
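The lag problem and the phase-advance idea described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' model: it assumes a chain of linear leaky integrators with unit weights and one neuron per layer, forward-Euler integration, and it reads "phase-advancing the output with respect to the membrane potential" as emitting u + tau * du/dt instead of u. The function name `simulate_chain` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_chain(n_layers=5, tau=10.0, dt=0.1, t_end=50.0, prospective=True):
    """Drive a chain of leaky integrators with a unit step and return the
    output of the last layer over time.

    Each layer follows tau * du/dt = -u + r_in (illustrative assumption).
    If prospective is True, a layer emits u + tau * du/dt (phase-advanced);
    otherwise it emits its lagging membrane potential u.
    """
    n_steps = int(round(t_end / dt))
    u = np.zeros(n_layers)          # membrane potentials, all start at rest
    out = np.zeros(n_steps)
    for k in range(n_steps):
        r = 1.0                     # unit step input, switched on at t = 0
        for layer in range(n_layers):
            du = (-u[layer] + r) / tau
            # Output passed on to the next layer at this time step.
            r = u[layer] + tau * du if prospective else u[layer]
            u[layer] += dt * du     # forward-Euler update of the potential
        out[k] = r
    return out


fast = simulate_chain(prospective=True)
slow = simulate_chain(prospective=False)
```

In this toy setting the prospective output follows the input from the first time step regardless of the number of layers, while the lagged output is still well below the input after five membrane time constants, and the gap grows with depth.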
