AI · LG · NC · Oct 12, 2023

Neural Sampling in Hierarchical Exponential-family Energy-based Models

arXiv:2310.08431v2 · 4 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses slow and complex inference in energy-based models, a problem relevant to both the machine learning and computational neuroscience communities. It offers an incremental improvement by combining neural adaptation with a hierarchical decomposition of the partition function.

The study tackles inference and learning in energy-based models by introducing the Hierarchical Exponential-family Energy-based (HEE) model. The model decomposes the partition function layer by layer, which avoids the negative phase, localizes learning, and eases convergence. Neural adaptation accelerates inference, and on natural image datasets the model performs on par with other EBMs.
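To make the idea of a layer-wise decomposition concrete, here is one way such a factorization could be written. This is a sketch under assumptions, not the paper's own notation: the layer variables $z_0 = x, z_1, \dots, z_L$, the sufficient statistics $T$, the base measure $h$, the natural-parameter maps $\eta_l$, and the per-layer log-normalizers $A_l$ are all illustrative names.

```latex
% Assumed hierarchical factorization: each conditional is an
% exponential-family distribution with its own log-normalizer A_l.
p(z_0, \dots, z_L) = p(z_L) \prod_{l=1}^{L} p(z_{l-1} \mid z_l),
\qquad
p(z_{l-1} \mid z_l) = h(z_{l-1})
  \exp\!\big( \eta_l(z_l)^{\top} T(z_{l-1}) - A_l(\eta_l(z_l)) \big)

% Standard exponential-family identity: the gradient of the
% log-normalizer is the expected sufficient statistic.
\nabla_{\eta} A_l(\eta) = \mathbb{E}_{p(z_{l-1} \mid \eta)}\big[ T(z_{l-1}) \big]
```

The second identity holds for any exponential family. Under the assumed factorization, it suggests why the normalization term of each layer could be estimated from samples of that layer alone, rather than through a global negative phase.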

Bayesian brain theory suggests that the brain employs generative models to understand the external world. The sampling-based perspective posits that the brain infers the posterior distribution through samples of stochastic neuronal responses. Additionally, the brain continually updates its generative model to approach the true distribution of the external world. In this study, we introduce the Hierarchical Exponential-family Energy-based (HEE) model, which captures the dynamics of inference and learning. In the HEE model, we decompose the partition function into individual layers and leverage a group of neurons with shorter time constants to sample the gradient of the decomposed normalization term. This allows our model to estimate the partition function and perform inference simultaneously, circumventing the negative phase encountered in conventional energy-based models (EBMs). As a result, the learning process is localized in both time and space, and the model converges easily. To match the brain's rapid computation, we demonstrate that neural adaptation can serve as a momentum term, significantly accelerating the inference process. On natural image datasets, our model exhibits representations akin to those observed in the biological visual system. Furthermore, for the machine learning community, our model can generate observations through joint or marginal generation. We show that marginal generation outperforms joint generation and achieves performance on par with other EBMs.
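The claim that a momentum term can speed up sampling-based inference can be illustrated on a toy problem. The sketch below is not the HEE model or its neural circuit: it is a generic one-dimensional Langevin sampler with an optional velocity variable, and the names `langevin_chain`, `grad_energy`, `distance_to_mode`, and the unit-variance Gaussian target are all assumptions made for illustration.

```python
import math
import random


def langevin_chain(grad_energy, z0, step, n_steps, momentum=0.0,
                   noise_scale=1.0, seed=0):
    """Run one 1-D Langevin chain and return its final state.

    momentum=0.0 gives plain (overdamped) Langevin dynamics.
    0 < momentum < 1 adds a velocity that carries over part of the
    previous update, the role the abstract attributes to adaptation.
    noise_scale=0.0 switches the noise off for deterministic checks.
    """
    rng = random.Random(seed)
    z, v = float(z0), 0.0
    # Noise is scaled by the friction (1 - momentum) so that, for small
    # steps, both settings approximately target the same exp(-E).
    sigma = noise_scale * math.sqrt(2.0 * (1.0 - momentum) * step)
    for _ in range(n_steps):
        v = momentum * v - step * grad_energy(z) + sigma * rng.gauss(0.0, 1.0)
        z += v
    return z


# Hypothetical toy posterior: unit-variance Gaussian centred at MU,
# i.e. E(z) = 0.5 * (z - MU) ** 2.
MU = 0.0


def grad_energy(z):
    return z - MU


def distance_to_mode(momentum, n_steps=100):
    """Noise-free distance from the mode after n_steps, starting at 10."""
    z = langevin_chain(grad_energy, z0=10.0, step=0.01, n_steps=n_steps,
                       momentum=momentum, noise_scale=0.0)
    return abs(z - MU)


if __name__ == "__main__":
    print("plain   :", distance_to_mode(0.0))
    print("momentum:", distance_to_mode(0.9))
```

With the noise switched off, the plain chain shrinks its distance to the mode by a factor of 0.99 per step, while the chain with `momentum=0.9` closes the gap far faster over the same 100 steps. Setting `noise_scale=1.0` turns both chains back into samplers.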

