DIS-NN · STAT-MECH · LG · Jun 12, 2025

On the role of non-linear latent features in bipartite generative neural networks

arXiv:2506.10552v1 · 1 citation · h-index: 17 · SciPost Physics
Originality: Incremental advance
AI Analysis

This work addresses memory limitations in energy-based neural networks and is aimed at researchers in statistical physics and machine learning. The advance appears incremental, since it modifies existing RBM architectures rather than proposing a new one.

The researchers investigated how different hidden unit activation functions affect the memory capacity of Restricted Boltzmann Machines, finding that standard binary units limit critical capacity while richer priors like ReLU-like activations restore ordered retrieval phases and improve recall performance.

We investigate the phase diagram and memory retrieval capabilities of bipartite energy-based neural networks, namely Restricted Boltzmann Machines (RBMs), as a function of the prior distribution imposed on their hidden units, including binary, multi-state, and ReLU-like activations. Drawing connections to the Hopfield model and employing analytical tools from statistical physics of disordered systems, we explore how the architectural choices and activation functions shape the thermodynamic properties of these models. Our analysis reveals that standard RBMs with binary hidden nodes and extensive connectivity suffer from reduced critical capacity, limiting their effectiveness as associative memories. To address this, we examine several modifications, such as introducing local biases and adopting richer hidden unit priors. These adjustments restore ordered retrieval phases and markedly improve recall performance, even at finite temperatures. Our theoretical findings, supported by finite-size Monte Carlo simulations, highlight the importance of hidden unit design in enhancing the expressive power of RBMs.
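To make the setting in the abstract concrete, here is a minimal sketch of a finite-size Monte Carlo retrieval experiment on a bipartite network with binary hidden units. It is an illustration under stated assumptions, not the authors' code: the names `run_retrieval`, `gibbs_sweep`, and `overlap`, the `weight_scale` parameter, the use of stored ±1 patterns directly as weights, and all default values are hypothetical choices made for this example. Multi-state and ReLU-like hidden priors are not covered.

```python
import math
import random


def random_patterns(num_patterns, num_visible, rng):
    """Draw i.i.d. +/-1 patterns; each pattern serves as one hidden unit's weight vector."""
    return [[rng.choice((-1, 1)) for _ in range(num_visible)]
            for _ in range(num_patterns)]


def sample_sign(field, beta, rng):
    """Sample s in {-1, +1} with P(s = +1) = (1 + tanh(beta * field)) / 2."""
    p_up = 0.5 * (1.0 + math.tanh(beta * field))
    return 1 if rng.random() < p_up else -1


def gibbs_sweep(visible, patterns, beta, weight_scale, rng):
    """One block-Gibbs sweep: all hidden units given v, then all visible units given h."""
    hidden = [
        sample_sign(weight_scale * sum(x * v for x, v in zip(xi, visible)), beta, rng)
        for xi in patterns
    ]
    new_visible = [
        sample_sign(weight_scale * sum(xi[i] * h for xi, h in zip(patterns, hidden)),
                    beta, rng)
        for i in range(len(visible))
    ]
    return new_visible, hidden


def overlap(visible, pattern):
    """Overlap (Mattis magnetisation) between a visible state and a stored pattern."""
    return sum(v * x for v, x in zip(visible, pattern)) / len(pattern)


def run_retrieval(num_visible=200, num_patterns=1, beta=2.0, weight_scale=1.0,
                  flip_fraction=0.1, sweeps=20, seed=0):
    """Start from a corrupted copy of pattern 0 and track how well it is recalled."""
    rng = random.Random(seed)
    patterns = random_patterns(num_patterns, num_visible, rng)
    # Assumption: corruption is independent sign flips with probability flip_fraction.
    visible = [-x if rng.random() < flip_fraction else x for x in patterns[0]]
    for _ in range(sweeps):
        visible, _ = gibbs_sweep(visible, patterns, beta, weight_scale, rng)
    return visible, overlap(visible, patterns[0])


if __name__ == "__main__":
    _, m = run_retrieval()
    print("overlap with the stored pattern:", m)
```

Increasing `num_patterns` at fixed `num_visible`, or varying `beta` and `weight_scale`, gives a rough way to probe how the final overlap degrades with load and temperature. The specific scalings analysed in the paper are not reproduced here.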

