DIS-NN · LG · PR · May 15, 2020

Minimax formula for the replica symmetric free energy of deep restricted Boltzmann machines

arXiv:2005.09424v2 · 12 citations
AI Analysis

This work addresses a theoretical problem at the interface of statistical physics and machine learning, of interest to researchers studying deep neural networks. It appears incremental, since it extends known results from two-layer to deep architectures.

The authors tackled the problem of computing the replica symmetric free energy for deep restricted Boltzmann machines with Gaussian random weights, and they conjectured that it can be expressed as a min-max formula, similar to the two-layer case.

We study the free energy of a widely used deep architecture for restricted Boltzmann machines, in which the layers are arranged in series. Assuming independent Gaussian-distributed random weights, we show that the error term in the so-called replica symmetric sum rule can be optimised as a saddle point. This leads us to conjecture that, in the replica symmetric approximation, the free energy is given by a min-max formula that parallels the one obtained for the two-layer case.
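
To make the object of study concrete, the following is a minimal sketch of the quantity the abstract refers to, computed by brute force for a very small network. It is not the paper's method and does not implement the min-max formula. The ±1 units, the 1/sqrt(N) normalisation, the inverse temperature `beta`, the sign convention (1/N) log Z, and all function names are assumptions made for illustration only.

```python
import itertools
import math
import random


def sample_weights(layer_sizes, rng):
    """Draw independent standard Gaussian weights between consecutive layers."""
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(layer_sizes[k + 1])]
         for _ in range(layer_sizes[k])]
        for k in range(len(layer_sizes) - 1)
    ]


def energy(layers, weights, n_total):
    """Energy of one configuration; only adjacent layers interact (assumed form)."""
    total = 0.0
    for k, w in enumerate(weights):
        for i, s in enumerate(layers[k]):
            for j, t in enumerate(layers[k + 1]):
                total += w[i][j] * s * t
    # Assumed normalisation: divide by sqrt of the total number of units.
    return -total / math.sqrt(n_total)


def free_energy_density(layer_sizes, weights, beta):
    """(1/N) log Z by exhaustive enumeration of all +/-1 configurations."""
    n_total = sum(layer_sizes)
    log_terms = []
    for config in itertools.product((-1, 1), repeat=n_total):
        # Split the flat configuration into consecutive layers.
        layers, start = [], 0
        for size in layer_sizes:
            layers.append(config[start:start + size])
            start += size
        log_terms.append(-beta * energy(layers, weights, n_total))
    # Log-sum-exp for numerical stability.
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(x - m) for x in log_terms))
    return log_z / n_total
```

For example, `free_energy_density([2, 3, 2], sample_weights([2, 3, 2], random.Random(0)), beta=1.0)` enumerates all 2^7 configurations of a three-layer network. The cost grows exponentially in the number of units, which is why closed-form expressions for the large-size limit, such as the conjectured min-max formula, are of interest.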

