LGDSMay 15, 2017

Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond

arXiv:1705.05154v29 citations
Originality Incremental advance
AI Analysis

This work provides theoretical justification for a common implementation practice in machine learning, which is incremental but important for practitioners using these models.

The paper addresses the discrepancy between theoretical random updates and practical systematic scans in Markov chain Monte Carlo methods, specifically for bipartite models like Restricted/Deep Boltzmann Machines, showing that a layerwise alternating scan order has relaxation time no larger than random updates and proving this bound is asymptotically tight.

For Markov chain Monte Carlo methods, one of the greatest discrepancies between theory and system is the scan order - while most theoretical development on the mixing time analysis deals with random updates, real-world systems are implemented with systematic scans. We bridge this gap for models that exhibit a bipartite structure, including, most notably, the Restricted/Deep Boltzmann Machine. The de facto implementation for these models scans variables in a layerwise fashion. We show that the Gibbs sampler with a layerwise alternating scan order has its relaxation time (in terms of epochs) no larger than that of a random-update Gibbs sampler (in terms of variable updates). We also construct examples to show that this bound is asymptotically tight. Through standard inequalities, our result also implies a comparison on the mixing times.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes