DIS-NN LG Mar 10, 2021

Mean-field methods and algorithmic perspectives for high-dimensional machine learning

arXiv:2103.05945v1
AI Analysis

This work addresses theoretical bottlenecks in high-dimensional ML analysis, offering incremental insights into algorithmic vs. statistical thresholds for researchers in statistical physics and machine learning theory.

The manuscript tackles the challenge of analyzing machine learning algorithms with many interacting variables by applying statistical physics tools, specifically mean-field methods and the connection between the replica method and message passing algorithms, to derive phase diagrams and thresholds for synthetic tasks such as committee machines and perceptrons.

The main difficulty that arises in the analysis of most machine learning algorithms is to handle, analytically and numerically, a large number of interacting random variables. In this Ph.D. manuscript, we revisit an approach based on the tools of statistical physics of disordered systems. Developed through a rich literature, these tools have been precisely designed to infer the macroscopic behavior of a large number of particles from their microscopic interactions. At the heart of this work, we capitalize on the deep connection between the replica method and message passing algorithms in order to shed light on the phase diagrams of various theoretical models, with an emphasis on the potential differences between statistical and algorithmic thresholds. We focus essentially on synthetic tasks and data generated in the teacher-student paradigm. In particular, we apply these mean-field methods to the Bayes-optimal analysis of committee machines, to the worst-case analysis of Rademacher generalization bounds for perceptrons, and to empirical risk minimization in the context of generalized linear models. Finally, we develop a framework to analyze estimation models with structured prior information, produced for instance by generative models based on deep neural networks with random weights.
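The teacher-student paradigm mentioned in the abstract can be illustrated with a small sketch. This is not taken from the manuscript: the function names, the Gaussian inputs and teacher weights, the sign (perceptron) output, and the Hebbian student rule are all assumptions chosen for illustration. A hidden teacher vector labels random inputs, a student estimates the teacher from the labeled data, and the normalized overlap between the two measures how well the teacher was recovered.

```python
import math
import random


def make_teacher_student_data(n_samples, n_features, seed=0):
    """Draw a random teacher and label Gaussian inputs with a sign perceptron.

    Assumption: i.i.d. standard Gaussian teacher weights and inputs
    (inputs scaled by 1/sqrt(n_features)); the manuscript may use other priors.
    """
    rng = random.Random(seed)
    w_teacher = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    scale = 1.0 / math.sqrt(n_features)
    X = [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
         for _ in range(n_samples)]
    # Labels are the sign of the teacher's pre-activation (ties mapped to +1).
    y = [1 if sum(x_j * w_j for x_j, w_j in zip(x, w_teacher)) >= 0 else -1
         for x in X]
    return X, y, w_teacher


def hebbian_student(X, y):
    """Hypothetical student: Hebbian estimate w = sum_i y_i * x_i.

    Chosen only because it is simple; it is not the Bayes-optimal estimator.
    """
    w = [0.0] * len(X[0])
    for x, label in zip(X, y):
        for j, x_j in enumerate(x):
            w[j] += label * x_j
    return w


def overlap(w_a, w_b):
    """Normalized overlap (cosine similarity) between two weight vectors."""
    dot = sum(a * b for a, b in zip(w_a, w_b))
    norm_a = math.sqrt(sum(a * a for a in w_a))
    norm_b = math.sqrt(sum(b * b for b in w_b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    X, y, w_teacher = make_teacher_student_data(n_samples=500, n_features=20, seed=1)
    w_student = hebbian_student(X, y)
    print("teacher-student overlap:", overlap(w_student, w_teacher))
```

In this kind of setup the quantity of interest is how the overlap behaves as the ratio of samples to features varies, which is the type of macroscopic observable that phase diagrams describe.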

