ML, LG · Mar 24, 2018

Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning

Mariano Chouza, Stephen Roberts, Stefan Zohren

arXiv:1803.09119v14.23 citations

Originality: Synthesis-oriented

AI Analysis

This paper provides a theoretical toy model for understanding optimization in deep learning. The contribution is incremental: it builds on existing random field theory and does not apply directly to real-world problems.

The authors model the loss functions of high-dimensional optimization problems as Gaussian random fields to analyze the behavior of gradient descent. They derive analytic expressions for the moments of the loss improvement and prove its asymptotic normality in high dimensions.

In this paper we model the loss function of high-dimensional optimization problems by a Gaussian random field, or equivalently a Gaussian process. Our aim is to study gradient descent in such loss functions or energy landscapes and compare it to results obtained from real high-dimensional optimization problems such as those encountered in deep learning. In particular, we analyze the distribution of the improved loss function after a step of gradient descent, provide analytic expressions for the moments, and prove asymptotic normality as the dimension of the parameter space becomes large. Moreover, we compare this with the expectation of the global minimum of the landscape obtained by means of the Euler characteristic of excursion sets. Besides complementing our analytical findings with numerical results from simulated Gaussian random fields, we also compare them to loss functions obtained from optimization problems on synthetic and real data sets by proposing a "black box" random field toy model for a deep neural network loss function.
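The setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it approximates a stationary Gaussian random field with a squared-exponential covariance using random Fourier features, takes a single gradient descent step from the origin, and records the resulting loss improvement over many independent field samples. The covariance choice, the feature approximation, the dimension, the step size, and the names `sample_field` and `improvement_after_one_step` are all assumptions made for illustration.

```python
import numpy as np


def sample_field(dim, n_features=2000, length_scale=1.0, rng=None):
    """Draw one approximate Gaussian random field sample on R^dim.

    Assumption: random Fourier features stand in for an exact field with
    squared-exponential covariance exp(-|x - y|^2 / (2 * length_scale^2)).
    Returns the loss function and its analytic gradient.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = rng.normal(scale=1.0 / length_scale, size=(n_features, dim))  # frequencies
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)                # phases
    a = rng.normal(size=n_features)                                   # Gaussian weights
    scale = np.sqrt(2.0 / n_features)

    def loss(x):
        return scale * np.sum(a * np.cos(w @ x + b))

    def grad(x):
        # d/dx cos(w.x + b) = -sin(w.x + b) * w
        return -scale * (a * np.sin(w @ x + b)) @ w

    return loss, grad


def improvement_after_one_step(dim, step_size, rng):
    """Loss decrease f(x0) - f(x1) after one gradient descent step from the origin."""
    loss, grad = sample_field(dim, rng=rng)
    x0 = np.zeros(dim)
    x1 = x0 - step_size * grad(x0)
    return loss(x0) - loss(x1)


rng = np.random.default_rng(0)
# Hypothetical settings: 200 independent fields in 50 dimensions, step size 0.01.
improvements = np.array(
    [improvement_after_one_step(dim=50, step_size=0.01, rng=rng) for _ in range(200)]
)
print("mean improvement:", improvements.mean())
print("std of improvement:", improvements.std())
```

A histogram of `improvements` for increasing `dim` gives an informal way to look at the shape of the improvement distribution that the abstract discusses, although this sketch makes no claim to reproduce the paper's analytic results.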
