ML · LG · MA · May 10, 2019

The sharp, the flat and the shallow: Can weakly interacting agents learn to escape bad minima?

arXiv:1905.04121v1 · 1 citation
AI Analysis

This work addresses a foundational question in machine learning, namely whether flat minima improve generalization. Its contribution is incremental: the authors present it as a first step towards understanding flat minima.

The paper tackles the problem of whether flat minima generalize better in machine learning by formalizing it as an optimization problem with weakly interacting agents, proposing an algorithmic framework based on extended stochastic gradient Langevin dynamics and illustrating its potential.

An open problem in machine learning is whether flat minima generalize better and how to compute such minima efficiently. This is a very challenging problem. As a first step towards understanding this question, we formalize it as an optimization problem with weakly interacting agents. We review appropriate background material from the theory of stochastic processes and provide insights that are relevant to practitioners. We propose an algorithmic framework for an extended stochastic gradient Langevin dynamics and illustrate its potential. The paper is written as a tutorial and presents an alternative use of multi-agent learning. Our primary focus is on the design of algorithms for machine learning applications; however, the underlying mathematical framework is suitable for the understanding of large-scale systems of agent-based models that are popular in the social sciences, economics and finance.
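The abstract does not state the update rule, so the following is only a minimal sketch of what Langevin dynamics with weakly interacting agents could look like, not the paper's algorithm. It assumes a mean-field attraction towards the ensemble average. The objective f(x) = (x^2 - 1)^2, the function names, and the parameters `coupling`, `beta` and `step` are all hypothetical choices made for illustration.

```python
import math
import random


def grad_f(x):
    # Gradient of the hypothetical double-well objective f(x) = (x^2 - 1)^2.
    return 4.0 * x * (x * x - 1.0)


def interacting_sgld(n_agents=10, n_steps=2000, step=1e-3,
                     beta=10.0, coupling=1.0, seed=0):
    """Euler-Maruyama simulation of agents coupled through their mean.

    Assumed update for each agent i (an illustration, not the paper's rule):
        x_i <- x_i - step * grad_f(x_i)
                   - step * coupling * (x_i - mean(x))
                   + sqrt(2 * step / beta) * N(0, 1)
    """
    rng = random.Random(seed)
    # Random initial positions for all agents.
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n_agents)]
    noise_scale = math.sqrt(2.0 * step / beta)
    for _ in range(n_steps):
        mean = sum(xs) / n_agents  # the weak interaction enters via this term
        xs = [
            x
            - step * grad_f(x)
            - step * coupling * (x - mean)
            + noise_scale * rng.gauss(0.0, 1.0)
            for x in xs
        ]
    return xs


if __name__ == "__main__":
    final = interacting_sgld()
    print([round(x, 3) for x in final])
```

Setting `coupling=0.0` recovers independent Langevin chains, which gives a simple baseline for seeing what the interaction term changes.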

