OC LG NA | Jun 23, 2022

Stochastic Langevin Differential Inclusions with Applications to Machine Learning

arXiv:2206.11533v3 | 5 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses foundational theoretical gaps for non-smooth optimization and sampling in machine learning, though it appears incremental as it extends existing smooth-case results to more general settings.

The paper analyzes Langevin-type stochastic differential inclusions for non-smooth potentials, such as those arising from robust losses and ReLUs in machine learning. It proves strong existence of solutions and asymptotic minimization of the free-energy functional.

Stochastic differential equations of Langevin-diffusion form have received significant attention, thanks to their foundational role in both Bayesian sampling algorithms and optimization in machine learning. In the latter, they serve as a conceptual model of the stochastic gradient flow in training over-parameterized models. However, the literature typically assumes smoothness of the potential, whose gradient is the drift term. Nevertheless, there are many problems for which the potential function is not continuously differentiable, and hence the drift is not Lipschitz continuous everywhere. This is exemplified by robust losses and Rectified Linear Units in regression problems. In this paper, we show some foundational results regarding the flow and asymptotic properties of Langevin-type Stochastic Differential Inclusions under assumptions appropriate to the machine-learning settings. In particular, we show strong existence of the solution, as well as an asymptotic minimization of the canonical free-energy functional.
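
To make the setting concrete, here is a minimal numerical sketch of a Langevin-type inclusion with a non-smooth potential. It is an illustration only and is not taken from the paper: it assumes the one-dimensional potential f(x) = |x|, an Euler-Maruyama discretization, and np.sign as the subgradient selection. The function names, step size, temperature, and chain count are all hypothetical choices made for this example.

```python
import numpy as np


def subgradient_abs(x):
    # One valid selection from the subdifferential of f(x) = |x|.
    # np.sign returns 0 at x = 0, which lies in the set [-1, 1].
    return np.sign(x)


def langevin_subgradient_samples(n_steps=20000, step=1e-3, temperature=1.0,
                                 n_chains=2000, seed=0):
    """Euler-Maruyama sketch of dX in -subdiff f(X) dt + sqrt(2 * temperature) dB.

    Assumption: the drift is replaced by a single subgradient selection at
    each step. This is an illustrative discretization, not the paper's scheme.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_chains)  # independent chains, arbitrary initial law
    noise_scale = np.sqrt(2.0 * temperature * step)
    for _ in range(n_steps):
        x = x - step * subgradient_abs(x) + noise_scale * rng.normal(size=n_chains)
    return x


if __name__ == "__main__":
    samples = langevin_subgradient_samples()
    # Empirical summaries of the final states of all chains.
    print("mean:", samples.mean())
    print("mean |x|:", np.abs(samples).mean())
```

For this potential the Gibbs density proportional to exp(-|x| / temperature) is a Laplace density, whose mean absolute value equals the temperature. If the discretized chains settle near that density, the printed mean of |x| should land close to the temperature, up to sampling and discretization error.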

