ML · LG · Jun 11, 2019

Maximum Mean Discrepancy Gradient Flow

arXiv:1906.04370v2 · 198 citations
Originality: Incremental advance
AI Analysis

This work addresses the optimization of probability measures in machine learning, with a connection to particle transport when optimizing neural networks. It is rated an incremental advance because it builds on existing MMD and gradient flow frameworks.

The paper constructs and analyzes a Wasserstein gradient flow of the Maximum Mean Discrepancy (MMD), gives conditions for its convergence to a global optimum, and proposes a noise-injection regularization method backed by theoretical and empirical evidence.
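
For orientation, the objects named in this summary can be written out as follows. This is a sketch in assumed notation (target measure \mu, current measure \nu, kernel k, step size \gamma, noise level \beta_n), not a quotation from the paper, and the noisy update is shown in its simplest time-discretized particle form.

```latex
% Squared MMD between the current measure \nu and the target \mu, for a kernel k
\mathrm{MMD}^2(\nu, \mu)
  = \mathbb{E}_{x, x' \sim \nu}\,[k(x, x')]
  - 2\,\mathbb{E}_{x \sim \nu,\, y \sim \mu}\,[k(x, y)]
  + \mathbb{E}_{y, y' \sim \mu}\,[k(y, y')]

% Witness function: the first variation of \tfrac{1}{2}\mathrm{MMD}^2 with respect to \nu
f_{\mu, \nu}(z) = \mathbb{E}_{x \sim \nu}\,[k(x, z)] - \mathbb{E}_{y \sim \mu}\,[k(y, z)]

% Wasserstein gradient flow of \tfrac{1}{2}\mathrm{MMD}^2, written as a continuity equation
\partial_t \nu_t = \operatorname{div}\!\left( \nu_t \, \nabla f_{\mu, \nu_t} \right)

% Particle update with noise injected inside the gradient evaluation
X_{n+1} = X_n - \gamma \, \nabla f_{\mu, \nu_n}\!\left( X_n + \beta_n U_n \right),
\qquad U_n \sim \mathcal{N}(0, I)
```

Setting \beta_n = 0 recovers the plain Euler discretization of the flow.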

We construct a Wasserstein gradient flow of the maximum mean discrepancy (MMD) and study its convergence properties. The MMD is an integral probability metric defined for a reproducing kernel Hilbert space (RKHS), and serves as a metric on probability measures for a sufficiently rich RKHS. We obtain conditions for convergence of the gradient flow towards a global optimum, that can be related to particle transport when optimizing neural networks. We also propose a way to regularize this MMD flow, based on an injection of noise in the gradient. This algorithmic fix comes with theoretical and empirical evidence. The practical implementation of the flow is straightforward, since both the MMD and its gradient have simple closed-form expressions, which can be easily estimated with samples.
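
To illustrate why the abstract calls the implementation straightforward, here is a minimal NumPy sketch of a sample-based MMD flow with noise injection. It is not the authors' code: the Gaussian kernel, the constant noise level, and the names `gaussian_kernel`, `mmd2`, `witness_grad`, and `noisy_mmd_flow` are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix k(a_i, b_j) for a of shape (n, d) and b of shape (m, d)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2))


def mmd2(x, y, sigma):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (
        gaussian_kernel(x, x, sigma).mean()
        - 2.0 * gaussian_kernel(x, y, sigma).mean()
        + gaussian_kernel(y, y, sigma).mean()
    )


def witness_grad(z, x, y, sigma):
    """Gradient in z of f(z) = mean_i k(x_i, z) - mean_j k(y_j, z).

    x holds the current particles and y the target samples.
    For the Gaussian kernel, grad_z k(p, z) = k(p, z) * (p - z) / sigma**2.
    """
    def grad_mean_k(points):
        k = gaussian_kernel(z, points, sigma)       # shape (n_z, n_points)
        diff = points[None, :, :] - z[:, None, :]   # shape (n_z, n_points, d)
        return (k[:, :, None] * diff).mean(axis=1) / sigma**2

    return grad_mean_k(x) - grad_mean_k(y)


def noisy_mmd_flow(x0, y, sigma=1.0, step=0.1, noise=0.0, n_steps=200, seed=0):
    """Move particles x0 toward target samples y along the estimated MMD flow.

    The gradient is evaluated at noise-perturbed positions x + noise * u;
    a constant noise level is a simplifying assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        u = rng.standard_normal(x.shape)
        x = x - step * witness_grad(x + noise * u, x, y, sigma)
    return x
```

Calling `noisy_mmd_flow(x0, y, noise=0.0)` gives the unregularized particle flow, and comparing `mmd2(x0, y, sigma)` with `mmd2(x_final, y, sigma)` is one simple way to monitor progress. A decaying noise schedule could replace the constant `noise` argument.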

Code Implementations: 1 repo
