FA · LG · ML · Oct 18, 2024

A Lipschitz spaces view of infinitely wide shallow neural networks

arXiv:2410.14591v1 · 3 citations · h-index: 10
Originality Highly original
AI Analysis

This work provides foundational insights for researchers in machine learning theory, addressing stability and optimization in neural networks, though it is incremental in building on existing mean field and optimal transport frameworks.

The paper tackles the theoretical analysis of infinitely wide shallow neural networks by using a mean field parametrization with signed measures and Kantorovich-Rubinstein norms, leading to compactness results and applications like representer theorems and large data limits.

We revisit the mean field parametrization of shallow neural networks, using signed measures on unbounded parameter spaces and duality pairings that take into account the regularity and growth of activation functions. This setting directly leads to the use of unbalanced Kantorovich-Rubinstein norms defined by duality with Lipschitz functions, and of spaces of measures dual to those of continuous functions with controlled growth. These allow us to make transparent the need for total variation and moment bounds or penalization to obtain existence of minimizers of variational formulations, under which we prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior. Further, the Kantorovich-Rubinstein setting enables us to combine the advantages of a completely linear parametrization and ensuing reproducing kernel Banach space framework with optimal transport insights. We showcase this synergy with representer theorems and uniform large data limits for empirical risk minimization, and in proposed formulations for distillation and fusion applications.
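
For orientation, the two objects named in the abstract can be written out as follows. This is a sketch using standard textbook forms, not the paper's exact definitions. The symbols $\Theta$ (parameter space), $\sigma$ (activation unit), $\mu$ (signed measure), and $\varphi$ (test function) are notation assumed here. The paper's own pairing additionally accounts for the growth of the activation, which this sketch omits.

```latex
% Mean field parametrization: the network is linear in the signed measure \mu.
f_\mu(x) = \int_\Theta \sigma(x;\theta)\,\mathrm{d}\mu(\theta),
\qquad
\mu = \sum_{i=1}^{n} a_i\,\delta_{\theta_i}
\;\Longrightarrow\;
f_\mu(x) = \sum_{i=1}^{n} a_i\,\sigma(x;\theta_i).

% Unbalanced (flat) Kantorovich-Rubinstein norm, defined by duality
% with bounded Lipschitz test functions.
\|\mu\|_{\mathrm{KR}}
= \sup\left\{ \int_\Theta \varphi\,\mathrm{d}\mu \;:\;
\|\varphi\|_\infty \le 1,\ \operatorname{Lip}(\varphi) \le 1 \right\}.
```

Under these assumed definitions, a finite-width network corresponds to a finitely supported $\mu$. The test functions are bounded as well as Lipschitz, so the norm stays finite for measures with nonzero total mass, which is what "unbalanced" refers to.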

