LG · OC · Apr 13, 2022

Approximation of Lipschitz Functions using Deep Spline Neural Networks

arXiv:2204.06233v1 · 19 citations · h-index: 104
Originality: Highly original
AI Analysis

The paper addresses a bottleneck in machine-learning applications that require Lipschitz constraints, showing theoretically that learnable spline activations are more expressive than ReLU in this setting.

The paper tackles the challenge of designing expressive Lipschitz-constrained neural networks by proposing learnable spline activation functions with at least 3 linear regions. It proves that this choice is optimal among all component-wise 1-Lipschitz activation functions and at least as expressive as the Groupsort activation under spectral-norm-constrained weights.

Lipschitz-constrained neural networks have many applications in machine learning. Since designing and training expressive Lipschitz-constrained networks is very challenging, there is a need for improved methods and a better theoretical understanding. Unfortunately, it turns out that ReLU networks have provable disadvantages in this setting. Hence, we propose to use learnable spline activation functions with at least 3 linear regions instead. We prove that this choice is optimal among all component-wise $1$-Lipschitz activation functions in the sense that no other weight-constrained architecture can approximate a larger class of functions. Additionally, this choice is at least as expressive as the recently introduced non-component-wise Groupsort activation function for spectral-norm-constrained weights. Previously published numerical results support our theoretical findings.
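
To make the objects in the abstract concrete, the following is a minimal NumPy sketch of a component-wise linear-spline activation with 3 linear regions whose slopes are clipped to [-1, 1], which keeps it 1-Lipschitz. A Groupsort function with group size 2 is included for comparison. The function names, the knot/slope parameterization, and the clipping step are assumptions made for illustration; the abstract does not specify how the paper parameterizes or trains its splines.

```python
import numpy as np

def spline_activation(x, knots, slopes, bias=0.0):
    """Continuous piecewise-linear function with len(knots) + 1 linear regions.

    knots  : increasing breakpoints (2 knots give 3 linear regions)
    slopes : one slope per region; clipped to [-1, 1] so the result is 1-Lipschitz
    bias   : value of the function at knots[0]
    (Hypothetical parameterization, chosen for illustration only.)
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    slopes = np.clip(np.asarray(slopes, dtype=float), -1.0, 1.0)
    # Line of the leftmost region, passing through (knots[0], bias).
    y = bias + slopes[0] * (x - knots[0])
    # To the right of each knot, the slope changes by slopes[i+1] - slopes[i].
    for k, ds in zip(knots, np.diff(slopes)):
        y = y + ds * np.maximum(x - k, 0.0)
    return y

def groupsort(x, group_size=2):
    """Sort the entries of a 1-D vector within consecutive groups.

    Assumes len(x) is divisible by group_size.
    """
    x = np.asarray(x, dtype=float)
    return np.sort(x.reshape(-1, group_size), axis=1).reshape(x.shape)

# Slopes (1, 0, 1) with knots (-1, 1) give soft-thresholding with threshold 1.
print(spline_activation([-2.0, 0.0, 3.0], knots=[-1.0, 1.0], slopes=[1.0, 0.0, 1.0]))
```

With slopes (1, 0, 1) the spline reproduces soft-thresholding, a function with three linear regions. In a network, the knots and slopes would be learnable parameters, and the linear layers would additionally need a norm constraint (the abstract mentions spectral-norm-constrained weights) for the whole network to remain 1-Lipschitz.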

