LG · AI · MATH-PH · OC · ML · May 11, 2024

Interpretable global minima of deep ReLU neural networks on sequentially separable data

arXiv:2405.07098v3 · 8 citations · h-index: 3
Originality: Incremental advance
AI Analysis

The paper offers theoretical insight into interpretable global minima of deep ReLU networks, but the advance is incremental because the results cover only restricted data configurations such as sequentially separable data.

The authors explicitly construct zero-loss neural network classifiers for two training-data configurations: small, well-separated clusters for each class, and classes that are sequentially linearly separable. In the best case, the global minimizers need only Q(M+2) parameters for Q classes in ℝ^M.

We explicitly construct zero loss neural network classifiers. We write the weight matrices and bias vectors in terms of cumulative parameters, which determine truncation maps acting recursively on input space. The configurations for the training data considered are (i) sufficiently small, well separated clusters corresponding to each class, and (ii) equivalence classes which are sequentially linearly separable. In the best case, for $Q$ classes of data in $\mathbb{R}^M$, global minimizers can be described with $Q(M+2)$ parameters.
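As a rough illustration of the data assumption in (ii), the sketch below is not the authors' ReLU construction. It assumes that "sequentially linearly separable" means the classes can be ordered so that each one is split off by a single hyperplane from the classes that come after it. The function names, the toy hyperplanes, and the sample points are all hypothetical. The `parameter_count` helper simply evaluates the $Q(M+2)$ figure quoted above.

```python
# Illustrative sketch only: a decision-list view of sequential linear
# separability, NOT the truncation-map network from the paper.

def parameter_count(Q, M):
    """Best-case parameter count quoted in the abstract: Q(M+2)."""
    return Q * (M + 2)


def sequential_classify(x, hyperplanes):
    """Return the index of the first hyperplane (w, b) with w.x + b > 0.

    Assumption: hyperplane j separates class j from all later classes.
    Points rejected by every hyperplane fall into the final class.
    """
    for j, (w, b) in enumerate(hyperplanes):
        if sum(wi * xi for wi, xi in zip(w, x)) + b > 0:
            return j
    return len(hyperplanes)


# Hypothetical toy setup: three classes in R^2 ordered along the first axis.
# Class 1 sits between classes 0 and 2, so no single hyperplane isolates it,
# but peeling off class 0 first makes the remainder separable.
planes = [((1.0, 0.0), -2.0),  # class 0: x[0] > 2
          ((1.0, 0.0), 0.0)]   # class 1: x[0] > 0 (after class 0 is removed)

points = [(3.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]
labels = [sequential_classify(p, planes) for p in points]
print(labels)
print(parameter_count(3, 2))
```

The ordering matters in this toy case: applying the second hyperplane first would assign the class-0 point to class 1.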

