LG · ML · May 25, 2023

Most Neural Networks Are Almost Learnable

arXiv:2305.16508v3
Originality: Highly original
AI Analysis

This paper gives a theoretical guarantee that most (that is, random) constant-depth neural networks can be learned efficiently, addressing a fundamental challenge in machine learning theory.

The authors address the problem of learning random constant-depth neural networks by presenting a polynomial-time approximation scheme (PTAS). For any fixed additive error ε, the algorithm's time and sample complexity are polynomial in the network size. For some sigmoid and ReLU-like activations, the dependence on ε improves enough to yield a quasi-polynomial-time algorithm.

We present a PTAS for learning random constant-depth networks. We show that for any fixed $\epsilon>0$ and depth $i$, there is a poly-time algorithm that for any distribution on $\sqrt{d} \cdot \mathbb{S}^{d-1}$ learns random Xavier networks of depth $i$, up to an additive error of $\epsilon$. The algorithm runs in time and sample complexity of $(\bar{d})^{\mathrm{poly}(\epsilon^{-1})}$, where $\bar{d}$ is the size of the network. For some cases of sigmoid and ReLU-like activations the bound can be improved to $(\bar{d})^{\mathrm{polylog}(\epsilon^{-1})}$, resulting in a quasi-poly-time algorithm for learning constant depth random networks.
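To make the setting in the abstract concrete, the following is a minimal NumPy sketch of the objects being learned: inputs on the sphere $\sqrt{d} \cdot \mathbb{S}^{d-1}$ and a random constant-depth network whose output serves as the target. It is not the paper's learning algorithm. Several choices here are assumptions rather than details taken from the source: the function names are hypothetical, "Xavier" is taken to mean i.i.d. Gaussian weights with variance 1/fan_in, the input distribution is uniform on the sphere (the abstract allows any distribution), the activation is tanh, and the last layer is linear.

```python
import numpy as np


def sample_sphere_inputs(n, d, rng):
    """Draw n points uniformly from the sphere of radius sqrt(d) in R^d.

    Uniform sampling is an assumption for illustration; the abstract
    allows any distribution supported on this sphere.
    """
    x = rng.standard_normal((n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)  # project onto the unit sphere
    return np.sqrt(d) * x                          # rescale to radius sqrt(d)


def sample_xavier_network(widths, rng):
    """Sample one weight matrix per layer with i.i.d. N(0, 1/fan_in) entries.

    This variance convention is an assumed reading of "Xavier network".
    """
    return [
        rng.standard_normal((fan_out, fan_in)) / np.sqrt(fan_in)
        for fan_in, fan_out in zip(widths[:-1], widths[1:])
    ]


def forward(weights, x, activation=np.tanh):
    """Evaluate the network: activation after every layer except the last."""
    h = x
    for w in weights[:-1]:
        h = activation(h @ w.T)
    return h @ weights[-1].T


rng = np.random.default_rng(0)
d = 64
widths = [d, 128, 128, 1]                 # input dim, two hidden layers, scalar output
weights = sample_xavier_network(widths, rng)
x = sample_sphere_inputs(1000, d, rng)    # inputs on sqrt(d) * S^{d-1}
y = forward(weights, x)                   # labels produced by the random target network
```

A learner in this setting would receive the pairs `(x, y)` and be asked to output a predictor whose error is within an additive $\epsilon$; the sketch only generates such data.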

