LGDSMLJul 24, 2023

A faster and simpler algorithm for learning shallow networks

arXiv:2307.12496v19 citationsh-index: 23
Originality Incremental advance
AI Analysis

This provides a more efficient and streamlined solution for learning shallow networks, which is incremental but offers practical speed-ups for machine learning practitioners.

The paper tackles the problem of learning shallow neural networks with ReLU activations from Gaussian data, showing that a simpler one-stage algorithm achieves a runtime of (d/ε)^O(k^2), improving upon prior work that required multiple stages and quasipolynomial dependence on k.

We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in $\text{poly}(d,1/\varepsilon)$ time when $k = O(1)$, where $\varepsilon$ is the target error. More precisely, their algorithm runs in time $(d/\varepsilon)^{\mathrm{quasipoly}(k)}$ and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only $(d/\varepsilon)^{O(k^2)}$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes