LG · NE · ML · Jan 15, 2020

Learning a Single Neuron with Gradient Methods

arXiv:2001.05205v3 · 73 citations
AI Analysis

This work addresses a fundamental problem in machine learning theory. It is incremental in that it builds on prior results, extending them to broader assumptions.

The paper studies learning a single neuron with gradient methods under more general assumptions than previous work. It shows that some assumptions on the input distribution and the activation function are necessary, and it proves positive guarantees under mild conditions.

We consider the fundamental problem of learning a single neuron $x \mapsto \sigma(w^\top x)$ using standard gradient methods. As opposed to previous works, which considered specific (and not always realistic) input distributions and activation functions $\sigma(\cdot)$, we ask whether a more general result is attainable, under milder assumptions. On the one hand, we show that some assumptions on the distribution and the activation function are necessary. On the other hand, we prove positive guarantees under mild assumptions, which go beyond those studied in the literature so far. We also point out and study the challenges in further strengthening and generalizing our results.
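
To make the setting concrete, here is a minimal sketch of learning a single neuron $x \mapsto \sigma(w^\top x)$ with plain gradient descent. Several choices are assumptions made only for illustration and are not taken from the abstract: the ReLU activation, standard Gaussian inputs, a squared loss, labels generated by a hypothetical target vector `v`, and the step size and iteration count. The function name `train_single_neuron` is likewise hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def train_single_neuron(X, y, w0, lr=0.1, steps=1000):
    """Gradient descent on the empirical squared loss
    (1 / 2n) * sum_i (relu(w @ x_i) - y_i)^2  (assumed objective)."""
    w = w0.copy()
    n = X.shape[0]
    for _ in range(steps):
        z = X @ w                      # pre-activations, shape (n,)
        residual = relu(z) - y         # prediction error per sample
        # The derivative of relu is the indicator z > 0 (taken as 0 at z == 0).
        grad = X.T @ (residual * (z > 0)) / n
        w -= lr * grad
    return w

# Illustrative data: Gaussian inputs and labels from a unit-norm target v.
rng = np.random.default_rng(0)
d, n = 5, 2000
v = rng.standard_normal(d)
v /= np.linalg.norm(v)
X = rng.standard_normal((n, d))
y = relu(X @ v)

# Small random initialization; w0 = 0 would give a zero gradient under ReLU.
w0 = 0.1 * rng.standard_normal(d)
w_hat = train_single_neuron(X, y, w0)
print("distance to target:", np.linalg.norm(w_hat - v))
```

The all-zeros initialization is avoided because every ReLU unit is inactive there and the gradient vanishes. This is a small instance of how the behaviour of gradient methods depends on the activation function and the input distribution.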

