LGMLDec 16, 2019

Realization of spatial sparseness by deep ReLU nets with massive data

arXiv:1912.07464v126 citations
Originality Incremental advance
AI Analysis

This provides theoretical justification for the role of massive data in deep learning, which is incremental to existing work on depth and structure.

The paper tackles the problem of understanding why deep learning works well with big data by rigorously verifying the importance of massive data in learning spatially sparse and smooth functions, establishing a novel sampling theorem that shows the necessity of massive data and proving that deep nets achieve optimal learning rates.

The great success of deep learning poses urgent challenges for understanding its working mechanism and rationality. The depth, structure, and massive size of the data are recognized to be three key ingredients for deep learning. Most of the recent theoretical studies for deep learning focus on the necessity and advantages of depth and structures of neural networks. In this paper, we aim at rigorous verification of the importance of massive data in embodying the out-performance of deep learning. To approximate and learn spatially sparse and smooth functions, we establish a novel sampling theorem in learning theory to show the necessity of massive data. We then prove that implementing the classical empirical risk minimization on some deep nets facilitates in realization of the optimal learning rates derived in the sampling theorem. This perhaps explains why deep learning performs so well in the era of big data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes