LG · ML · May 30, 2019

Function approximation by deep networks

arXiv:1905.12882v2 · 25 citations
Originality: Incremental advance
AI Analysis

This paper addresses the fundamental problem of function approximation in machine learning and provides theoretical justification for deep architectures, though it is an incremental advance that builds on existing network theory.

The paper demonstrates that deep networks outperform shallow networks in approximating functions with compositional structures, effectively mitigating the curse of dimensionality, with theoretical support from error propagation theorems.

We show that deep networks are better than shallow networks at approximating functions that can be expressed as a composition of functions described by a directed acyclic graph, because the deep networks can be designed to have the same compositional structure, while a shallow network cannot exploit this knowledge. Thus, the blessing of compositionality mitigates the curse of dimensionality. On the other hand, a theorem called good propagation of errors allows us to "lift" theorems about shallow networks to those about deep networks with an appropriate choice of norms, smoothness, etc. We illustrate this in three contexts where each channel in the deep network calculates a spherical polynomial, a non-smooth ReLU network, or another zonal function network closely related to the ReLU network.
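The idea behind propagating errors through a compositional structure can be illustrated with a small numeric sketch. This is a toy check, not the paper's theorem or its choice of norms: the binary-tree graph, the constituent functions `h1`, `h2`, `h3`, the `perturb` helper, and the Lipschitz assumption on the top node are all hypothetical choices made for illustration. If every node is approximated to within `eps` in the sup norm and the top node is L-Lipschitz in the l1 norm, the triangle inequality suggests the composed error should stay below `eps * (1 + 2 * L)`.

```python
import itertools
import math

# Hypothetical constituent functions on a binary-tree DAG (illustrative only).
def h1(a, b):
    return a * b

def h2(a, b):
    return math.tanh(a - b)

def h3(a, b):
    # 1-Lipschitz with respect to the l1 norm of its two inputs.
    return math.sin(a) + math.cos(b)

def target(x1, x2, x3, x4):
    """Compositional target f = h3(h1(x1, x2), h2(x3, x4))."""
    return h3(h1(x1, x2), h2(x3, x4))

def perturb(h, eps, phase):
    """Stand-in for a per-node approximant with sup-norm error at most eps."""
    return lambda a, b: h(a, b) + eps * math.cos(7.0 * a + 3.0 * b + phase)

def max_composed_error(eps, steps=6):
    """Largest error of the composed approximants on a grid over [0, 1]^4."""
    g1 = perturb(h1, eps, 0.0)
    g2 = perturb(h2, eps, 1.0)
    g3 = perturb(h3, eps, 2.0)
    grid = [i / (steps - 1) for i in range(steps)]
    return max(
        abs(g3(g1(x1, x2), g2(x3, x4)) - target(x1, x2, x3, x4))
        for x1, x2, x3, x4 in itertools.product(grid, repeat=4)
    )

EPS = 1e-2
LIPSCHITZ = 1.0  # assumed Lipschitz constant of the top node h3
BOUND = EPS * (1.0 + 2.0 * LIPSCHITZ)

if __name__ == "__main__":
    print("max composed error:", max_composed_error(EPS))
    print("triangle-inequality bound:", BOUND)
```

In this sketch each node only ever sees two inputs, so the per-node approximation problem stays two-dimensional even though the target depends on four variables. That is the structural point the abstract makes about matching the network to the graph.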
