LG ML · Dec 20, 2020

Recent advances in deep learning theory

arXiv:2012.10931v220.457 citations

Originality: Synthesis-oriented

AI Analysis

This paper provides a structured overview of theoretical foundations for researchers and practitioners in deep learning, addressing the challenge of disorganized literature.

This paper reviews and organizes recent advances in deep learning theory, categorizing the literature into six groups: complexity and capacity, stochastic differential equations, loss landscape geometry, over-parameterization, special network architectures, and ethics/security.

Deep learning is usually described as an experiment-driven field that faces continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature, which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intense concerns about ethics and security and their relationships with generalizability.

