Geometric Understanding of Deep Learning
This work addresses a foundational problem for the AI/ML community by providing a theoretical framework to explain deep learning's effectiveness, though it appears incremental in building on existing geometric and manifold theories.
The paper tackles the lack of theoretical understanding in deep learning by proposing a geometric view: deep learning succeeds by learning low-dimensional manifolds and the probability distributions on them. It also introduces concepts such as rectified linear complexity to analyze learning capabilities and limitations.
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, and speech recognition. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, how it works remains poorly understood, and laying down a theoretical foundation for deep learning is of central importance. In this work, we give a geometric view of deep learning: we show that the fundamental principle behind its success is the manifold structure in data, namely that natural high-dimensional data concentrates close to a low-dimensional manifold, and that deep learning learns the manifold and the probability distribution on it. We further introduce the rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedding manifold, which describes how difficult the manifold is to learn. We then show that for any deep neural network with a fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose applying optimal mass transportation theory to control the probability distribution in the latent space.
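The abstract does not state the formal definition of rectified linear complexity. As a rough illustration only, assume it is tied to the number of linear pieces a ReLU network can realize. The sketch below counts the distinct activation patterns of a hypothetical one-hidden-layer ReLU network on a scalar input; each pattern corresponds to one linear piece of the network's output. The function names, weights, and grid are illustrative assumptions and are not taken from the paper.

```python
def activation_pattern(x, weights, biases):
    """Return which hidden ReLU units are active at scalar input x."""
    return tuple(w * x + b > 0 for w, b in zip(weights, biases))


def count_linear_pieces(weights, biases, lo=-3.0, hi=3.0, steps=600):
    """Count distinct activation patterns seen on a uniform grid over [lo, hi].

    This is an empirical count for one fixed set of parameters on a finite
    grid, so it can only underestimate the true number of pieces.
    """
    patterns = set()
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        patterns.add(activation_pattern(x, weights, biases))
    return len(patterns)


# Hypothetical toy network: three hidden units with breakpoints at x = 0, 1, 2.
weights = [1.0, 1.0, -1.0]
biases = [0.0, -1.0, 2.0]
pieces = count_linear_pieces(weights, biases)
print(pieces)  # 4: three breakpoints split the line into four pieces
```

With n hidden units and a scalar input, such a network has at most n + 1 linear pieces, which gives some intuition for the abstract's claim that a fixed architecture cannot learn every manifold: a curve that needs more pieces than the network can produce is out of reach under this simplified reading.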