Statistical Foundation Behind Machine Learning and Its Impact on Computer Vision
This paper provides a theoretical perspective on current deep learning practices, but its contribution is incremental: it revisits existing statistical foundations without presenting new empirical results.
This paper revisits the principle of uniform convergence in statistical learning to understand the core problem in deep learning, showing that large-scale pre-training in computer vision aims to reduce the gap between empirical and expected loss.
This paper revisits the principle of uniform convergence in statistical learning, discusses how it serves as the foundation of machine learning, and attempts to clarify the essential problem that current deep learning algorithms are solving. Using computer vision as an example domain, the discussion shows that the recent trend of pre-training representations on increasingly large-scale data largely aims to reduce the discrepancy between a practically tractable empirical loss and the ultimately desired but intractable expected loss. The paper also suggests a few future research directions, predicts that the amount of data used will continue to increase, and argues that more fundamental research is needed on the robustness, interpretability, and reasoning capabilities of machine learning, pursued by incorporating structure and knowledge.
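To make the gap between empirical and expected loss concrete, the following is a minimal sketch and is not taken from the paper. It assumes a toy setting chosen purely for illustration: inputs drawn uniformly from [0, 1], noiseless labels given by a threshold at 0.5, and a finite class of 21 threshold classifiers. The names `sup_gap` and `hoeffding_bound` are hypothetical. The bound used is the standard Hoeffding plus union-bound result for a finite hypothesis class with losses in [0, 1], which may differ from the formulation the paper itself adopts.

```python
import math
import random


def sup_gap(n, thresholds, rng):
    """Largest |empirical - expected| 0-1 loss over a finite class of
    threshold classifiers h_t(x) = [x >= t], using n i.i.d. samples."""
    # Toy distribution (assumption): x ~ Uniform[0, 1], label y = [x >= 0.5].
    xs = [rng.random() for _ in range(n)]
    worst = 0.0
    for t in thresholds:
        # h_t disagrees with the label exactly when x lies between t and 0.5,
        # so its expected 0-1 loss under the toy distribution is |t - 0.5|.
        expected = abs(t - 0.5)
        errors = sum((x >= t) != (x >= 0.5) for x in xs)
        empirical = errors / n
        worst = max(worst, abs(empirical - expected))
    return worst


def hoeffding_bound(n, num_hypotheses, delta):
    """Finite-class uniform convergence bound for losses in [0, 1]:
    with probability at least 1 - delta, the largest gap over the class
    is at most sqrt(log(2 * |H| / delta) / (2 * n))."""
    return math.sqrt(math.log(2 * num_hypotheses / delta) / (2 * n))


thresholds = [i / 20 for i in range(21)]  # hypothetical hypothesis class
rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
for n in (100, 1000, 10000):
    gap = sup_gap(n, thresholds, rng)
    bound = hoeffding_bound(n, len(thresholds), delta=0.05)
    print(f"n={n:6d}  sup gap={gap:.4f}  bound={bound:.4f}")
```

The bound shrinks at a rate of one over the square root of n, which is one way to read the abstract's argument: with more data, the tractable empirical loss becomes a more reliable stand-in for the intractable expected loss, uniformly over the hypothesis class.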