The Global Landscape of Neural Networks: An Overview
This is an incremental overview article that synthesizes existing research on neural network loss landscapes, primarily for researchers in machine learning theory.
The paper reviews recent findings on the global landscape of neural networks, addressing concerns about non-convex loss functions by summarizing results on sub-optimal local minima, geometric properties like 'no bad basin', and empirical visualizations for practical networks.
One of the major concerns for neural network training is that the non-convexity of the associated loss functions may cause a bad landscape. The recent success of neural networks suggests that their loss landscape is not too bad, but what specific results do we know about the landscape? In this article, we review recent findings and results on the global landscape of neural networks. First, we point out that wide neural nets may have sub-optimal local minima under certain assumptions. Second, we discuss a few rigorous results on the geometric properties of wide networks, such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity. Third, we discuss visualization and empirical explorations of the landscape for practical neural nets. Finally, we briefly discuss some convergence results and their relation to landscape results.