The Landscape of Non-convex Empirical Risk with Degenerate Population Risk
This work removes a theoretical bottleneck for machine learning researchers working on degenerate non-convex optimization problems, though the contribution is incremental in that it extends existing landscape analysis.
The paper analyzes non-convex empirical risk landscapes whose population risk is degenerate, establishing a correspondence between the critical points of the two without requiring the strongly Morse assumption. It then applies this theory to matrix sensing and phase retrieval, inferring the empirical risk landscape from the population one.
The landscape of empirical risk has been widely studied for a range of machine learning problems, including low-rank matrix factorization, matrix sensing, matrix completion, and phase retrieval. In this work, we focus on the situation where the corresponding population risk is a degenerate non-convex loss function, that is, the Hessian of the population risk can have zero eigenvalues. Instead of analyzing the non-convex empirical risk directly, we first study the landscape of the corresponding population risk, which is usually easier to characterize, and then build a connection between the landscape of the empirical risk and that of its population risk. In particular, we establish a correspondence between the critical points of the empirical risk and those of its population risk without the strongly Morse assumption, which is required in the existing literature but not satisfied in degenerate scenarios. We also apply the theory to matrix sensing and phase retrieval to demonstrate how to infer the landscape of the empirical risk from that of the corresponding population risk.
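To make "degenerate" concrete, the following is a minimal numerical sketch and is not taken from the paper. It assumes the population risk has the common low-rank factorization form g(U) = (1/4) * ||U U^T - X*||_F^2 with X* = U* U*^T, which is an illustrative choice rather than the paper's exact setup. The helper name `hessian_matrix` and the sizes `n`, `r` are hypothetical. Because g(U) = g(UR) for any orthogonal R, the Hessian at the global minimizer U* should vanish along directions U* S with S skew-symmetric, so the strongly Morse assumption (a nonsingular Hessian at every critical point) cannot hold.

```python
import numpy as np

def hessian_matrix(U, X_star):
    """Exact Hessian of g(U) = 0.25 * ||U U^T - X_star||_F^2.

    Returned as an (n*r) x (n*r) matrix acting on row-major flattened
    directions. Assumed illustrative risk, not the paper's definition.
    """
    n, r = U.shape
    H = np.zeros((n * r, n * r))
    for k in range(n * r):
        # k-th standard basis direction, reshaped to the shape of U
        D = np.zeros(n * r)
        D[k] = 1.0
        D = D.reshape(n, r)
        # Directional derivative of the gradient (U U^T - X_star) U along D
        HD = (D @ U.T + U @ D.T) @ U + (U @ U.T - X_star) @ D
        H[:, k] = HD.reshape(-1)
    # Symmetrize to remove floating-point asymmetry before eigvalsh
    return 0.5 * (H + H.T)

rng = np.random.default_rng(0)
n, r = 6, 3                          # hypothetical problem sizes
U_star = rng.standard_normal((n, r))
X_star = U_star @ U_star.T

eigvals = np.linalg.eigvalsh(hessian_matrix(U_star, X_star))
tol = 1e-8 * eigvals.max()
num_zero = int(np.sum(np.abs(eigvals) < tol))
print("numerically zero Hessian eigenvalues at U*:", num_zero)
print("dimension of skew-symmetric r x r matrices:", r * (r - 1) // 2)
```

Under these assumptions the count of numerically zero eigenvalues should match r(r-1)/2, the dimension of the rotation directions, while the remaining eigenvalues stay positive. This is the kind of flat direction that a landscape analysis without the strongly Morse assumption has to accommodate.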