Symmetry & Critical Points
The work addresses optimization challenges in machine learning, particularly for neural networks, by relating the symmetry of an invariant function to the structure of its critical points. The contribution is incremental in that it builds on existing theory.
The paper studies how efficiently invariant nonconvex functions, such as those associated with neural networks, can be minimized. It proves that when a symmetric critical point exists, the critical points adjacent to it are generically symmetry breaking, and it argues that this mechanism has implications for optimization.
Critical points of an invariant function may or may not be symmetric. We prove, however, that if a symmetric critical point exists, those adjacent to it are generically symmetry breaking. This mathematical mechanism is shown to carry important implications for our ability to efficiently minimize invariant nonconvex functions, in particular those associated with neural networks.
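To make the statement concrete, the following is a minimal one-dimensional sketch and is not taken from the paper. It assumes a toy function f(x) = x^4/4 - lam*x^2/2, invariant under the sign flip x -> -x, and reads "adjacent" loosely as the critical points that appear near the symmetric one for small lam > 0. The names `f`, `critical_points`, `is_symmetric`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def f(x, lam):
    """Toy invariant function (an assumption for illustration): f(-x) == f(x)."""
    return 0.25 * x**4 - 0.5 * lam * x**2


def critical_points(lam):
    """Real solutions of f'(x) = x^3 - lam * x = 0, sorted ascending."""
    roots = np.roots([1.0, 0.0, -lam, 0.0])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.unique(np.round(real, 9))


def is_symmetric(x, tol=1e-9):
    """A point is symmetric if the sign flip x -> -x leaves it fixed."""
    return abs(x - (-x)) < tol


if __name__ == "__main__":
    lam = 0.04  # small positive value, so new critical points sit close to 0
    for x in critical_points(lam):
        kind = "symmetric" if is_symmetric(x) else "symmetry breaking"
        print(f"x = {x:+.3f}  f = {f(x, lam):+.6f}  ({kind})")
```

In this toy case the only symmetric critical point is x = 0, and the two critical points that lie near it for small positive `lam` are exchanged by the sign flip instead of being fixed by it. The paper's result concerns far more general invariant functions; this sketch only shows the flavor of the phenomenon in the simplest setting.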