Permutative redundancy and uncertainty of the objective in deep learning
This work tackles a foundational problem in deep learning optimization, one that affects all practitioners, by highlighting inefficiencies inherent in traditional architectures. The contribution appears incremental, since it builds on known symmetry issues.
The paper addresses the abundance of equivalent optima that permutative symmetry creates in deep learning, together with the effect of an uncertain objective. It shows that traditional architectures are polluted by an astronomical number of equivalent global and local optima, that uncertainty of the objective makes local optima unattainable, and that the global optimization landscape likely becomes a tangled web of valleys and ridges as network size increases. It discusses remedies such as forced pre-pruning and modular bio-inspired architectures to reduce these issues.
Implications of uncertain objective functions and permutative symmetry of traditional deep learning architectures are discussed. It is shown that traditional architectures are polluted by an astronomical number of equivalent global and local optima. Uncertainty of the objective makes local optima unattainable, and, as the size of the network grows, the global optimization landscape likely becomes a tangled web of valleys and ridges. Remedies that reduce or eliminate ghost optima are discussed, including forced pre-pruning, re-ordering, ortho-polynomial activations, and modular bio-inspired architectures.
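The permutative symmetry described above can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a fully connected network with one hidden layer and a tanh activation; the helper names (`forward`, `permute_hidden`, `count_permutation_copies`) and the layer sizes are hypothetical. Reordering the hidden units, together with the matching rows and columns of the adjacent weight matrices, leaves the output unchanged, so every optimum has permuted copies. For generic weights, the number of copies is the product of the factorials of the hidden layer widths.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP with tanh activation, written with plain lists."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1, entries of b1, columns of W2."""
    W1p = [W1[j] for j in perm]
    b1p = [b1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, b1p, W2p


def count_permutation_copies(hidden_widths):
    """Equivalent weight settings under hidden-unit permutation.

    Assumes generic weights (no two hidden units in a layer are identical).
    """
    return math.prod(math.factorial(n) for n in hidden_widths)


# Illustrative sizes only; these are assumptions, not values from the paper.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 5, 2
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

perm = list(range(n_hidden))
rng.shuffle(perm)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

y = forward(x, W1, b1, W2, b2)
yp = forward(x, W1p, b1p, W2p, b2)

# The outputs agree up to floating-point rounding from the changed sum order.
max_diff = max(abs(a - b) for a, b in zip(y, yp))
print("max output difference:", max_diff)
print("copies for one hidden layer of 5 units:", count_permutation_copies([5]))
print("digits in the count for two layers of 100 units:",
      len(str(count_permutation_copies([100, 100]))))
```

The count covers permutations only. Any further symmetry of a particular activation function would add more equivalent copies on top of it.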