Dynamical loss functions shape landscape topography and improve learning in artificial neural networks
This work addresses the challenge of optimizing neural network training by modifying the loss landscape during training, offering an incremental improvement for machine learning practitioners.
The paper introduces dynamical loss functions, in which the contribution of each class to the loss oscillates periodically, and reports that they significantly improve validation accuracy for networks of varying sizes on a simple classification task.
Dynamical loss functions are derived from standard loss functions used in supervised classification tasks, but are modified so that the contribution from each class periodically increases and decreases. These oscillations globally alter the loss landscape without affecting the global minima. In this paper, we demonstrate how to transform cross-entropy and mean squared error into dynamical loss functions. We begin by discussing how increasing the size of the neural network or the learning rate affects the depth and sharpness of the minima that the system explores. Building on this intuition, we propose several versions of dynamical loss functions and show, on a simple classification problem, that they significantly improve validation accuracy for networks of varying sizes. Finally, we explore how the landscape of these dynamical loss functions evolves during training, highlighting the emergence of instabilities that may be linked to edge-of-instability minimization.
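As a rough illustration of the idea, the sketch below turns cross-entropy into a loss whose per-class weights oscillate with the training step. The abstract does not specify the oscillation schedule, so everything about it here is an assumption made for illustration: the triangle-wave shape, the classes taking turns, the normalization that keeps the weights summing to the number of classes, and the names `class_weights`, `dynamical_cross_entropy`, `period`, and `amplitude`. Because every weight stays strictly positive, a set of parameters that drives the per-example loss to zero also drives the weighted loss to zero, which is consistent with the claim that the global minima are unaffected.

```python
import numpy as np


def class_weights(t, num_classes, period, amplitude):
    """Hypothetical schedule: classes take turns being emphasized.

    During each turn of length `period`, the active class weight rises
    linearly from 1 to `amplitude` and falls back to 1 (triangle wave).
    """
    phase = (t % (num_classes * period)) / period  # value in [0, num_classes)
    active = int(phase)                            # class currently emphasized
    frac = phase - active                          # position within the turn, [0, 1)
    tri = 1.0 - abs(2.0 * frac - 1.0)              # triangle wave: 0 -> 1 -> 0
    w = np.ones(num_classes)
    w[active] = 1.0 + (amplitude - 1.0) * tri
    # Assumed normalization: weights always sum to num_classes.
    return w * num_classes / w.sum()


def dynamical_cross_entropy(logits, labels, t, period=100, amplitude=5.0):
    """Cross-entropy with time-dependent class weights (illustrative only)."""
    num_classes = logits.shape[1]
    w = class_weights(t, num_classes, period, amplitude)
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class for each example.
    nll = -log_probs[np.arange(len(labels)), labels]
    # Each example is scaled by the current weight of its true class.
    return float(np.mean(w[labels] * nll))


if __name__ == "__main__":
    logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
    labels = np.array([0, 2])
    for step in (0, 25, 50, 150):
        print(step, class_weights(step, 3, 100, 5.0),
              dynamical_cross_entropy(logits, labels, step))
```

In this sketch the loss coincides with ordinary cross-entropy whenever all weights equal 1 (for example at `t = 0`), and departs from it most strongly in the middle of each class's turn. A mean squared error variant could be built the same way by scaling each example's squared error by the weight of its true class.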