LG | AI | NE | Jun 28, 2023

Empirical Loss Landscape Analysis of Neural Network Activation Functions

arXiv:2306.16090v1 | 5 citations | h-index: 61 | Has Code
Originality: Incremental advance
AI Analysis

The paper offers insights for neural architecture and training design by linking activation functions to loss landscape properties, but the advance is incremental because it builds on prior work.

This study empirically analyzed how three activation functions (tanh, ReLU, ELU) affect neural network loss landscapes. It found that ReLU yields the most convex landscape, while ELU yields the least flat landscape and generalizes best.

Activation functions play a significant role in neural network design by enabling non-linearity. The choice of activation function was previously shown to influence the properties of the resulting loss landscape. Understanding the relationship between activation functions and loss landscape properties is important for neural architecture and training algorithm design. This study empirically investigates neural network loss landscapes associated with hyperbolic tangent, rectified linear unit, and exponential linear unit activation functions. Rectified linear unit is shown to yield the most convex loss landscape, and exponential linear unit is shown to yield the least flat loss landscape, and to exhibit superior generalisation performance. The presence of wide and narrow valleys in the loss landscape is established for all activation functions, and the narrow valleys are shown to correlate with saturated neurons and implicitly regularised network configurations.
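For readers less familiar with the three activation functions compared in the abstract, the following is a minimal NumPy sketch of their standard definitions and derivatives. It is not taken from the paper's code. The `alpha=1.0` default for ELU, the helper name `saturated_fraction`, and the threshold `tol` are illustrative assumptions. The saturation proxy (a near-zero derivative) is one common way to flag saturated units and may differ from the measure the authors use.

```python
import numpy as np

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2, which vanishes for large |x|.
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    # Rectified linear unit: max(0, x).
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 for x > 0, otherwise 0 (0 is chosen at x = 0).
    return (np.asarray(x) > 0).astype(float)

def elu(x, alpha=1.0):
    # Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise.
    # alpha = 1.0 is a common default, assumed here for illustration.
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def elu_grad(x, alpha=1.0):
    # Derivative of ELU: 1 for x > 0, alpha * exp(x) otherwise.
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0, alpha * np.exp(np.minimum(x, 0.0)))

def saturated_fraction(pre_activations, grad_fn, tol=1e-3):
    # Hypothetical helper: fraction of units whose derivative magnitude
    # falls below tol, used here as a rough proxy for saturation.
    grads = np.abs(grad_fn(np.asarray(pre_activations, dtype=float)))
    return float(np.mean(grads < tol))
```

Calling `saturated_fraction(z, tanh_grad)` on a layer's pre-activations `z` gives a rough sense of how many units sit in the flat tails of tanh. Swapping in `relu_grad` or `elu_grad` gives the analogous figure for the other two functions.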

Code Implementations: 1 repo
