NE, LG · May 12, 2023

Saturated Non-Monotonic Activation Functions

arXiv:2305.07537v2 · 3 citations
AI Analysis

This work addresses activation function design for deep learning practitioners, offering incremental improvements by hybridizing existing methods.

The paper tackles the problem that non-monotonic activation functions alter positive inputs unnecessarily. It proposes Saturated Gaussian Error Linear Units, which combine ReLU's positive portion with a non-monotonic negative portion. Results on CIFAR-100 show that these functions outperform state-of-the-art baselines in image classification.

Activation functions are essential to deep learning networks. Popular and versatile activation functions are mostly monotonic, but some non-monotonic activation functions are being explored and show promising performance. By introducing non-monotonicity, however, they also alter the positive input, which the success of ReLU and its variants shows to be unnecessary. In this paper, we further develop non-monotonic activation functions and propose the Saturated Gaussian Error Linear Units by combining the characteristics of ReLU and non-monotonic activation functions. We present three new activation functions built with our proposed method: SGELU, SSiLU, and SMish, which are composed of the negative portion of GELU, SiLU, and Mish, respectively, and ReLU's positive portion. The results of image classification experiments on CIFAR-100 indicate that our proposed activation functions are highly effective and outperform state-of-the-art baselines across multiple deep learning architectures.
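The construction described in the abstract can be sketched in a few lines of Python. This is a minimal scalar illustration based only on the abstract's wording (identity for non-negative inputs, the base function for negative inputs), not the authors' reference implementation. The helper name `saturate` is hypothetical, and the GELU used here is the exact erf-based form, which is an assumption about which GELU variant the paper uses.

```python
import math


def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x: float) -> float:
    # SiLU: x * sigmoid(x), written to avoid overflow for large |x|.
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def mish(x: float) -> float:
    # Mish: x * tanh(softplus(x)), using a numerically stable softplus.
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def saturate(base):
    # Hypothetical helper: keep ReLU's positive portion (identity for x >= 0)
    # and use the base activation's negative portion for x < 0.
    def activation(x: float) -> float:
        return x if x >= 0.0 else base(x)
    return activation


sgelu = saturate(gelu)
ssilu = saturate(silu)
smish = saturate(mish)

if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 2.0):
        print(x, sgelu(x), ssilu(x), smish(x))
```

Under these assumptions, each function returns positive inputs unchanged, as ReLU does. The negative side stays non-monotonic: the output dips below zero and then rises back toward zero as the input becomes more negative.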

