N-ReLU: Zero-Mean Stochastic Extension of ReLU
This work addresses optimization robustness in deep neural networks and is aimed at practitioners, but the contribution is incremental, since it builds on existing activation functions.
The paper tackles the problem of dead neurons in ReLU networks by introducing N-ReLU, a zero-mean stochastic extension of ReLU that replaces negative activations with Gaussian noise. On MNIST, it achieves accuracy comparable to or slightly better than standard activations, with no dead neurons observed.
Activation functions are fundamental for enabling nonlinear representations in deep neural networks. However, the standard rectified linear unit (ReLU) often suffers from inactive or "dead" neurons caused by its hard zero cutoff. To address this issue, we introduce N-ReLU (Noise-ReLU), a zero-mean stochastic extension of ReLU that replaces negative activations with Gaussian noise while preserving the same expected output. This expectation-aligned formulation maintains gradient flow in inactive regions and acts as an annealing-style regularizer during training. Experiments on the MNIST dataset using both multilayer perceptron (MLP) and convolutional neural network (CNN) architectures show that N-ReLU achieves accuracy comparable to or slightly exceeding that of ReLU, LeakyReLU, PReLU, GELU, and RReLU at moderate noise levels (sigma = 0.05-0.10), with stable convergence and no dead neurons observed. These results demonstrate that lightweight Gaussian noise injection offers a simple yet effective mechanism to enhance optimization robustness without modifying network structures or introducing additional parameters.
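The forward behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `n_relu`, its parameters, the treatment of an input of exactly zero as "negative", and the choice to switch the noise off outside training are all assumptions made for illustration. The sketch covers only the forward pass; the abstract does not specify how gradients are propagated in the negative region, so that part is not modelled here.

```python
import random


def n_relu(xs, sigma=0.05, training=True, rng=None):
    """Hypothetical forward pass of a noise-ReLU over a list of pre-activations.

    Positive inputs pass through unchanged, as in ReLU. Non-positive inputs
    are replaced by a sample from N(0, sigma^2), so the expected output in
    that region is 0, matching ReLU in expectation.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for x in xs:
        if x > 0:
            out.append(x)
        elif training:
            # Zero-mean Gaussian noise replaces the hard zero cutoff.
            out.append(rng.gauss(0.0, sigma))
        else:
            # Assumption: noise is disabled at inference, reducing to plain ReLU.
            out.append(0.0)
    return out


# Empirical check of the zero-mean property for a negative input.
rng = random.Random(0)
samples = [n_relu([-1.0], sigma=0.1, rng=rng)[0] for _ in range(10000)]
mean_negative = sum(samples) / len(samples)
```

With sigma in the 0.05-0.10 range reported in the abstract, `mean_negative` should stay close to zero, while individual outputs for negative inputs fluctuate around it instead of being pinned at exactly zero.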