A Modified Activation Function with Improved Run-Times For Neural Networks
This work addresses training efficiency for neural network practitioners, but appears incremental as it modifies an existing activation function rather than introducing a fundamentally new approach.
The paper tackles vanishing gradients and slow training in neural networks by proposing a modified hyperbolic tangent activation function that uses an integer approximation of Euler's number and adaptive normalization of the input activations. The reported results show lower run-times, training speed-ups, and improved accuracies on both hypothetical and real-world datasets.
In this paper, we present a modified version of the hyperbolic tangent activation function as a learning unit generator for neural networks. The function uses an integer calibration constant as an approximation to Euler's number, e, based on a quadratic Real Number Formula (RNF) algorithm, together with an adaptive normalization constraint on the input activations to avoid vanishing gradients. We demonstrate the effectiveness of the proposed modification on a hypothetical and a real-world dataset, and show that learning algorithms using this function achieve lower run-times, leading to improved speed-ups and learning accuracies during training.
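The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the integer constant, the RNF algorithm, or the exact normalization rule, so the default `base=3`, the max-magnitude scaling in `normalize`, and all function names here are assumptions chosen only for illustration. The sketch replaces e in the tanh formula with an integer base b, which is mathematically the same as tanh(x ln b), and rescales the inputs into [-1, 1] so they stay away from the flat, saturated tails of the function.

```python
def modified_tanh(x, base=3):
    """Tanh-like activation with an integer base in place of e.

    Computes (b**x - b**-x) / (b**x + b**-x) in a form that cannot overflow.
    The default base=3 is a hypothetical choice, not the paper's constant.
    """
    q = float(base) ** (-2.0 * abs(x))   # lies in (0, 1], underflows safely to 0.0
    y = (1.0 - q) / (1.0 + q)
    return y if x >= 0 else -y           # the function is odd, like tanh


def normalize(xs):
    """Scale inputs into [-1, 1] by their largest magnitude.

    This is an assumed stand-in for the paper's adaptive normalization.
    """
    peak = max(abs(v) for v in xs)
    if peak == 0.0:
        return list(xs)
    return [v / peak for v in xs]


def activate(xs, base=3):
    """Normalize a batch of pre-activations, then apply the modified tanh."""
    return [modified_tanh(v, base) for v in normalize(xs)]


if __name__ == "__main__":
    print(activate([2.0, -4.0, 0.0]))
```

Because the function equals tanh(x ln b), its gradient is ln(b) * (1 - y^2); keeping the normalized inputs within [-1, 1] bounds |y| away from 1, which is one plausible reading of how the normalization constraint helps avoid vanishing gradients.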