LGNE Oct 15, 2020

Review and Comparison of Commonly Used Activation Functions for Deep Neural Networks

arXiv:2010.09458v1
384 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of selecting activation functions that suit a given neural network, but its contribution is incremental: it synthesizes existing knowledge without introducing new methods.

This paper reviews and compares commonly used activation functions in deep neural networks, such as swish, ReLU, and Sigmoid, by evaluating their properties, pros, and cons to provide recommendations for their application.

Activation functions are the primary decision-making units of neural networks. They determine the output of each neural node and are therefore essential to the performance of the whole network, so choosing the most appropriate activation function for a given computation is critical. Acharya et al. (2018) note that numerous formulations have been proposed over the years, though some are now considered deprecated because they fail to operate properly under certain conditions. These functions have a variety of characteristics that are deemed essential to successful learning, among them their monotonicity, their derivatives, and the finiteness of their range (Bach 2017). This paper evaluates commonly used activation functions, such as swish, ReLU, and Sigmoid, and then discusses their properties, their pros and cons, and recommendations for applying each one.
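To make the properties named in the abstract concrete, the following is a minimal sketch of the three functions it mentions, written in plain Python. It is not taken from the paper. The swish form `x * sigmoid(beta * x)` with `beta = 1.0` is the commonly cited definition and is assumed here, and the helper names `numerical_derivative` and `is_monotonic_on_grid` are hypothetical, introduced only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: monotonic, with finite range (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    """Rectified linear unit: monotonic, with range [0, +inf)."""
    return max(0.0, x)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish, assumed here as x * sigmoid(beta * x); unbounded above."""
    return x * sigmoid(beta * x)


def numerical_derivative(f, x: float, h: float = 1e-5) -> float:
    """Central-difference estimate of f'(x) (illustrative helper)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def is_monotonic_on_grid(f, lo: float = -5.0, hi: float = 5.0, steps: int = 1000) -> bool:
    """Check that f is non-decreasing on an evenly spaced grid (illustrative helper)."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [f(x) for x in xs]
    return all(b >= a for a, b in zip(ys, ys[1:]))


if __name__ == "__main__":
    for name, f in [("sigmoid", sigmoid), ("relu", relu), ("swish", swish)]:
        print(name, "monotonic on [-5, 5]:", is_monotonic_on_grid(f))
    print("sigmoid'(0) ~", numerical_derivative(sigmoid, 0.0))
```

Under these assumptions, the grid check reports sigmoid and ReLU as non-decreasing and swish as not, since swish dips slightly below zero for moderately negative inputs. This is a property check on a finite grid, not a proof.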

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
