NELG, Mar 18, 2020

A Survey on Activation Functions and their relation with Xavier and He Normal Initialization

arXiv:2004.06632v1, 94 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper that synthesizes existing knowledge on activation functions and initialization methods for researchers in deep learning.

This survey examines the properties that make activation functions effective in neural networks and explores the fundamental connections between activation functions and the widely used Xavier and He normal initialization methods, covering functions like sigmoid, tanh, ReLU, LReLU, and PReLU.

In an artificial neural network, the activation function and the weight initialization method play important roles in training and performance. The question that arises is which properties of a function are important or necessary for it to perform well as an activation function. In addition, the two most widely used weight initialization methods, Xavier and He normal initialization, have a fundamental connection with the activation function. This survey discusses the important or necessary properties of an activation function and the most widely used activation functions (sigmoid, tanh, ReLU, LReLU and PReLU). It also explores the relationship between these activation functions and the two weight initialization methods, Xavier and He normal initialization.
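As a rough illustration of the objects the abstract names, the sketch below writes out the five activation functions and the standard deviations commonly cited for Xavier normal and He normal initialization. It assumes the usual textbook definitions, sqrt(2 / (fan_in + fan_out)) for Xavier and sqrt(2 / fan_in) for He; the survey's own notation may differ. All function names, the default slope of 0.01 for LReLU, and the `init_weights` helper are hypothetical choices made for this example.

```python
import math
import random

# Activation functions on scalars (commonly used definitions).
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Fixed small slope for negative inputs; 0.01 is an assumed default.
    return x if x > 0 else alpha * x

def prelu(x, alpha):
    # Same form as leaky ReLU, but alpha is a parameter learned in training.
    return x if x > 0 else alpha * x

# Standard deviations of the zero-mean normal used by each initializer.
def xavier_normal_std(fan_in, fan_out):
    # Usually paired with sigmoid or tanh.
    return math.sqrt(2.0 / (fan_in + fan_out))

def he_normal_std(fan_in):
    # Usually paired with ReLU-family activations.
    return math.sqrt(2.0 / fan_in)

def init_weights(fan_in, fan_out, std, seed=0):
    # Hypothetical helper: a fan_in x fan_out matrix drawn from N(0, std^2).
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]
```

For a layer with 100 inputs and 100 outputs, `xavier_normal_std(100, 100)` is about 0.1, while `he_normal_std(100)` is about 0.141. The He variant is larger by a factor of sqrt(2) when fan-in equals fan-out, which is usually explained as compensating for ReLU zeroing roughly half of its inputs.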

