Formalising the Use of the Activation Function in Neural Inference

arXiv:2102.04896v3
AI Analysis

This work offers a theoretical justification for the use and performance of activation functions in neural networks, and is likely to interest researchers in theoretical machine learning and neuroscience.

This paper investigates the role of activation functions in neural networks by modeling biological neural firing as a phase transition from statistical physics. It shows that the artificial neuron is a mean-field model of biological membrane dynamics, which allows the authors to formalize the activation function's role in perceptron learning and to recover some known activation functions as special cases.

We investigate how the activation function can be used to describe neural firing in an abstract way, and in turn, why it works well in artificial neural networks. We discuss how a spike in a biological neurone belongs to a particular universality class of phase transitions in statistical physics. We then show that the artificial neurone is, mathematically, a mean field model of biological neural membrane dynamics, which arises from modelling spiking as a phase transition. This allows us to treat selective neural firing in an abstract way, and formalise the role of the activation function in perceptron learning. The resultant statistical physical model allows us to recover the expressions for some known activation functions as various special cases. Along with deriving this model and specifying the analogous neural case, we analyse the phase transition to understand the physics of neural network learning. Together, it is shown that there is not only a biological meaning, but a physical justification, for the emergence and performance of typical activation functions; implications for neural learning and inference are also discussed.
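
To make the idea of recovering activation functions from statistical physics concrete, the sketch below uses textbook results rather than the paper's own derivation, which is not reproduced on this page. It assumes a single two-state unit in thermal equilibrium and a standard mean-field self-consistency equation. The function names, the `beta` and `coupling` parameters, and the choice of energies are illustrative assumptions.

```python
import math


def two_state_on_probability(x, beta=1.0):
    """Boltzmann probability that a two-state unit is 'on'.

    Assumed energies: 'off' has energy 0, 'on' has energy -x.
    P(on) = exp(beta*x) / (1 + exp(beta*x)), which is the logistic sigmoid.
    """
    return 1.0 / (1.0 + math.exp(-beta * x))


def mean_spin(h, beta=1.0):
    """Thermal average of a single +/-1 spin in an external field h.

    Weighting the two states by their Boltzmann factors gives tanh(beta*h).
    """
    up = math.exp(beta * h)
    down = math.exp(-beta * h)
    return (up - down) / (up + down)


def mean_field_magnetisation(h, coupling, beta=1.0, iterations=200):
    """Fixed-point iteration of m = tanh(beta * (coupling * m + h)).

    This is the usual mean-field (Curie-Weiss style) self-consistency
    equation; with coupling = 0 it reduces to a plain tanh activation.
    """
    m = 0.0
    for _ in range(iterations):
        m = math.tanh(beta * (coupling * m + h))
    return m


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, two_state_on_probability(x), mean_spin(x))
```

Under these assumptions the logistic sigmoid appears as an occupation probability and tanh as an average spin, and the two are related by tanh(h) = 2*sigmoid(2h) - 1. Whether the paper arrives at them by the same route is not stated in the abstract.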

