LG
Apr 25, 2022

Trainable Compound Activation Functions for Machine Learning

arXiv:2204.12920v1
2 citations
h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the limitations of the simple activation functions commonly used in machine learning, offering an incremental improvement by combining existing methods into a hybrid approach.

The paper tackles the limitation of simple activation functions in neural networks by introducing trainable compound activation functions (TCAs). TCAs improve network effectiveness with fewer parameters than adding layers would require, and they enhance generative models by better estimating the marginal distribution of each data dimension. Experiments on RBMs, DBNs, PBNs, and VAEs demonstrate improved performance.

Activation functions (AF) are necessary components of neural networks that allow approximation of functions, but AFs in current use are usually simple monotonically increasing functions. In this paper, we propose trainable compound AF (TCA) composed of a sum of shifted and scaled simple AFs. TCAs increase the effectiveness of networks with fewer parameters compared to added layers. TCAs have a special interpretation in generative networks because they effectively estimate the marginal distributions of each dimension of the data using a mixture distribution, reducing modality and making linear dimension reduction more effective. When used in restricted Boltzmann machines (RBMs), they result in a novel type of RBM with mixture-based stochastic units. Improved performance is demonstrated in experiments using RBMs, deep belief networks (DBN), projected belief networks (PBN), and variational auto-encoders (VAE).
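The idea of a compound AF built as "a sum of shifted and scaled simple AFs" can be illustrated with a minimal sketch. The parameter names (`weights`, `scales`, `shifts`), the choice of the sigmoid as the base AF, and the exact parameterization are assumptions made for illustration; the paper's own formulation may differ, and in practice the parameters would be learned by gradient descent rather than set by hand.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function, used here as the simple base AF.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tca(x, weights, scales, shifts, base=sigmoid):
    # Hypothetical compound activation: a weighted sum of shifted and
    # scaled copies of one simple base activation function.
    # weights, scales, and shifts stand in for the trainable parameters.
    return sum(w * base(s * x + b) for w, s, b in zip(weights, scales, shifts))

# One component with unit weight and scale and zero shift is the plain sigmoid.
single = tca(0.0, [1.0], [1.0], [0.0])

# Two steep components shifted apart give a two-step "staircase" shape.
def staircase(x):
    return tca(x, [0.5, 0.5], [4.0, 4.0], [-12.0, 12.0])
```

With nonnegative weights that sum to one and positive scales, a sum of sigmoids like `staircase` is the cumulative distribution function of a mixture of logistic distributions, which is one way to read the mixture interpretation mentioned in the abstract.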
