NE Oct 19, 2018

Leveraging Product as an Activation Function in Deep Networks

arXiv:1810.08578v1
Originality: Incremental advance
AI Analysis

This paper addresses a training bottleneck for neural networks with product units, offering an incremental improvement for machine learning practitioners.

The paper tackles the difficulty of training product unit neural networks (PUNNs) by introducing windowed product unit neural networks (WPUNNs), which use the product as a nonlinearity and window it to tame gradients. Reported performance is comparable to ReLU networks on MNIST and to LSTM networks in recurrent settings.

Product unit neural networks (PUNNs) are powerful representational models with a strong theoretical basis, but they have proven difficult to train with gradient-based optimizers. We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. Windowing the product tames the complex gradient surface and enables WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use product layers between traditional sum layers, capturing the representational power of product units and using the product itself as a nonlinearity. We find that this method works as well as traditional nonlinearities like ReLU on the MNIST dataset. We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks.
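
The abstract describes product layers placed between ordinary sum layers, with the product restricted to a window. The sketch below illustrates that idea in plain Python. The abstract does not specify the windowing scheme, so the choice here (non-overlapping groups of adjacent units) is an assumption, and the names `sum_layer`, `windowed_product`, and `windowed_product_grad` are hypothetical helpers rather than anything from the paper. The gradient helper shows one plausible reason windowing helps: each partial derivative is a product of only `window - 1` other inputs instead of all remaining inputs.

```python
import math
from typing import List, Sequence


def sum_layer(x: Sequence[float],
              weights: Sequence[Sequence[float]],
              bias: Sequence[float]) -> List[float]:
    """Ordinary affine (sum) layer: y_j = b_j + sum_i w_ji * x_i."""
    return [b + sum(w * xi for w, xi in zip(row, x))
            for row, b in zip(weights, bias)]


def windowed_product(x: Sequence[float], window: int) -> List[float]:
    """Multiply each group of `window` adjacent inputs.

    Assumption: windows are adjacent and non-overlapping. The paper's
    actual windowing scheme may differ.
    """
    if window < 1 or len(x) % window != 0:
        raise ValueError("len(x) must be a positive multiple of window")
    return [math.prod(x[i:i + window]) for i in range(0, len(x), window)]


def windowed_product_grad(x: Sequence[float], window: int) -> List[float]:
    """Partial derivative of each window's product w.r.t. each input.

    For an input x_k, the derivative of its window's product is the
    product of the other inputs in the same window.
    """
    if window < 1 or len(x) % window != 0:
        raise ValueError("len(x) must be a positive multiple of window")
    grads: List[float] = []
    for i in range(0, len(x), window):
        group = list(x[i:i + window])
        for k in range(window):
            grads.append(math.prod(group[:k] + group[k + 1:]))
    return grads


if __name__ == "__main__":
    hidden = sum_layer([1.0, 2.0],
                       [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
                       [0.0, 0.0, 0.0, 0.0])
    # The product layer halves the width when window=2.
    print(windowed_product(hidden, window=2))
    print(windowed_product_grad(hidden, window=2))
```

With `window=2`, each output behaves like a multiplicative gate between two summed features, which is consistent with the abstract's remark that the approach can generalize gated units in recurrent networks.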

