LG · CV · NE · May 30, 2016

Parametric Exponential Linear Unit for Deep Convolutional Neural Networks

arXiv:1605.09332v4 · 221 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for automated parameter tuning in activation functions for deep learning practitioners, offering an incremental improvement over existing ELU methods.

The paper addresses the need to set the parameter of the Exponential Linear Unit (ELU) activation function by hand in CNNs. It proposes a learnable Parametric ELU (PELU), which yields up to a 7.28% relative error reduction on ImageNet with the NiN network for only a 0.0003% increase in parameters.
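As a reading aid, a relative error reduction is usually understood as the drop in error rate divided by the baseline error rate, not as a difference in percentage points. The symbols below are assumptions introduced for illustration: e_ELU is the baseline network's error rate and e_PELU is the error rate of the same network using PELU. The summary does not state which error metric is used.

```latex
\[
\text{relative error reduction}
  = \frac{e_{\mathrm{ELU}} - e_{\mathrm{PELU}}}{e_{\mathrm{ELU}}} \times 100\%
\]
```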

Object recognition is an important task for improving the ability of visual systems to perform complex scene understanding. Recently, the Exponential Linear Unit (ELU) has been proposed as a key component for managing bias shift in Convolutional Neural Networks (CNNs), but it defines a parameter that must be set by hand. In this paper, we propose learning a parameterization of ELU in order to learn the proper activation shape at each layer of a CNN. Our results on the MNIST, CIFAR-10/100 and ImageNet datasets using the NiN, Overfeat, All-CNN and ResNet networks indicate that our proposed Parametric ELU (PELU) performs better than the non-parametric ELU. We have observed as much as a 7.28% relative error improvement on ImageNet with the NiN network, with only a 0.0003% parameter increase. Our visual examination of the non-linear behaviors adopted by VGG using PELU shows that the network took advantage of the added flexibility by learning different activations at different layers.
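To make the idea of a learnable activation shape concrete, here is a minimal sketch in plain Python. The abstract does not give the formula, so the PELU form used here, with two positive parameters a and b per layer, is an assumption. The function names and the pelu_grads helper are hypothetical and are not taken from any library.

```python
import math


def elu(h, alpha=1.0):
    """ELU: identity for h >= 0, alpha * (exp(h) - 1) otherwise.

    alpha is the parameter that has to be chosen by hand.
    """
    return h if h >= 0 else alpha * (math.exp(h) - 1.0)


def pelu(h, a, b):
    """Assumed PELU form: (a / b) * h for h >= 0, a * (exp(h / b) - 1) otherwise.

    a and b are treated as learnable and strictly positive (an assumption).
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b are assumed to be strictly positive")
    return (a / b) * h if h >= 0 else a * (math.exp(h / b) - 1.0)


def pelu_grads(h, a, b):
    """Partial derivatives of pelu with respect to a and b.

    A training loop would use these to update a and b alongside the weights.
    """
    if h >= 0:
        return h / b, -a * h / (b * b)
    e = math.exp(h / b)
    return e - 1.0, -a * h * e / (b * b)
```

With a = b = 1 this sketch reduces to ELU with alpha = 1, so the parametric form contains the hand-set one as a special case. Keeping one (a, b) pair per layer adds only two scalars per layer, which fits the very small parameter increase reported above.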

