CV | Mar 22, 2020

Dynamic ReLU

arXiv:2003.10027v2 | 214 citations
AI Analysis

This work improves activation functions for deep learning and particularly benefits light-weight networks such as MobileNetV2, though it is an incremental advance over existing ReLU variants.

The paper addresses a limitation of ReLU and its variants: they are static, applying the same function to every input. It proposes Dynamic ReLU (DY-ReLU), whose parameters adapt to the global context of each input. On MobileNetV2, this raises ImageNet top-1 accuracy by 4.2 percentage points (from 72.0% to 76.2%) with only 5% additional FLOPs.

Rectified linear units (ReLU) are commonly used in deep neural networks. So far ReLU and its generalizations (non-parametric or parametric) are static, performing identically for all input samples. In this paper, we propose dynamic ReLU (DY-ReLU), a dynamic rectifier of which parameters are generated by a hyper function over all input elements. The key insight is that DY-ReLU encodes the global context into the hyper function, and adapts the piecewise linear activation function accordingly. Compared to its static counterpart, DY-ReLU has negligible extra computational cost, but significantly more representation capability, especially for light-weight neural networks. By simply using DY-ReLU for MobileNetV2, the top-1 accuracy on ImageNet classification is boosted from 72.0% to 76.2% with only 5% additional FLOPs.
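
As a rough illustration of the mechanism the abstract describes, the NumPy sketch below computes a piecewise linear activation whose per-channel slopes and intercepts are produced from a global summary of the input. The abstract does not specify the hyper function, so everything concrete here is an assumption made for illustration: the global average pooling, the two-layer network, the sigmoid-based normalization, the initial slopes (1 and 0), and the scaling constants. The names dy_relu, w1, b1, w2, b2, slope_scale and intercept_scale are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def dy_relu(x, w1, b1, w2, b2, k=2, slope_scale=1.0, intercept_scale=0.5):
    """Illustrative dynamic rectifier for one feature map x of shape (C, H, W).

    w1: (R, C), b1: (R,), w2: (2*k*C, R), b2: (2*k*C,) are the weights of an
    assumed two-layer hyper function; R is an arbitrary hidden size.
    """
    c = x.shape[0]

    # Encode global context: one average value per channel (assumption).
    context = x.mean(axis=(1, 2))                      # shape (C,)

    # Assumed hyper function: linear -> ReLU -> linear.
    hidden = np.maximum(w1 @ context + b1, 0.0)        # shape (R,)
    raw = w2 @ hidden + b2                             # shape (2*k*C,)

    # Squash the outputs into (-1, 1) so they act as bounded residuals.
    delta = 2.0 / (1.0 + np.exp(-raw)) - 1.0
    delta = delta.reshape(2, k, c)

    # Start from a plain ReLU (slopes 1 and 0, zero intercepts), then shift
    # the parameters by the input-dependent residuals.
    init_slopes = np.array([1.0] + [0.0] * (k - 1))[:, None]   # shape (k, 1)
    slopes = init_slopes + slope_scale * delta[0]               # shape (k, C)
    intercepts = intercept_scale * delta[1]                     # shape (k, C)

    # Evaluate every linear piece and keep the element-wise maximum.
    pieces = slopes[:, :, None, None] * x[None] + intercepts[:, :, None, None]
    return pieces.max(axis=0)                           # shape (C, H, W)
```

In this sketch, setting the second layer's weights and biases to zero makes every residual zero, so the function reduces to an ordinary ReLU. That is one way to see the static rectifier as a special case of the dynamic one.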

Code Implementations: 5 repos