PEA: Improving the Performance of ReLU Networks for Free by Using Progressive Ensemble Activations
The method offers a practical way to deploy efficient models in constrained environments, although the contribution is incremental.
The paper addresses how to improve ReLU network performance in environments where only ReLU activations are supported. It trains with progressive ensemble activations that reduce to plain ReLU by the end of training, so the network is ReLU-only at inference. The reported gains are 0.2-0.8% top-1 accuracy on ImageNet and 0.34% mIOU on Cityscapes.
In recent years, novel activation functions have been proposed to improve the performance of neural networks, and they outperform their ReLU counterparts. However, there are environments where the availability of complex activations is limited and usually only the ReLU is supported. In this paper we propose methods that improve the performance of ReLU networks by using these novel activations during model training. More specifically, we propose ensemble activations that are composed of the ReLU and one of these novel activations. Furthermore, the coefficients of the ensemble are neither fixed nor learned, but are progressively updated during training so that by the end of training only the ReLU activations remain active in the network and the other activations can be removed. This means that at inference time the network contains ReLU activations only. We perform extensive evaluations on the ImageNet classification task using various compact network architectures and various novel activation functions. Results show a 0.2-0.8% top-1 accuracy gain, which confirms the applicability of the proposed methods. Furthermore, we demonstrate the proposed methods on semantic segmentation, where we boost the performance of a compact segmentation network by 0.34% mIOU on the Cityscapes dataset.
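The progressive update described in the abstract can be illustrated with a minimal sketch. It assumes the ensemble is a weighted sum of the ReLU and Swish, and that the ReLU weight grows linearly from 0 to 1 over training. The abstract specifies neither the exact ensemble form nor the schedule, so both are assumptions, and the names `relu_weight` and `ensemble_activation` are hypothetical.

```python
import math


def relu(x: float) -> float:
    return max(0.0, x)


def swish(x: float) -> float:
    # Swish (SiLU): x * sigmoid(x). It stands in for "one of these novel
    # activations"; the branch avoids overflow for large negative inputs.
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def relu_weight(step: int, total_steps: int) -> float:
    # Assumed linear schedule (not taken from the paper):
    # 0.0 at the first training step, 1.0 at the last.
    if total_steps <= 1:
        return 1.0
    return min(1.0, max(0.0, step / (total_steps - 1)))


def ensemble_activation(x: float, alpha: float) -> float:
    # Assumed weighted-sum ensemble; alpha is the ReLU coefficient.
    # It is set by the schedule, so it is neither fixed nor learned.
    return alpha * relu(x) + (1.0 - alpha) * swish(x)


if __name__ == "__main__":
    total_steps = 5
    for step in range(total_steps):
        alpha = relu_weight(step, total_steps)
        print(step, alpha, ensemble_activation(-1.0, alpha))
```

Under these assumptions the activation equals Swish at the first step and plain ReLU at the last, so the Swish branch can be dropped from the deployed network without changing its output.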