A simple and efficient architecture for trainable activation functions
This work addresses the challenge of designing efficient and trainable activation functions for neural network practitioners, though it appears incremental compared to existing methods.
The authors tackle the problem of learning activation functions in neural networks by proposing a simple architecture that adds small local subnetworks, achieving better results than predefined functions without significantly increasing the number of parameters.
Automatically learning the best activation function for a given task is an active topic in neural network research. Despite promising results, it remains difficult to find a method for learning an activation function that is both theoretically simple and easy to implement. Moreover, most of the methods proposed so far introduce new parameters or adopt different learning techniques. In this work we propose a simple method for obtaining a trained activation function: local subnetworks with a small number of neurons are added to the neural network. Experiments show that this approach can lead to better results than a pre-defined activation function, without introducing a large number of extra parameters to be learned.
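To make the idea of a local subnetwork concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the architecture used in this work: the abstract does not specify the subnetwork's shape, so the sketch assumes a scalar-in, scalar-out subnetwork with one hidden layer of k ReLU units, shared by all neurons of a layer. The names `LocalActivation`, `dense_layer`, and `num_params` are hypothetical.

```python
import random


def relu(x):
    """Fixed nonlinearity used inside the subnetwork (an assumption of this sketch)."""
    return x if x > 0.0 else 0.0


class LocalActivation:
    """Hypothetical trainable activation: a tiny 1 -> k -> 1 subnetwork.

    It maps a scalar pre-activation `a` to
    sum_j w_out[j] * relu(w_in[j] * a + b_in[j]) + b_out.
    """

    def __init__(self, k=3, seed=0):
        rng = random.Random(seed)
        self.w_in = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        self.b_in = [0.0] * k
        self.w_out = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        self.b_out = 0.0

    def __call__(self, a):
        hidden = [relu(w * a + b) for w, b in zip(self.w_in, self.b_in)]
        return sum(v * h for v, h in zip(self.w_out, hidden)) + self.b_out

    def num_params(self):
        # k input weights + k hidden biases + k output weights + 1 output bias
        return len(self.w_in) + len(self.b_in) + len(self.w_out) + 1


def dense_layer(x, weights, biases, activation):
    """Affine map followed by the shared activation, applied elementwise."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


if __name__ == "__main__":
    act = LocalActivation(k=3)
    out = dense_layer([1.0, -2.0], [[0.5, 0.1], [-0.3, 0.8]], [0.0, 0.1], act)
    print(out, act.num_params())
```

In this sketch the subnetwork contributes 3k + 1 parameters per shared activation (10 for k = 3), independent of the width of the layer it is attached to, which is one way the extra parameter count can stay small. Training is omitted; in practice the subnetwork's weights would be updated by the same gradient-based procedure as the rest of the network.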