Neural Networks with Activation Networks
This work targets neural network efficiency: it offers machine learning practitioners an alternative to scaling up model parameters, though the contribution may appear incremental because it builds on existing architectures.
The authors improve neural network performance with an adaptive activation method in which auxiliary activation networks learn feature interdependencies. In their experiments, this yields significant improvements over the baseline networks without increasing the number of nodes or layers.
This work presents an adaptive activation method for neural networks that exploits the interdependency of features. Each pixel, node, and layer is assigned a polynomial activation function whose coefficients are provided by an auxiliary activation network. The activation of a feature therefore depends on the features of neighboring pixels in a convolutional layer and on the other nodes in a dense layer. This dependency is learned from data by the activation networks. In our experiments, networks with activation networks provide significant performance improvement compared to the baseline networks on which they are built. The proposed method can be used to improve network performance as an alternative to increasing the number of nodes and layers.
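To make the dense-layer case concrete, the following is a minimal NumPy sketch of a polynomial activation whose coefficients come from an auxiliary network. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the auxiliary network's architecture, the polynomial degree, or the initialization, so the single linear layer, the identity-style initialization, and the names `init_activation_network` and `adaptive_polynomial_activation` are all hypothetical.

```python
import numpy as np


def init_activation_network(n_nodes, degree, rng):
    """Create parameters for a hypothetical auxiliary activation network.

    Assumption: the auxiliary network is one linear layer that maps the
    n_nodes features of a dense layer to (degree + 1) polynomial
    coefficients per node.
    """
    n_coeffs = degree + 1
    W = rng.normal(scale=0.01, size=(n_nodes, n_nodes * n_coeffs))
    b = np.zeros((n_nodes, n_coeffs))
    # Assumption: bias the linear coefficient to 1 so the activation
    # starts close to the identity, y = x.
    b[:, 1] = 1.0
    return W, b


def adaptive_polynomial_activation(x, W, b):
    """Apply a per-node polynomial activation to x of shape (batch, n_nodes).

    The coefficients for every node are computed from all features in the
    layer, so each node's activation depends on the other nodes.
    """
    batch, n_nodes = x.shape
    n_coeffs = b.shape[1]
    # Coefficients predicted by the auxiliary network: (batch, n_nodes, n_coeffs).
    coeffs = (x @ W).reshape(batch, n_nodes, n_coeffs) + b
    # Powers x^0, x^1, ..., x^degree of each feature: (batch, n_nodes, n_coeffs).
    powers = x[:, :, None] ** np.arange(n_coeffs)
    # Evaluate the polynomial: y_i = sum_k coeffs[i, k] * x_i^k.
    return (coeffs * powers).sum(axis=-1)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5))            # batch of 4 samples, 5 nodes
W, b = init_activation_network(n_nodes=5, degree=3, rng=rng)
y = adaptive_polynomial_activation(x, W, b)
```

In a convolutional layer the same idea would presumably use a small convolution as the auxiliary network, so that the coefficients at each pixel depend on neighboring pixels. In training, `W` and `b` would be learned jointly with the weights of the main network.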