Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions
This work addresses the need for efficient neural network compression and interpretability, though its contribution appears incremental because it builds on existing network architectures.
The paper tackles feature-importance learning and network pruning by introducing deep asymmetric networks with node-wise variant activation functions. These networks sort learned features by importance according to node index, so less important nodes can be pruned and the pruned network retrained without performance loss.
This work presents deep asymmetric networks with a set of node-wise variant activation functions. The choice of activation function affects each node's sensitivity, such that nodes with smaller indices are progressively more sensitive. As a result, the features learned by the nodes are sorted by node index in order of importance. Asymmetric networks therefore learn not only input features but also the importance of those features. Nodes of lesser importance can be pruned to reduce network complexity, and the pruned networks can be retrained without incurring performance losses. We validate the feature-sorting property using shallow and deep asymmetric networks, as well as deep asymmetric networks transferred from well-known networks.
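The idea of node-wise variant activations and index-based pruning can be illustrated with a minimal sketch. The abstract does not specify the activation functions, so everything here is an assumption made for illustration: each node i applies tanh(s_i * z) with a slope s_i that shrinks as the index grows, and the names `node_slopes`, `asymmetric_layer`, and `prune_layer` are hypothetical, not taken from the paper.

```python
import math
import random


def node_slopes(n_nodes, decay=0.5):
    """Hypothetical slope schedule (an assumption, not the paper's):
    node 0 gets the largest slope, later nodes progressively smaller ones."""
    return [1.0 / (1.0 + decay * i) for i in range(n_nodes)]


def asymmetric_layer(x, weights, biases, slopes):
    """Dense layer in which node i uses its own activation tanh(slopes[i] * z)."""
    out = []
    for w_row, b, s in zip(weights, biases, slopes):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        out.append(math.tanh(s * z))
    return out


def prune_layer(weights, biases, slopes, keep):
    """Keep only the first `keep` nodes, i.e. those with the smallest indices."""
    return weights[:keep], biases[:keep], slopes[:keep]


# Toy layer with random (untrained) weights, just to exercise the functions.
rng = random.Random(0)
n_in, n_hidden = 3, 6
weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
biases = [0.0] * n_hidden
slopes = node_slopes(n_hidden)

x = [0.5, -0.2, 0.1]
full = asymmetric_layer(x, weights, biases, slopes)

# Prune the three highest-index nodes and run the same input again.
w_p, b_p, s_p = prune_layer(weights, biases, slopes, keep=3)
pruned = asymmetric_layer(x, w_p, b_p, s_p)
```

Because the nodes are ordered, pruning reduces to truncating the layer at a chosen index rather than scoring each node separately. In a multi-layer network the next layer's input weights would need the same truncation, and the retraining step described in the abstract is not shown here.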