Maximum and Leaky Maximum Propagation
This work proposes an architectural alternative to residual connections in neural networks, offering incremental improvements over them.
The paper aims to improve neural network training by replacing the addition in residual connections with maximum or leaky maximum propagation. On public datasets, the approach performs comparably to residual connections, with benefits such as better generalization with constant batch normalization and faster learning.
In this work, we present an alternative to conventional residual connections that is inspired by maxout networks. Instead of the addition used in residual connections, our approach propagates only the maximum value or, in the leaky formulation, a percentage of both values. In our evaluation on several public datasets, we show that the presented approaches are comparable to residual connections and have other interesting properties, such as better generalization with a constant batch normalization, faster learning, and the ability to generalize without additional activation functions. In addition, the proposed approaches work very well when combined with residual networks in ensembles.
https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FMaximumPropagation&mode=list
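To make the difference between the three ways of combining values concrete, the following minimal sketch compares them element-wise on plain Python lists. It is an illustration under stated assumptions, not the authors' implementation: the function names, the mixing parameter `alpha`, and in particular the leaky formula (a weighted sum of the element-wise maximum and minimum) are assumptions, since the abstract only states that the leaky variant propagates a percentage of both values.

```python
from typing import List

Vector = List[float]


def residual(x: Vector, fx: Vector) -> Vector:
    """Conventional residual connection: element-wise x + F(x)."""
    return [a + b for a, b in zip(x, fx)]


def maximum_propagation(x: Vector, fx: Vector) -> Vector:
    """Propagate only the element-wise maximum of x and F(x)."""
    return [max(a, b) for a, b in zip(x, fx)]


def leaky_maximum_propagation(x: Vector, fx: Vector, alpha: float = 0.9) -> Vector:
    """Hypothetical leaky variant (an assumption, not taken from the paper):
    weight the larger value by alpha and the smaller value by 1 - alpha."""
    return [alpha * max(a, b) + (1.0 - alpha) * min(a, b) for a, b in zip(x, fx)]


if __name__ == "__main__":
    x = [1.0, -2.0]   # block input (skip path)
    fx = [0.5, 3.0]   # block output F(x)
    print(residual(x, fx))
    print(maximum_propagation(x, fx))
    print(leaky_maximum_propagation(x, fx, alpha=0.5))
```

Under this assumed formulation, `alpha = 1.0` reduces the leaky variant to plain maximum propagation, while smaller values let part of the smaller input pass through.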