Deep Maxout Network Gaussian Process
This work provides theoretical insights into infinite-width neural networks, potentially aiding in understanding and improving practical applications like Bayesian inference, though it is incremental in extending existing kernel methods to maxout activations.
The authors derive the equivalence between an infinite-width deep maxout network and a Gaussian process, characterize its kernel with a compositional structure, and connect it to deep neural network kernels. Numerical results show that Bayesian inference with this kernel is competitive with finite-width counterparts.
Studying infinite-width neural networks is important for better understanding neural networks in practical applications. In this work, we derive the equivalence between the deep, infinite-width maxout network and the Gaussian process (GP) and characterize the maxout kernel with a compositional structure. Moreover, we establish the connection between our deep maxout network kernel and deep neural network kernels. We also give an efficient numerical implementation of our kernel that can be adapted to any maxout rank. Numerical results show that Bayesian inference based on the deep maxout network kernel achieves results competitive with its finite-width counterparts and with deep neural network kernels. This suggests that the maxout activation may also be incorporated into other infinite-width neural network structures such as the convolutional neural network (CNN).
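To make the compositional structure concrete, the following is a minimal sketch of one way such a kernel could be estimated. It is not the efficient implementation referred to above: it is a plain Monte Carlo estimate under the standard infinite-width recursion, assuming a rank-q maxout unit takes the maximum of q independent Gaussian pre-activations that share the previous layer's covariance. The function names, the parameters `sigma_w2` and `sigma_b2`, and the input-layer normalization are all illustrative assumptions rather than the paper's notation.

```python
import numpy as np


def maxout_kernel_step(kxx, kxy, kyy, rank=2, sigma_w2=1.0, sigma_b2=0.0,
                       n_samples=200_000, seed=0):
    """Monte Carlo estimate of sigma_b2 + sigma_w2 * E[max_j u_j * max_j v_j].

    (u_j, v_j), j = 1..rank, are i.i.d. zero-mean Gaussian pairs with
    covariance [[kxx, kxy], [kxy, kyy]]. Assumes kxx > 0.
    All names and the parametrization are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)
    e1 = rng.standard_normal((n_samples, rank))
    e2 = rng.standard_normal((n_samples, rank))
    # Build correlated pairs by hand so a singular covariance (x == x') works.
    a = np.sqrt(kxx)
    b = kxy / a
    c = np.sqrt(max(kyy - b * b, 0.0))
    u = (a * e1).max(axis=1)           # maxout unit evaluated at x
    v = (b * e1 + c * e2).max(axis=1)  # maxout unit evaluated at x'
    return sigma_b2 + sigma_w2 * float(np.mean(u * v))


def deep_maxout_kernel(x, y, depth=2, rank=2, **kw):
    """Compose the one-layer step `depth` times, starting from a scaled
    dot-product kernel (an assumed input-layer convention)."""
    d = x.shape[0]
    kxx, kxy, kyy = x @ x / d, x @ y / d, y @ y / d
    for _ in range(depth):
        kxx, kxy, kyy = (
            maxout_kernel_step(kxx, kxx, kxx, rank, **kw),
            maxout_kernel_step(kxx, kxy, kyy, rank, **kw),
            maxout_kernel_step(kyy, kyy, kyy, rank, **kw),
        )
    return kxy


if __name__ == "__main__":
    x = np.array([1.0, 0.0, 1.0])
    y = np.array([0.5, 1.0, -0.5])
    print(deep_maxout_kernel(x, y, depth=2, rank=3))
```

Two sanity checks follow from this setup: with `rank=1` the maxout unit is the identity, so the step returns approximately `kxy`; with `rank=2` and uncorrelated unit-variance inputs, the estimate should approach 1/π, the square of the expected maximum of two independent standard normals. A closed-form or quadrature-based evaluation would be preferable to sampling in practice.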