Layer-wise Learning of Stochastic Neural Networks with Information Bottleneck
This work addresses a theoretical gap in applying the Information Bottleneck (IB) to multi-layer neural networks, which could inform training methods in machine learning. The contribution appears incremental, however, since it builds directly on existing IB theory.
The paper extends the Information Bottleneck (IB) to multiple bottlenecks for neural networks by proposing Information Multi-Bottlenecks (IMBs). It shows that the multiple optimality of IMB is not simultaneously achievable for stochastic encoders, which motivates a compromised scheme that generalizes the maximum likelihood estimate (MLE) principle. It demonstrates effectiveness on classification tasks and adversarial robustness on MNIST and CIFAR10, though the abstract provides no concrete numbers.
Information Bottleneck (IB) is a generalization of rate-distortion theory that naturally incorporates compression and relevance trade-offs for learning. Though the original IB has been extensively studied, there has been little understanding of multiple bottlenecks, which better fit the context of neural networks. In this work, we propose Information Multi-Bottlenecks (IMBs) as an extension of IB to multiple bottlenecks, which has a direct application to training neural networks by considering layers as multiple bottlenecks and weights as parameterized encoders and decoders. We show that the multiple optimality of IMB is not simultaneously achievable for stochastic encoders. We thus propose a simple compromised scheme of IMB, which in turn generalizes the maximum likelihood estimate (MLE) principle in the context of stochastic neural networks. We demonstrate the effectiveness of IMB on classification tasks and adversarial robustness on MNIST and CIFAR10.
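To make the compression and relevance trade-off concrete, the following is a minimal sketch of the classical single-bottleneck IB Lagrangian, I(X;Z) - beta * I(Z;Y), evaluated for small discrete distributions. It is not the IMB objective proposed in this work, which the abstract does not spell out. The function names `mutual_information` and `ib_objective`, the nested-list representation, and the toy distributions are assumptions made for illustration only.

```python
import math


def mutual_information(joint):
    """I(A;B) in nats for a joint distribution given as joint[a][b]."""
    pa = [sum(row) for row in joint]          # marginal over b
    pb = [sum(col) for col in zip(*joint)]    # marginal over a
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:                         # 0 * log(0) is treated as 0
                mi += p * math.log(p / (pa[a] * pb[b]))
    return mi


def ib_objective(p_xy, enc, beta):
    """Classical IB Lagrangian I(X;Z) - beta * I(Z;Y).

    p_xy[x][y] is the joint distribution of input and label;
    enc[x][z] is a stochastic encoder p(z|x) (hypothetical toy encoder).
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(enc[0])
    px = [sum(row) for row in p_xy]
    # p(x, z) = p(x) * p(z|x)
    p_xz = [[px[x] * enc[x][z] for z in range(n_z)] for x in range(n_x)]
    # p(z, y) = sum_x p(x, y) * p(z|x), using the Markov chain Y - X - Z
    p_zy = [[sum(p_xy[x][y] * enc[x][z] for x in range(n_x))
             for y in range(n_y)] for z in range(n_z)]
    return mutual_information(p_xz) - beta * mutual_information(p_zy)


# Toy setting: binary X with the label Y equal to X.
p_xy = [[0.5, 0.0], [0.0, 0.5]]
identity_encoder = [[1.0, 0.0], [0.0, 1.0]]   # Z copies X: no compression
constant_encoder = [[1.0, 0.0], [1.0, 0.0]]   # Z ignores X: full compression

print(ib_objective(p_xy, identity_encoder, beta=2.0))
print(ib_objective(p_xy, constant_encoder, beta=2.0))
```

In this toy setting the identity encoder pays a compression cost of ln 2 but gains beta * ln 2 in relevance, so it has the lower objective whenever beta exceeds 1. The constant encoder scores zero on both terms.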