Sparsity-Inducing Categorical Prior Improves Robustness of the Information Bottleneck
This work addresses the robustness of representations learned under the information bottleneck framework, reporting gains over fixed-dimensional priors and over other sparsity-induction mechanisms for latent variable models.
The paper tackles the inflexibility of fixed-dimensional priors in the information bottleneck framework by introducing a sparsity-inducing categorical prior that allows each data point to learn its own dimension distribution, resulting in improved accuracy and robustness across MNIST, CIFAR-10, and ImageNet datasets.
The information bottleneck framework provides a systematic approach to learning representations that compress nuisance information in the input and extract semantically meaningful information about predictions. However, choosing a prior distribution that fixes the dimensionality across all data points can restrict the flexibility of this approach for learning robust representations. We present a novel sparsity-inducing spike-slab categorical prior that uses sparsity to let each data point learn its own dimension distribution. It also provides a mechanism for learning a joint distribution of the latent variable and the sparsity, and hence can account for the complete uncertainty in the latent space. Through a series of experiments in in-distribution and out-of-distribution learning scenarios on the MNIST, CIFAR-10, and ImageNet datasets, we show that the proposed approach improves accuracy and robustness compared to traditional fixed-dimensional priors, as well as to other sparsity-induction mechanisms for latent variable models proposed in the literature.
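To make the idea of a per-data-point dimension distribution concrete, here is a minimal sketch of one plausible reading of the abstract. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_sparse_latent`, the example probabilities, and in particular the choice of an ordered mask that keeps the first `d` coordinates are all hypothetical, and the abstract does not specify them.

```python
import random


def sample_sparse_latent(dim_probs, mean, std, rng):
    """Draw one sparse latent vector for a single data point.

    Assumption (not from the abstract): dim_probs[k] is the probability
    that exactly k + 1 latent dimensions are active for this data point.
    """
    num_dims = len(mean)
    # Categorical draw over the number of active dimensions, 1..num_dims.
    d = rng.choices(range(1, num_dims + 1), weights=dim_probs, k=1)[0]
    # Assumed mask layout: first d coordinates active (slab), rest zeroed (spike).
    mask = [1.0 if i < d else 0.0 for i in range(num_dims)]
    # Gaussian slab: a dense draw that the mask then sparsifies.
    dense = [rng.gauss(m, s) for m, s in zip(mean, std)]
    z = [g * v for g, v in zip(mask, dense)]
    return d, mask, z


rng = random.Random(0)
# Hypothetical per-data-point distribution over 4 candidate dimensionalities.
dim_probs = [0.1, 0.6, 0.2, 0.1]
d, mask, z = sample_sparse_latent(dim_probs, [0.0] * 4, [1.0] * 4, rng)
print(d, mask, z)
```

In a real model, `dim_probs`, `mean`, and `std` would presumably be produced by an encoder for each input, so that different inputs can favor different numbers of active dimensions. Sampling the mask and the Gaussian values together is what yields a joint draw over the latent variable and its sparsity.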