A Structured Variational Auto-encoder for Learning Deep Hierarchies of Sparse Features
This work addresses efficient deep generative modeling, a problem of interest to machine learning researchers. It builds on existing variational auto-encoder frameworks, modifying them for sparsity and structure, and so may appear incremental.
The authors address the challenge of learning deep hierarchies of sparse features from natural images by introducing a generative model with rectified Gaussian units and a structured variational auto-encoder, which enables joint training of multiple layers without layerwise procedures.
In this note we present a generative model of natural images consisting of a deep hierarchy of layers of latent random variables, each of which follows a new type of distribution that we call rectified Gaussian. These rectified Gaussian units allow spike-and-slab type sparsity while retaining the differentiability necessary for efficient stochastic gradient variational inference. To learn the parameters of the new model, we approximate the posterior of the latent variables with a variational auto-encoder. Rather than making the usual mean-field assumption, however, the encoder parameterizes a new type of structured variational approximation that retains the prior dependencies of the generative model. Using this structured posterior approximation, we are able to perform joint training of deep models with many layers of latent random variables, without having to resort to stacking or other layerwise training procedures.
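To make the spike-and-slab behaviour concrete, the following is a minimal sketch of sampling from a single rectified Gaussian unit. It assumes the unit is obtained by clamping a reparameterized Gaussian draw at zero, z = max(0, mu + sigma * eps) with eps ~ N(0, 1); the abstract does not spell out this definition, and the function names `sample_rectified_gaussian` and `prob_zero` are hypothetical, chosen only for illustration.

```python
import math
import random


def sample_rectified_gaussian(mu, sigma, rng):
    """Draw z = max(0, mu + sigma * eps) with eps ~ N(0, 1).

    Assumption: this clamped-Gaussian form is an illustrative reading of
    "rectified Gaussian", not a definition taken from the abstract.
    """
    eps = rng.gauss(0.0, 1.0)  # noise drawn independently of mu and sigma
    return max(0.0, mu + sigma * eps)


def prob_zero(mu, sigma):
    """Mass of the 'spike' at zero: P(z == 0) = Phi(-mu / sigma)."""
    return 0.5 * (1.0 + math.erf((-mu / sigma) / math.sqrt(2.0)))


# A negative mean puts most of the mass on exact zeros (the spike);
# the remaining draws follow a truncated Gaussian (the slab).
rng = random.Random(0)
samples = [sample_rectified_gaussian(-0.5, 1.0, rng) for _ in range(20000)]
empirical_zero_fraction = sum(1 for z in samples if z == 0.0) / len(samples)
```

Under this assumption, the sample is a deterministic, piecewise-linear function of `mu` and `sigma` given the noise `eps`, so gradients can pass through it wherever the draw is positive. That is one way to read the claim that the units combine exact sparsity with the differentiability needed for stochastic gradient variational inference.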