Toward Architecture-Agnostic Local Control of Posterior Collapse in VAEs
This work addresses the problem of limited generative diversity in VAEs, which matters to both researchers and practitioners, and offers an incremental improvement over prior methods that required structural constraints on the network architecture.
The paper tackles posterior collapse in VAEs by proposing a Latent Reconstruction loss that controls collapse without architectural constraints, achieving improved sample diversity across multiple datasets including MNIST, CelebA, and FFHQ.
Variational autoencoders (VAEs), one of the most widely used classes of generative models, are known to suffer from posterior collapse, a phenomenon that reduces the diversity of generated samples. To avoid posterior collapse, many prior works have tried to control the influence of the regularization loss, but the resulting trade-off between reconstruction and regularization is unsatisfactory. For this reason, several methods have been proposed to guarantee latent identifiability, which is the key to avoiding posterior collapse; however, they require structural constraints on the network architecture. To relax this constraint, we define local posterior collapse, which reflects the importance of individual sample points in the data space. We then propose the Latent Reconstruction (LR) loss, inspired by the mathematical properties of injective and composite functions, to control posterior collapse without restriction to a specific architecture. We experimentally evaluate how well our approach controls posterior collapse on varied datasets such as MNIST, Fashion-MNIST, Omniglot, CelebA, and FFHQ.
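The abstract does not state the form of the LR loss, so the following is only an illustrative sketch under our own assumptions, not the authors' definition. It relies on a standard fact about composite functions: if the composition g∘f is the identity on the latent space, then f must be injective. Here we assume the loss penalizes the squared distance between a latent code z and encode(decode(z)); the names `latent_reconstruction_loss`, `decode_injective`, `decode_collapsed`, and `encode_left_inverse` are hypothetical, and the linear maps stand in for real networks.

```python
import numpy as np

def latent_reconstruction_loss(decode, encode, z):
    """Mean squared distance between z and encode(decode(z)).

    Assumed form for illustration only; the paper's LR loss may differ.
    """
    z_hat = encode(decode(z))
    return float(np.mean(np.sum((z - z_hat) ** 2, axis=1)))

rng = np.random.default_rng(0)
latent_dim, data_dim = 2, 5
z = rng.standard_normal((256, latent_dim))  # latent codes sampled from N(0, I)

# Injective linear decoder: a Gaussian matrix has full column rank almost surely.
W = rng.standard_normal((data_dim, latent_dim))
W_pinv = np.linalg.pinv(W)  # left inverse of W, shape (latent_dim, data_dim)

def decode_injective(z):
    return z @ W.T

def decode_collapsed(z):
    # Collapsed decoder: ignores z entirely, so it cannot be injective.
    return np.zeros((z.shape[0], data_dim))

def encode_left_inverse(x):
    return x @ W_pinv.T

loss_injective = latent_reconstruction_loss(decode_injective, encode_left_inverse, z)
loss_collapsed = latent_reconstruction_loss(decode_collapsed, encode_left_inverse, z)
print(f"injective decoder: {loss_injective:.3e}")
print(f"collapsed decoder: {loss_collapsed:.3e}")
```

In this toy setting the loss is near zero when the decoder is injective and the encoder inverts it, and it stays large when the decoder discards the latent code. This is the intuition behind using latent reconstruction as a signal for collapse. Note that nothing in the sketch depends on the architecture of `decode` or `encode`.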