Diverse super-resolution with pretrained deep hierarchical VAEs
This work addresses the need for efficient, high-quality, diverse image super-resolution, particularly for face images, though it appears incremental because it builds on existing VAE and inverse-problem techniques.
The paper tackles the problem of generating diverse high-resolution images from low-resolution inputs by using a pretrained hierarchical variational autoencoder as a prior, achieving a favorable trade-off between computational efficiency and sample quality in face super-resolution.
We investigate the problem of producing diverse solutions to an image super-resolution problem. From a probabilistic perspective, this can be done by sampling from the posterior distribution of an inverse problem, which requires the definition of a prior distribution on the high-resolution images. In this work, we propose to use a pretrained hierarchical variational autoencoder (HVAE) as a prior. We train a lightweight stochastic encoder to encode low-resolution images in the latent space of a pretrained HVAE. At inference, we combine the low-resolution encoder and the pretrained generative model to super-resolve an image. We demonstrate on the task of face super-resolution that our method provides an advantageous trade-off between the computational efficiency of conditional normalizing flow techniques and the sample quality of diffusion-based methods.
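The inference procedure described above (encode the low-resolution image stochastically, then decode with the frozen generative model) can be illustrated with a toy sketch. Everything here is a hypothetical stand-in rather than the authors' implementation: `ToyHVAE`, `ToyLowResEncoder`, and `super_resolve` are invented names, the weights are random instead of trained, the "images" are short vectors, and the choice to encode only the top latent level and sample the lower level from the prior with a `temperature` parameter is an assumption made for illustration.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def random_matrix(rng, rows, cols, scale=0.5):
    """Random weights standing in for trained parameters."""
    return [[scale * rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class ToyHVAE:
    """Stand-in for a pretrained two-level hierarchical decoder (kept frozen)."""

    def __init__(self, rng, hr_dim=8, z1_dim=4, z2_dim=4):
        self.W1 = random_matrix(rng, z2_dim, z1_dim)  # z1 -> prior mean of z2
        self.W2 = random_matrix(rng, hr_dim, z2_dim)  # z2 -> high-resolution output
        self.z2_dim = z2_dim

    def generate(self, z1, rng, temperature=1.0):
        # Sample the lower latent level from its conditional prior p(z2 | z1).
        mean_z2 = matvec(self.W1, z1)
        z2 = [m + temperature * rng.gauss(0.0, 1.0) for m in mean_z2]
        return matvec(self.W2, z2)


class ToyLowResEncoder:
    """Stand-in for the lightweight stochastic encoder q(z1 | y)."""

    def __init__(self, rng, lr_dim=4, z1_dim=4, sigma=0.3):
        self.A = random_matrix(rng, z1_dim, lr_dim)
        self.sigma = sigma  # fixed standard deviation, an illustrative simplification

    def sample(self, y, rng):
        mu = matvec(self.A, y)
        return [m + self.sigma * rng.gauss(0.0, 1.0) for m in mu]


def super_resolve(y, encoder, hvae, rng, n_samples=3, temperature=0.8):
    """Draw several candidate high-resolution outputs for one low-resolution input."""
    return [
        hvae.generate(encoder.sample(y, rng), rng, temperature)
        for _ in range(n_samples)
    ]


rng = random.Random(0)
hvae = ToyHVAE(rng)
encoder = ToyLowResEncoder(rng)
y = [rng.gauss(0.0, 1.0) for _ in range(4)]  # toy low-resolution input
samples = super_resolve(y, encoder, hvae, rng, n_samples=3)
print(len(samples), len(samples[0]))
```

Each call to `super_resolve` returns several different candidates for the same input because both the encoder and the lower latent level are sampled. In this sketch, lowering `temperature` reduces the variation contributed by the lower latent level.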