CV · LG · Aug 9, 2022

Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

Georgia Tech
arXiv:2208.04554v1 · 13 citations · h-index: 33
Originality: Incremental advance
AI Analysis

This work addresses image reconstruction and generation, offering incremental improvements in efficiency and quality over existing VQVAE-based methods.

The authors tackle image reconstruction and generation with HR-VQVAE, a hierarchical residual learning method based on vector quantized variational autoencoders. It reconstructs images with less distortion than VQVAE and VQVAE-2, and is reported to outperform state-of-the-art generative models in generating high-quality, diverse images.

We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from previous layers through a vector quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images that outperform those of state-of-the-art generative models, providing further verification of the efficiency of the learned representations. The hierarchical nature of HR-VQVAE (i) reduces the decoding search time, making the method particularly suitable for high-load tasks, and (ii) allows the codebook size to be increased without incurring the codebook collapse problem.
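The idea that each layer quantizes the residual left by the previous layers can be illustrated with a small NumPy sketch. It shows plain residual vector quantization only, under stated assumptions: the codebooks are random rather than learned, the names `nearest_codeword` and `residual_quantize` are hypothetical, and neither the paper's objective function nor its hierarchical linking between layer codebooks is reproduced here.

```python
import numpy as np


def nearest_codeword(x, codebook):
    """Return, for each row of x (n, d), the index of the closest row in codebook (k, d)."""
    # Squared Euclidean distance between every vector and every codeword: shape (n, k).
    dist2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dist2.argmin(axis=1)


def residual_quantize(x, codebooks):
    """Quantize x layer by layer; each layer encodes what the earlier layers missed."""
    residual = x.copy()          # what is still unexplained
    approx = np.zeros_like(x)    # running sum of the selected codewords
    indices = []
    for codebook in codebooks:
        idx = nearest_codeword(residual, codebook)
        selected = codebook[idx]
        approx += selected
        residual -= selected     # the next layer sees only the remaining error
        indices.append(idx)
    return indices, approx, residual


# Illustrative data: 8 vectors of dimension 4, three layers of 16 codewords each.
# The codebooks are random here (an assumption for the demo), not trained.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
codebooks = [rng.normal(scale=s, size=(16, 4)) for s in (1.0, 0.5, 0.25)]

indices, approx, residual = residual_quantize(x, codebooks)
```

By construction, `approx + residual` equals `x`, so the discrete code for each vector is one index per layer and the decoder input is the sum of the selected codewords. In a trained model the codebooks would be learned so that the remaining residual shrinks with each layer.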

Code Implementations: 1 repo
