LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations
This work offers an incremental improvement in the efficiency and scalability of discrete representation learning, aimed at machine learning researchers and practitioners.
The paper tackles codebook collapse in vector-quantized variational autoencoders (VQ-VAE) by introducing learnable lattice vector quantization. Compared to VQ-VAE, this yields lower reconstruction errors, faster training, and a parameter count for the quantization layer that stays constant at the embedding dimension $D$.
In this paper we introduce learnable lattice vector quantization and demonstrate its effectiveness for learning discrete representations. Our method, termed LL-VQ-VAE, replaces the vector quantization layer in VQ-VAE with lattice-based discretization. The learnable lattice imposes a structure over all discrete embeddings, which deters codebook collapse and leads to high codebook utilization. Compared to VQ-VAE, our method obtains lower reconstruction errors under the same training conditions, trains in a fraction of the time, and uses a constant number of parameters (equal to the embedding dimension $D$), making it a very scalable approach. We demonstrate these results on the FFHQ-1024 dataset and also include results on FashionMNIST and Celeb-A.
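To make the idea of lattice-based discretization concrete, the following is a minimal sketch, not the paper's implementation. It assumes an axis-aligned (diagonal) lattice whose per-dimension spacing is a vector of length $D$, which is one way to arrive at exactly $D$ parameters. The names `lattice_quantize` and `basis` are hypothetical. Training details such as gradient handling and loss terms are omitted, and the spacing is fixed here rather than learned.

```python
import numpy as np

def lattice_quantize(z, basis):
    """Snap each row of z to the nearest point of the axis-aligned lattice
    {basis * k : k an integer vector}.

    z     : (N, D) array of encoder outputs
    basis : (D,) array of positive per-dimension spacings (assumed diagonal lattice)
    Returns the quantized vectors and their integer lattice coordinates.
    """
    z = np.asarray(z, dtype=float)
    basis = np.asarray(basis, dtype=float)
    codes = np.rint(z / basis)           # nearest integer coordinate per dimension
    return codes * basis, codes.astype(int)

D = 4
basis = np.full(D, 0.5)                  # D parameters, independent of how many codes are used
z = np.array([[0.26, -0.74, 1.10, 0.00],
              [2.20,  0.24, -0.26, 0.76]])
z_q, codes = lattice_quantize(z, basis)
print(codes)
print(z_q)
```

In this sketch no explicit codebook is stored: every integer coordinate vector is a valid code, so the set of reachable embeddings is determined entirely by `basis`, and the per-dimension quantization error is at most half the corresponding spacing.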