Group-Wise Optimization for Self-Extensible Codebooks in Vector Quantized Models
This work addresses codebook inefficiencies in vector quantized models, offering machine learning researchers incremental improvements in reconstruction quality and in the flexibility of codebook size.
The paper tackles codebook collapse and limited codebook learning capability in Vector Quantized Variational Autoencoders (VQ-VAEs). It proposes Group-VQ, which uses group-wise optimization to improve the trade-off between codebook utilization and reconstruction performance, and it introduces a training-free resampling method for adjusting the codebook size after training. Image reconstruction experiments show improved reconstruction metrics.
Vector Quantized Variational Autoencoders (VQ-VAEs) leverage self-supervised learning through reconstruction tasks to represent continuous vectors by their closest vectors in a codebook. However, issues such as codebook collapse persist in VQ models. To address these issues, existing approaches either employ implicit static codebooks or jointly optimize the entire codebook, but both strategies constrain the codebook's learning capability and reduce reconstruction quality. In this paper, we propose Group-VQ, which performs group-wise optimization on the codebook. Each group is optimized independently, with joint optimization performed within groups. This approach improves the trade-off between codebook utilization and reconstruction performance. Additionally, we introduce a training-free codebook resampling method that allows the codebook size to be adjusted after training. In image reconstruction experiments under various settings, Group-VQ demonstrates improved performance on reconstruction metrics, and the post-training codebook resampling method achieves the desired flexibility in adjusting the codebook size.
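
As a minimal illustration of the quantization step and of codebook utilization mentioned above, the following NumPy sketch maps each continuous vector to its nearest codebook entry and reports the fraction of entries actually used. It is not the Group-VQ method itself. The function names `quantize` and `utilization`, the choice of squared Euclidean distance, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def quantize(z, codebook):
    """Replace each row of z with its nearest codebook vector.

    Nearest is measured by squared Euclidean distance (an assumption here).
    Returns the quantized vectors and the selected codebook indices.
    """
    # Pairwise squared distances, shape (num_vectors, codebook_size).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)
    return codebook[indices], indices

def utilization(indices, codebook_size):
    """Fraction of codebook entries selected at least once."""
    return np.unique(indices).size / codebook_size

# Toy 2-D codebook with four entries (hypothetical values).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Encoder outputs that cluster around the first two entries only.
z = np.array([[0.1, 0.1], [0.9, 0.2], [0.8, 0.1], [0.2, -0.1]])

z_q, idx = quantize(z, codebook)
print(idx)                              # selected code indices
print(utilization(idx, len(codebook)))  # share of codes in use
```

In this toy case only two of the four entries are ever selected, so utilization is 0.5. Codebook collapse is the extreme form of this pattern, in which most entries are never selected and therefore receive no useful updates during training.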