CV LG | Sep 23, 2024

Disentanglement with Factor Quantized Variational Autoencoders

Gulcin Baykal, Melih Kandemir, Gozde Unal

arXiv:2409.14851v33.71 citations | h-index: 17 | Has Code

Originality: Incremental advance

AI Analysis

Aimed at machine learning researchers, this work addresses the problem of learning latent representations in which the generative factors are independent of one another, and it presents an incremental improvement over existing disentanglement approaches.

The paper tackles disentangled representation learning without ground truth factor information by proposing a discrete variational autoencoder with scalar quantization and a total correlation term, achieving better disentanglement metrics (DCI and InfoMEC) and improved reconstruction compared to prior methods.

Disentangled representation learning aims to represent the underlying generative factors of a dataset in a latent representation independently of one another. In our work, we propose a discrete variational autoencoder (VAE) based model in which ground truth information about the generative factors is not provided to the model. We demonstrate the advantages of learning discrete representations over learning continuous representations in facilitating disentanglement. Furthermore, we propose incorporating an inductive bias into the model to further enhance disentanglement. Specifically, we propose scalar quantization of the latent variables in a latent representation with scalar values from a global codebook, and we add a total correlation term to the optimization as an inductive bias. Our method, called FactorQVAE, combines optimization-based disentanglement approaches with discrete representation learning, and it outperforms prior disentanglement methods in terms of two disentanglement metrics (DCI and InfoMEC) while improving the reconstruction performance. Our code can be found at https://github.com/ituvisionlab/FactorQVAE.
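
The two ingredients named in the abstract, scalar quantization against a global codebook and a total correlation term, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the three-value codebook, and the toy batch are hypothetical. The total correlation here is a simple plug-in estimate, TC = sum_j H(z_j) - H(z), computed on discrete codes, which may differ from the estimator the paper uses during training.

```python
import math
from collections import Counter


def quantize(latents, codebook):
    """Map each scalar latent to the index of its nearest codebook scalar.

    Every latent dimension shares the same (global) codebook.
    Ties go to the lower index.
    """
    return [
        min(range(len(codebook)), key=lambda k: abs(z - codebook[k]))
        for z in latents
    ]


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def total_correlation(codes):
    """Empirical TC = sum_j H(z_j) - H(z) over a batch of discrete code vectors.

    It is zero when the latent dimensions are empirically independent.
    """
    joint = entropy([tuple(row) for row in codes])
    marginals = sum(entropy(column) for column in zip(*codes))
    return marginals - joint


# Hypothetical global codebook and a toy batch of 2-dimensional latents.
codebook = [-1.0, 0.0, 1.0]
batch = [[-0.9, 0.2], [0.8, -0.1], [-1.2, 0.9], [1.1, 1.3]]

codes = [quantize(z, codebook) for z in batch]
print(codes)
print(total_correlation(codes))
```

In this toy batch the two quantized dimensions vary independently, so the estimated total correlation is approximately zero. A batch in which both dimensions always take the same code would instead give a positive value, which is the dependence a total correlation penalty discourages.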
