LG, CV | Jan 17, 2024

Unsupervised Multiple Domain Translation through Controlled Disentanglement in Variational Autoencoder

arXiv:2401.09180v2 | 2 citations | h-index: 3 | ICASSP
Originality: Incremental advance
AI Analysis

This paper addresses unsupervised domain translation in computer vision. It offers researchers and practitioners a VAE-based approach with better control over the latent space, but the advance is incremental because it builds on existing VAE and disentanglement concepts.

The paper tackles unsupervised multiple domain translation without paired data by proposing a modified Variational Autoencoder with two disentangled latent variables, one for domain and one for other factors, and demonstrates improved performance on vision datasets compared to existing methods.

Unsupervised Multiple Domain Translation is the task of transforming data from one domain to other domains without paired training data. Typically, methods based on Generative Adversarial Networks (GANs) are used to address this task. Our proposal, however, relies exclusively on a modified version of a Variational Autoencoder. The modification consists of using two latent variables that are disentangled in a controlled way by design. One of these latent variables is constrained to depend exclusively on the domain, while the other must depend on the remaining factors of variability in the data. Additionally, the conditions imposed on the domain latent variable allow for better control and understanding of the latent space. We empirically demonstrate that our approach works on different vision datasets, outperforming other well-known methods. Finally, we show that one of the latent variables indeed stores all the information related to the domain, while the other contains hardly any domain information.
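To make the two-latent idea concrete, here is a minimal sketch of translation by swapping the domain latent. It is not the authors' implementation. The trained stochastic encoder and decoder are replaced by a toy orthogonal linear map so that the example runs on its own, and the names `encode`, `decode`, `translate`, `target_code`, and the dimensions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DATA_DIM, DOMAIN_DIM = 6, 2  # hypothetical sizes, chosen only for illustration

# Toy stand-in for trained networks (assumption): an orthogonal matrix,
# so that decode() exactly inverts encode().
W, _ = np.linalg.qr(rng.normal(size=(DATA_DIM, DATA_DIM)))

def encode(x):
    """Map x to a latent code and split it into (domain part, remaining part)."""
    z = W @ x
    return z[:DOMAIN_DIM], z[DOMAIN_DIM:]

def decode(z_domain, z_rest):
    """Reconstruct a data vector from the two latent parts."""
    return W.T @ np.concatenate([z_domain, z_rest])

def translate(x, target_domain_code):
    """Keep the non-domain latent of x and swap in the target domain latent."""
    _, z_rest = encode(x)
    return decode(target_domain_code, z_rest)

x = rng.normal(size=DATA_DIM)
target_code = np.array([1.0, -1.0])  # hypothetical latent code of the target domain
y = translate(x, target_code)

# Re-encoding the translated sample recovers the swapped domain code
# and the untouched remaining factors.
y_domain, y_rest = encode(y)
```

In a real model the encoder would output distributions rather than points, and how the target domain code is obtained depends on the conditions the paper imposes on the domain latent, which the abstract does not spell out.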

Code Implementations: 1 repo
