CV · LG · IV · Aug 9, 2022

Disentangled Representation Learning Using ($β$-)VAE and GAN

arXiv:2208.04549v1 · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses feature disentanglement in image data for computer vision applications, but it is incremental as it builds on existing VAE and GAN methods.

The paper tackled disentangled representation learning for images whose objects vary in shape, size, rotation, and position, using a VAE combined with a GAN. The GAN improved image reconstruction quality, and disentanglement was examined by disrupting individual dimensions of the hidden vector.

The task of interest in this paper was to create a disentangled encoding, in the hidden-space vector of a Variational Autoencoder (VAE), of the features of objects in an image dataset: shape, size, rotation, and x-y position. The dSprites dataset provided the desired features for the experiments. After the VAE was trained in combination with a Generative Adversarial Network (GAN), each dimension of the hidden vector was disrupted in turn to explore how well that dimension was disentangled. The GAN was used to improve the quality of the reconstructed output images.
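The two ideas in the abstract, a $β$-weighted VAE objective and the disruption of one hidden dimension at a time, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the default `beta=4.0`, and the example hidden vector are all assumptions made for illustration, and the GAN component is left out entirely. The KL term is the standard closed form for a diagonal Gaussian posterior against a standard normal prior.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over hidden dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def beta_vae_loss(recon_error, mu, log_var, beta=4.0):
    """Reconstruction error plus a beta-weighted KL term.

    beta=1.0 gives the plain VAE objective; the default of 4.0 is an
    illustrative assumption, not a value taken from the paper.
    """
    return recon_error + beta * kl_to_standard_normal(mu, log_var)


def traverse_latent(z, dim, values):
    """Return copies of z in which only dimension `dim` is set to each value.

    Decoding each returned vector with a trained decoder (not shown here)
    would reveal which image feature that dimension controls.
    """
    disrupted = []
    for v in values:
        z_new = list(z)   # copy so the original hidden vector is untouched
        z_new[dim] = v
        disrupted.append(z_new)
    return disrupted


# Hypothetical hidden vector; in practice it would come from the encoder.
z = [0.2, -1.0, 0.5]
for z_new in traverse_latent(z, dim=1, values=[-2.0, 0.0, 2.0]):
    print(z_new)
```

If a dimension is well disentangled, decoding the disrupted vectors should change only one feature, for example rotation, while shape, size, and position stay fixed.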

