LG AI, Feb 6, 2025

Multiple Invertible and Partial-Equivariant Function for Latent Vector Transformation to Enhance Disentanglement in VAEs

Hee-Jun Jung, Jaehyoung Jeong, Kangil Kim

arXiv:2502.03740v24.1 h-index: 14

Originality: Incremental advance

AI Analysis

This work addresses the core issue of enhancing disentanglement in VAEs for better understanding and re-use of trained information, representing an incremental improvement over existing methods.

The paper addresses the lack of a concrete way to implement inductive bias in disentanglement learning for VAEs. It proposes MIPE-transformation, which combines invertible, partial-equivariant latent transformations with exponential family conversion, and reports improved disentanglement performance on the 3D Cars, 3D Shapes, and dSprites datasets.

Disentanglement learning is a core issue for understanding and re-using trained information in Variational AutoEncoders (VAEs), and effective inductive bias has been reported as a key factor. However, the actual implementation of such bias is still vague. In this paper, we propose a novel method, called Multiple Invertible and Partial-Equivariant transformation (MIPE-transformation), to inject inductive bias by 1) guaranteeing the invertibility of the latent-to-latent vector transformation while preserving a certain portion of the equivariance of the input-to-latent vector transformation, called Invertible and Partial-Equivariant transformation (IPE-transformation), 2) extending the form of the prior and posterior in VAE frameworks to an unrestricted form through a learnable conversion to an approximated exponential family, called Exponential Family conversion (EF-conversion), and 3) integrating multiple units of IPE-transformation and EF-conversion, and their training. In experiments on the 3D Cars, 3D Shapes, and dSprites datasets, MIPE-transformation improves the disentanglement performance of state-of-the-art VAEs.
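The abstract does not say how the invertibility of the latent-to-latent transformation is guaranteed. As a rough illustration only, the sketch below uses one generic construction: a linear map given by the matrix exponential of a parameter matrix, which is always invertible because det(exp(M)) = exp(tr(M)) > 0 and its inverse is exp(-M). The matrix `M`, the helper names, and the choice of a matrix exponential are all assumptions made for this example, not details taken from the paper.

```python
# Hypothetical sketch: an invertible linear latent-to-latent map built from a
# matrix exponential. Illustrative assumption only, not the authors' code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(m, terms=30):
    """Approximate exp(M) with a truncated Taylor series sum_k M^k / k!."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds M^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def apply(mat, z):
    """Apply a matrix to a latent vector."""
    return [sum(w * x for w, x in zip(row, z)) for row in mat]

def transform(m, z):
    """Forward map z -> exp(M) z."""
    return apply(mat_exp(m), z)

def inverse_transform(m, z_hat):
    """Inverse map z_hat -> exp(-M) z_hat."""
    neg = [[-x for x in row] for row in m]
    return apply(mat_exp(neg), z_hat)

# Hypothetical parameter matrix (would be learnable in practice) and latent vector.
M = [[0.2, -0.1, 0.0],
     [-0.1, 0.3, 0.05],
     [0.0, 0.05, -0.4]]
z = [1.0, -2.0, 0.5]

z_hat = transform(M, z)
z_back = inverse_transform(M, z_hat)  # recovers z up to floating-point error
```

Because M commutes with -M, exp(M) exp(-M) is the identity, so the round trip recovers the original latent vector for any choice of M. The truncated series is only accurate when the entries of M are small, as they are here.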
