CV · LG · Jul 22, 2019

Product of Orthogonal Spheres Parameterization for Disentangled Representation Learning

arXiv:1907.09554v1 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of learning interpretable and controllable representations for data generation, with incremental improvements in disentanglement performance.

The paper tackles disentangled representation learning by proposing a latent representation based on a product space of Orthogonal Spheres (PrOSe), which improves disentanglement quality significantly compared to state-of-the-art methods across multiple benchmarks and metrics.

Learning representations that can disentangle the explanatory attributes underlying the data improves interpretability and provides control over data generation. Various learning frameworks, such as VAEs, GANs, and auto-encoders, have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitioned representation or structured by a prior to impose disentangling. In this work, we advance the use of a latent representation based on a product space of Orthogonal Spheres (PrOSe). The PrOSe model is motivated by the reasoning that latent variables related to the physics of image formation can, under certain relaxed assumptions, lead to spherical spaces. Orthogonality between the spheres is motivated via physical independence models. Imposing the orthogonal-sphere constraint is much simpler than imposing more complicated physical models, is fairly general and flexible, and is extensible beyond the factors used to motivate its development. Under the further relaxed assumption of equal-sized latent blocks per factor, the constraint can be written down in closed form as an orthonormality term in the loss function. We show that our approach improves the quality of disentanglement significantly. We find consistent improvement in disentanglement compared to several state-of-the-art approaches, across several benchmarks and metrics.
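To make the closed-form constraint concrete, here is a minimal sketch of what an orthonormality term over equal-sized latent blocks could look like. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact formula, so the squared Frobenius penalty on Z Zᵀ − I, the function name `orthonormality_penalty`, and the parameter `num_factors` are all hypothetical choices. Each latent code is reshaped into a matrix Z with one row per factor; unit-norm rows place each block on a sphere, and zero off-diagonal Gram entries make the blocks mutually orthogonal.

```python
import numpy as np


def orthonormality_penalty(z, num_factors):
    """Hypothetical orthonormality term: mean over the batch of ||Z Z^T - I||_F^2.

    z           : array of shape (batch, num_factors * block_dim), the latent codes.
    num_factors : number of equal-sized latent blocks (an assumed hyperparameter).
    """
    z = np.asarray(z, dtype=float)
    batch, dim = z.shape
    if dim % num_factors != 0:
        raise ValueError("latent dimension must split into equal-sized blocks")

    # One row per factor: shape (batch, num_factors, block_dim).
    blocks = z.reshape(batch, num_factors, dim // num_factors)

    # Gram matrix of the blocks for each sample: shape (batch, k, k).
    # Diagonal entries are squared block norms, off-diagonals are inner products.
    gram = blocks @ blocks.transpose(0, 2, 1)

    # Penalise deviation from the identity (unit norms, zero inner products).
    residual = gram - np.eye(num_factors)
    return float(np.mean(np.sum(residual ** 2, axis=(1, 2))))


# Two orthogonal unit-norm blocks: the constraint is satisfied.
print(orthonormality_penalty(np.array([[1.0, 0.0, 0.0, 1.0]]), 2))  # 0.0

# Two identical unit-norm blocks: both off-diagonal entries equal 1.
print(orthonormality_penalty(np.array([[1.0, 0.0, 1.0, 0.0]]), 2))  # 2.0
```

In a training loop, a term of this kind would typically be added to the reconstruction or adversarial loss with a weighting coefficient. Note that the blocks can only be mutually orthogonal when the block dimension is at least the number of factors.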

