CV · Apr 7, 2021

Where and What? Examining Interpretable Disentangled Representations

arXiv:2104.05622v1 · 50 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of interpretability in disentanglement learning for unsupervised machine learning applications, offering a novel proxy for improving representation quality.

The paper tackles the problem of learning interpretable disentangled representations in an unsupervised setting by proposing a method that localizes the effect of each latent dimension on generated images and enforces encoding of simple variations, achieving high-quality disentanglement on various datasets.

Capturing interpretable variations has long been one of the goals in disentanglement learning. However, unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting. In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to be interpreted and what to be interpreted? A latent code is easy to interpret if it consistently impacts a certain subarea of the generated image. We thus propose to learn a spatial mask to localize the effect of each individual latent dimension. On the other hand, interpretability usually comes from latent dimensions that capture simple and basic variations in data. We thus impose a perturbation on a certain dimension of the latent code and expect to identify the perturbation along this dimension from the generated images, so that the encoding of simple variations can be enforced. Additionally, we develop an unsupervised model selection method, which accumulates perceptual distance scores along axes in the latent space. On various datasets, our models can learn high-quality disentangled representations without supervision, showing that the proposed modeling of interpretability is an effective proxy for achieving unsupervised disentanglement.
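The "where" and "what" ideas in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: `toy_generator`, `perturb`, and `identify_perturbed_dim` are hypothetical names, the generator is a hand-written function in which each latent dimension only touches its own segment of a 1-D "image" (standing in for a learned spatial mask), and the recognizer is a simple rule rather than a trained network. It only shows the shape of the task: perturb one latent dimension, generate both images, and recover which dimension changed from the images alone.

```python
import random


def toy_generator(z, width=8):
    """Hypothetical generator: latent dim k writes only to its own pixel segment.

    The fixed segments play the role of a spatial mask that localizes the
    effect of each latent dimension (an assumption for illustration).
    """
    seg = width // len(z)
    image = [0.0] * width
    for k, value in enumerate(z):
        for p in range(k * seg, (k + 1) * seg):
            image[p] = value
    return image


def perturb(z, dim, delta):
    """Return a copy of z with a perturbation added to one dimension."""
    z_new = list(z)
    z_new[dim] += delta
    return z_new


def identify_perturbed_dim(img_a, img_b, n_dims):
    """Stand-in recognizer: pick the segment with the largest absolute change.

    In a learned setting this would be a network trained to predict the
    perturbed dimension from the image pair.
    """
    seg = len(img_a) // n_dims
    scores = [
        sum(abs(img_b[p] - img_a[p]) for p in range(k * seg, (k + 1) * seg))
        for k in range(n_dims)
    ]
    return max(range(n_dims), key=lambda k: scores[k])


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(4)]
true_dim = 2
img_before = toy_generator(z)
img_after = toy_generator(perturb(z, true_dim, 0.5))
recovered = identify_perturbed_dim(img_before, img_after, n_dims=4)
```

Because each dimension here affects a disjoint region, the perturbed dimension is trivially recoverable. In an entangled generator the changes would overlap across regions and dimensions, which is why being able to identify the perturbation can serve as a training signal.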

Code Implementations: 1 repo
