LG · ML · Jun 24, 2018

Disentangled VAE Representations for Multi-Aspect and Missing Data

arXiv:1806.09060v1 · 11 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of handling incomplete multi-view or multi-modal data in machine learning applications, though the contribution appears incremental because it builds on existing VAE frameworks.

The paper tackles conditional modeling and sampling in multi-aspect data with missing observations by developing factVAE, a deep generative model whose effectiveness is demonstrated on real-world datasets such as motion capture poses and facial images.

Many problems in machine learning and related application areas are fundamentally variants of conditional modeling and sampling across multi-aspect data, either multi-view, multi-modal, or simply multi-group. For example, sampling from the distribution of English sentences conditioned on a given French sentence or sampling audio waveforms conditioned on a given piece of text. Central to many of these problems is the issue of missing data: we can observe many English, French, or German sentences individually but only occasionally do we have data for a sentence pair. Motivated by these applications and inspired by recent progress in variational autoencoders for grouped data, we develop factVAE, a deep generative model capable of handling multi-aspect data, robust to missing observations, and with a prior that encourages disentanglement between the groups and the latent dimensions. The effectiveness of factVAE is demonstrated on a variety of rich real-world datasets, including motion capture poses and pictures of faces captured from varying poses and perspectives.
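To make the missing-data setting concrete, here is a minimal sketch of one common way to keep unobserved views from contributing to a reconstruction loss. It is not the factVAE objective from the paper; the function name `masked_reconstruction_loss`, the squared-error loss, and the boolean `observed` mask are all assumptions introduced for illustration.

```python
import numpy as np


def masked_reconstruction_loss(targets, reconstructions, observed):
    """Squared error averaged over observed (example, view) pairs only.

    Hypothetical helper for illustration, not the paper's objective.

    targets, reconstructions: lists with one array per view, each of
        shape (batch, dim_v). Missing targets may hold NaN.
    observed: boolean array of shape (batch, n_views); True where the
        view was actually recorded for that example.
    """
    total = 0.0
    for v, (x, x_hat) in enumerate(zip(targets, reconstructions)):
        # Per-example squared error for view v, shape (batch,).
        per_example = np.sum((x - x_hat) ** 2, axis=1)
        # Select 0.0 for missing views so NaN placeholders never propagate.
        total += np.sum(np.where(observed[:, v], per_example, 0.0))
    # Normalise by the number of observed pairs (guard against zero).
    return total / max(int(observed.sum()), 1)


# Two examples, two views; the second example is missing view 0.
targets = [
    np.array([[1.0, 2.0], [np.nan, np.nan]]),
    np.array([[0.0], [3.0]]),
]
reconstructions = [
    np.array([[1.0, 0.0], [5.0, 5.0]]),
    np.array([[1.0], [3.0]]),
]
observed = np.array([[True, True], [False, True]])

loss = masked_reconstruction_loss(targets, reconstructions, observed)
```

In this toy setup, only three of the four (example, view) pairs are observed, so the missing view of the second example neither adds error nor affects the normalisation.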

