LG · ML · Feb 25, 2021

Variational Selective Autoencoder: Learning from Partially-Observed Heterogeneous Data

arXiv:2102.12679v1 · 14 citations
Originality: Highly original
AI Analysis

The paper addresses a challenge common in real-world applications, where heterogeneous data often have missing values, and offers a unified model for downstream tasks such as imputation and generation.

The paper tackles learning from partially-observed heterogeneous data by proposing the Variational Selective Autoencoder (VSAE), which models latent dependencies to handle both missingness and heterogeneity. The authors report improved data generation and imputation performance over state-of-the-art models.

Learning from heterogeneous data poses challenges such as combining data from various sources and of different types. Meanwhile, heterogeneous data are often associated with missingness in real-world applications due to heterogeneity and noise of input sources. In this work, we propose the variational selective autoencoder (VSAE), a general framework to learn representations from partially-observed heterogeneous data. VSAE learns the latent dependencies in heterogeneous data by modeling the joint distribution of observed data, unobserved data, and the imputation mask which represents how the data are missing. It results in a unified model for various downstream tasks including data generation and imputation. Evaluation on both low-dimensional and high-dimensional heterogeneous datasets for these two tasks shows improvement over state-of-the-art models.
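To make the three quantities named in the abstract concrete (observed data, unobserved data, and the mask), here is a minimal sketch of how a partially-observed record set might be represented. It is not the VSAE model or the authors' code: the toy values, the NaN convention, and the helper name `split_by_mask` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: column 0 is continuous, column 1 is a categorical code.
x = np.array([[0.5, 2.0],
              [1.2, 0.0],
              [3.1, 1.0]])

# Binary mask (illustrative values): 1 = entry observed, 0 = entry missing.
m = np.array([[1, 0],
              [1, 1],
              [0, 1]])

def split_by_mask(x, m):
    """Split x into observed and unobserved parts, using NaN as a placeholder.

    Hypothetical helper: x_o keeps entries where m == 1, x_u keeps entries
    where m == 0. Every entry lands in exactly one of the two arrays.
    """
    x_o = np.where(m == 1, x, np.nan)
    x_u = np.where(m == 0, x, np.nan)
    return x_o, x_u

x_o, x_u = split_by_mask(x, m)
```

In a real setting the unobserved entries are not available; the toy ground truth is kept here only so the split is visible. A model of the kind the abstract describes would take `x_o` and `m` as inputs and treat `x_u` as the quantity to impute.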

