ML · LG · GN · Mar 6, 2020

BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders

arXiv:2003.03462v1 · 2 citations
AI Analysis

BasisVAE addresses the need for more interpretable and efficient analysis in domains such as genomics, though it is an incremental improvement over existing VAE methods.

Analyses of high-dimensional tabular data such as genomics typically need separate steps for dimensionality reduction and feature clustering. The authors propose BasisVAE, a joint modeling framework that integrates the two tasks, and demonstrate scalable inference on single-cell gene expression data.

Variational Autoencoders (VAEs) provide a flexible and scalable framework for non-linear dimensionality reduction. However, in application domains such as genomics where data sets are typically tabular and high-dimensional, a black-box approach to dimensionality reduction does not provide sufficient insights. Common data analysis workflows additionally use clustering techniques to identify groups of similar features. This usually leads to a two-stage process; however, it would be desirable to construct a joint modelling framework for simultaneous dimensionality reduction and clustering of features. In this paper, we propose to achieve this through the BasisVAE: a combination of the VAE and a probabilistic clustering prior, which lets us learn a one-hot basis function representation as part of the decoder network. Furthermore, for scenarios where not all features are aligned, we develop an extension to handle translation-invariant basis functions. We show how a collapsed variational inference scheme leads to scalable and efficient inference for BasisVAE, demonstrated on various toy examples as well as on single-cell gene expression data.
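To make the "one-hot basis function representation" in the abstract more concrete, here is a minimal NumPy sketch of a decoder mean in which every feature is tied to exactly one of K shared basis functions and may carry its own translation. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian-bump basis functions, the one-dimensional latent variable, and the names `decode`, `centers`, `scales`, and `shifts` are all hypothetical, whereas the abstract describes basis functions learned as part of a decoder network and fitted with collapsed variational inference.

```python
import numpy as np


def decode(z, assignments, centers, scales, shifts=None, width=1.0):
    """Decoder mean for P features that share K basis functions.

    z:           (N,) one-dimensional latent coordinate per sample
    assignments: (P, K) one row per feature; one-hot rows tie a feature
                 to a single basis function (cluster)
    centers:     (K,) centers of the hypothetical Gaussian-bump bases
    scales:      (P,) per-feature amplitude
    shifts:      (P,) optional per-feature translation of the latent axis
    Returns an (N, P) array of decoded means.
    """
    n_features = assignments.shape[0]
    if shifts is None:
        shifts = np.zeros(n_features)
    # Translate the latent axis separately for each feature: (N, P)
    z_shifted = z[:, None] + shifts[None, :]
    # Evaluate every basis function for every feature: (N, P, K)
    diff = z_shifted[:, :, None] - centers[None, None, :]
    phi = np.exp(-0.5 * (diff / width) ** 2)
    # Keep, per feature, only the basis function it is assigned to
    selected = np.einsum("npk,pk->np", phi, assignments)
    return scales[None, :] * selected


# Toy setup: 5 samples, 3 features, 2 clusters (all values are made up)
z = np.linspace(-2.0, 2.0, 5)
centers = np.array([-1.0, 1.0])
assignments = np.array([[1.0, 0.0],   # feature 0 -> basis 0
                        [0.0, 1.0],   # feature 1 -> basis 1
                        [1.0, 0.0]])  # feature 2 -> basis 0
scales = np.ones(3)

mean = decode(z, assignments, centers, scales)
shifted = decode(z, assignments, centers, scales,
                 shifts=np.array([0.0, 0.0, 1.0]))
```

In this toy setup features 0 and 2 belong to the same cluster, so their decoded curves coincide in `mean`. In `shifted`, feature 2 follows the same curve displaced by one unit along the latent axis, which is the kind of misalignment the translation-invariant extension is meant to absorb.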

Code Implementations: 1 repo