LG | GN | ML | Sep 7, 2017

A deep generative model for gene expression profiles from single-cell RNA sequencing

arXiv:1709.02082v4 | 4 citations
Originality: Incremental advance
AI Analysis

This work gives biomedical researchers a scalable, flexible model for analyzing single-cell RNA sequencing data. It appears incremental because it builds on existing probabilistic modeling and variational inference frameworks.

The authors model gene expression data from single-cell RNA sequencing with a deep generative model. Inference scales to over one million cells, and on several tasks the model outperforms state-of-the-art methods such as ZIFA and ZINB-WaVE. The framework also extends to batch effects and other confounding factors, and its Bayesian hypothesis test for differential expression outperforms DESeq2.

We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for technical effects that may erroneously set some observations of gene expression levels to zero. Conditional distributions are specified by neural networks, giving the proposed model enough flexibility to fit the data well. We use variational inference and stochastic optimization to approximate the posterior distribution. The inference procedure scales to over one million cells, whereas competing algorithms do not. Even for smaller datasets, for several tasks, the proposed procedure outperforms state-of-the-art methods like ZIFA and ZINB-WaVE. We also extend our framework to account for batch effects and other confounding factors, and propose a Bayesian hypothesis test for differential expression that outperforms DESeq2.
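
The abstract says that extra latent variables account for technical effects that set some observed counts to zero, but it does not name the count distribution. As a minimal sketch, assuming a zero-inflated negative binomial likelihood (an assumption, not something stated above), the per-gene log-probability could look like the following. The names `nb_log_pmf`, `zinb_log_pmf`, `mu`, `theta`, and `pi` are hypothetical; in a model of the kind described, such parameters would be produced by neural networks from each cell's latent representation.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log pmf of a negative binomial with mean mu > 0 and inverse dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log pmf of a zero-inflated negative binomial.

    pi (assumed to lie in [0, 1)) is the probability that a technical
    effect forces the observed count to zero regardless of expression.
    """
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a technical zero or a genuine zero draw from the NB.
        return math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A nonzero count can only arise when no technical zero occurred.
    return math.log(1.0 - pi) + log_nb
```

With `pi = 0` the function reduces to the plain negative binomial, and raising `pi` shifts probability mass onto zero, which is the behaviour the abstract attributes to its technical-effect variables.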

Code Implementations: 2 repos
