LG · GN · ML · Oct 13, 2017

A deep generative model for single-cell RNA sequencing with application to detecting differentially expressed genes

arXiv:1710.05086v2 · 7 citations
AI Analysis

This work addresses the challenge of interpreting noisy and high-dimensional gene expression data for researchers in genomics and computational biology, offering a scalable and flexible solution with practical improvements over existing tools.

The authors propose a deep generative model for single-cell RNA sequencing data that accounts for technical zeros and batch effects. Inference scales to over one million cells. The model outperforms state-of-the-art methods such as ZIFA and ZINB-WaVE on several tasks, and DESeq2 on differential expression detection.

We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for technical effects that may erroneously set some observations of gene expression levels to zero. Conditional distributions are specified by neural networks, giving the proposed model enough flexibility to fit the data well. We use variational inference and stochastic optimization to approximate the posterior distribution. The inference procedure scales to over one million cells, whereas competing algorithms do not. Even for smaller datasets, for several tasks, the proposed procedure outperforms state-of-the-art methods like ZIFA and ZINB-WaVE. We also extend our framework to take into account batch effects and other confounding factors and propose a natural Bayesian hypothesis framework for differential expression that outperforms the traditional DESeq2.
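To make the idea of technical zeros concrete, here is a minimal sketch of a zero-inflated count likelihood. It assumes a zero-inflated negative binomial observation model, which the abstract does not name. The function names and the parameters `mu` (mean), `theta` (inverse dispersion), and `pi` (probability of a technical zero) are illustrative. In the full model these parameters would be produced by neural networks from the latent representation, a step this sketch omits.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and inverse dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x when, with probability pi (0 <= pi < 1),
    the observation is forced to zero by a technical effect; otherwise
    the count is drawn from the negative binomial."""
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can be technical (pi) or a genuine NB zero.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A nonzero count can only come from the NB component.
    return math.log(1.0 - pi) + nb


# Hypothetical parameter values, chosen only for illustration.
log_p_zero = zinb_log_pmf(0, mu=2.0, theta=1.5, pi=0.3)
log_p_three = zinb_log_pmf(3, mu=2.0, theta=1.5, pi=0.3)
```

With `pi = 0` the expression reduces to the plain negative binomial. Any `pi > 0` moves probability mass onto zero, which is how such a model can explain more zeros than the count distribution alone would predict.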
