ML · LG · GN · AP · May 6, 2024

Scalable Amortized GPLVMs for Single Cell Transcriptomics Data

arXiv:2405.03879v1
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable dimensionality reduction in single-cell transcriptomics, though it appears incremental as it matches rather than surpasses existing methods.

The authors address the poor cell-type clustering of scalable Gaussian Process Latent Variable Models (GPLVMs) on single-cell RNA-seq data. They introduce an amortized stochastic variational Bayesian GPLVM with specialized encoder, kernel, and likelihood designs, which matches the performance of the leading scVI approach on synthetic and real-world COVID datasets.

Dimensionality reduction is crucial for analyzing large-scale single-cell RNA-seq data. Gaussian Process Latent Variable Models (GPLVMs) offer an interpretable dimensionality reduction method, but current scalable models are ineffective at clustering cell types. We introduce an improved model, the amortized stochastic variational Bayesian GPLVM (BGPLVM), tailored for single-cell RNA-seq with specialized encoder, kernel, and likelihood designs. This model matches the performance of the leading single-cell variational inference (scVI) approach on synthetic and real-world COVID datasets. It also effectively incorporates cell-cycle and batch information to reveal more interpretable latent structures, as we demonstrate on an innate immunity dataset.
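To make the term "amortized" concrete, here is a minimal sketch of the general idea, not the paper's actual encoder. Instead of storing a separate variational mean and variance for every cell, a shared encoder maps each cell's expression counts to those parameters. The linear encoder, the weight names `W_mu` and `W_logvar`, the normalization to 10,000 counts per cell, and the toy Poisson data are all assumptions made for illustration.

```python
import numpy as np

def amortized_encoder(counts, W_mu, W_logvar, rng):
    """Map count vectors to variational parameters and one latent sample.

    counts:   (n_cells, n_genes) non-negative expression counts
    W_mu:     (n_genes, latent_dim) hypothetical weights for the mean
    W_logvar: (n_genes, latent_dim) hypothetical weights for the log-variance
    """
    # Library-size normalise and log-transform (an assumed preprocessing step).
    lib = counts.sum(axis=1, keepdims=True)
    x = np.log1p(counts / np.maximum(lib, 1.0) * 1e4)

    # Shared (amortized) maps: the same weights serve every cell.
    mu = x @ W_mu
    logvar = x @ W_logvar

    # Reparameterised sample z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    return mu, logvar, z

# Toy data: 5 cells, 20 genes, 2 latent dimensions (all hypothetical).
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(5, 20)).astype(float)
W_mu = rng.normal(scale=0.1, size=(20, 2))
W_logvar = rng.normal(scale=0.1, size=(20, 2))

mu, logvar, z = amortized_encoder(counts, W_mu, W_logvar, rng)
print(mu.shape, logvar.shape, z.shape)
```

In a full model, the sampled `z` would feed the Gaussian process mapping and the count likelihood, and the encoder weights would be trained on mini-batches by maximizing a variational lower bound. Those parts are omitted here.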

