QM · LG · GN · Jul 16, 2024

Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen

arXiv:2407.11734v2 · 16 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the need for more realistic and controllable generation of single-cell data in computational biology, offering incremental improvements over existing methods.

The paper tackles the problem of generative modeling for single-cell RNA-seq data by introducing CFGen, a flow-based model that preserves data discreteness, resulting in improved recovery of biological characteristics and effective handling of tasks like rare cell type augmentation and batch correction.

Generative modeling of single-cell RNA-seq data is crucial for tasks like trajectory inference, batch effect removal, and simulation of realistic cellular data. However, recent deep generative models simulating synthetic single cells from noise operate on pre-processed continuous gene expression approximations, overlooking the discrete nature of single-cell data, which limits their effectiveness and hinders the incorporation of robust noise models. Additionally, aspects like controllable multi-modal and multi-label generation of cellular data remain underexplored. This work introduces CellFlow for Generation (CFGen), a flow-based conditional generative model that preserves the inherent discreteness of single-cell data. CFGen generates whole-genome multi-modal single-cell data reliably, improving the recovery of crucial biological data characteristics while tackling relevant generative tasks such as rare cell type augmentation and batch correction. We also introduce a novel framework for compositional data generation using Flow Matching. By showcasing CFGen on a diverse set of biological datasets and settings, we provide evidence of its value to the fields of computational biology and deep generative models.
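The abstract names two ingredients, Flow Matching and generation that respects discrete counts, without giving details. As a rough illustration only, and not the authors' implementation, the sketch below shows the standard linear-path flow matching regression target, Euler integration of a velocity field, and a negative binomial sampler as one common noise model for count data. All function names, the choice of negative binomial, and the exponential link from generated values to count means are assumptions made for this example.

```python
import numpy as np


def fm_training_pair(x0, x1, t):
    """Return a point on the straight path from x0 to x1 and its target velocity.

    x0: noise samples, shape (n, d); x1: data samples, shape (n, d);
    t: times in [0, 1], shape (n,).
    """
    t = t[:, None]
    x_t = (1.0 - t) * x0 + t * x1  # linear interpolation between noise and data
    u_t = x1 - x0                  # velocity a network would be regressed onto
    return x_t, u_t


def euler_generate(velocity_fn, x0, n_steps=100):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full(x.shape[0], k * dt)
        x = x + dt * velocity_fn(x, t)
    return x


def sample_counts(mean, inverse_dispersion, rng):
    """Draw integer counts from a negative binomial (gamma-Poisson mixture)."""
    rate = rng.gamma(shape=inverse_dispersion, scale=mean / inverse_dispersion)
    return rng.poisson(rate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=(4, 3))
    target = rng.normal(size=(4, 3))

    # Stand-in for a trained network: the exact velocity for this known pair.
    generated = euler_generate(lambda x, t: target - noise, noise)

    # Assumed exponential link maps generated values to positive count means.
    counts = sample_counts(np.exp(generated), inverse_dispersion=5.0, rng=rng)
    print(counts)
```

In practice the lambda would be replaced by a learned, condition-aware velocity network. The point of the final step is that the output stays integer-valued rather than being a continuous approximation of expression.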

Code Implementations: 1 repo
