QM · AI · Feb 6

scDFM: Distributional Flow Matching Model for Robust Single-Cell Perturbation Prediction

arXiv:2602.07103v1 · 4 citations · h-index: 2 · Has Code
Originality: Highly original
AI Analysis

This work addresses a key problem in systems biology and drug discovery, predicting how cells respond to perturbations, and offers a robust solution with measurable improvements.

The paper tackles the challenge of predicting transcriptional responses to perturbations from single-cell data, which are noisy and sparse. It introduces scDFM, a generative model that outperforms prior methods and, in the combinatorial setting, reduces mean squared error by 19.6% relative to the strongest baseline.

A central goal in systems biology and drug discovery is to predict the transcriptional response of cells to perturbations. This task is challenging due to the noisy and sparse nature of single-cell measurements, as well as the fact that perturbations often induce population-level shifts rather than changes in individual cells. Existing deep learning methods typically assume cell-level correspondences, limiting their ability to capture such global effects. We present scDFM, a generative framework based on conditional flow matching that models the full distribution of perturbed cells conditioned on control states. By incorporating a maximum mean discrepancy (MMD) objective, our method aligns perturbed and control populations beyond cell-level correspondences. To further improve robustness to sparsity and noise, we introduce the Perturbation-Aware Differential Transformer (PAD-Transformer), a backbone architecture that leverages gene interaction graphs and differential attention to capture context-specific expression changes. Across multiple genetic and drug perturbation benchmarks, scDFM consistently outperforms prior methods, demonstrating strong generalization in both unseen and combinatorial settings. In the combinatorial setting, it reduces mean squared error by 19.6% relative to the strongest baseline. These results highlight the importance of distribution-level generative modeling for robust in silico perturbation prediction. The code is available at https://github.com/AI4Science-WestlakeU/scDFM
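The two ingredients named in the abstract, a maximum mean discrepancy (MMD) term and a flow matching objective, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the scDFM code: the RBF kernel and its `bandwidth`, the biased MMD estimator, the straight-line interpolation path, the synthetic Gaussian "cells", and the function names `rbf_kernel`, `mmd2`, and `flow_matching_pair`.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between the rows of x and the rows of y (assumed kernel)."""
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between two cell populations."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


def flow_matching_pair(x0, x1, t):
    """Point on a straight-line path from x0 to x1 and its target velocity.

    A learned velocity field would be regressed onto v_target at (x_t, t);
    the linear path is a common choice, assumed here for illustration.
    """
    t = np.asarray(t, dtype=float)[:, None]
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target


# Synthetic stand-ins for control and perturbed expression profiles.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(200, 5))
perturbed = rng.normal(1.0, 1.0, size=(200, 5))

mmd_same = mmd2(control, control)
mmd_shifted = mmd2(control, perturbed)

# Endpoints of the interpolation path for a few arbitrarily paired cells.
x_start, v = flow_matching_pair(control[:3], perturbed[:3], np.zeros(3))
x_end, _ = flow_matching_pair(control[:3], perturbed[:3], np.ones(3))
```

The MMD estimate compares whole populations and needs no pairing between individual control and perturbed cells, which is the property the abstract emphasizes. The pairing inside `flow_matching_pair` is arbitrary in this sketch.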

