GN | LG | Jul 1, 2025

Modeling Gene Expression Distributional Shifts for Unseen Genetic Perturbations

arXiv:2507.02980v1 | 2 citations | h-index: 116
Originality: Incremental advance
AI Analysis

This work addresses the need for more realistic models of cellular responses in early-stage drug discovery. It offers an incremental improvement by modeling full expression distributions rather than mean changes alone, and by using gene embeddings from LLMs as prior knowledge.

The paper addresses the problem of predicting distributional gene expression responses to genetic perturbations, a task relevant to drug discovery, by training a neural network that models expression distributions rather than only mean changes. The resulting model outperforms baselines in capturing higher-order statistics such as variance, skewness, and kurtosis at a fraction of the training cost, while remaining competitive in mean prediction.

We train a neural network to predict distributional responses in gene expression following genetic perturbations. This is an essential task in early-stage drug discovery, where such responses can offer insights into gene function and inform target identification. Existing methods only predict changes in the mean expression, overlooking stochasticity inherent in single-cell data. In contrast, we offer a more realistic view of cellular responses by modeling expression distributions. Our model predicts gene-level histograms conditioned on perturbations and outperforms baselines in capturing higher-order statistics, such as variance, skewness, and kurtosis, at a fraction of the training cost. To generalize to unseen perturbations, we incorporate prior knowledge via gene embeddings from large language models (LLMs). While modeling a richer output space, the method remains competitive in predicting mean expression changes. This work offers a practical step towards more expressive and biologically informative models of perturbation effects.
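To make the higher-order statistics named in the abstract concrete, the following is a minimal sketch of how mean, variance, skewness, and excess kurtosis could be read off a single gene-level histogram. It is not the paper's implementation: the function name `histogram_moments`, the bin layout, and the choice to place each bin's mass at its center are all assumptions made for illustration.

```python
import math


def histogram_moments(probs, bin_edges):
    """Summary statistics of a binned distribution (hypothetical helper).

    probs: nonnegative weight per bin; normalized here to sum to 1.
    bin_edges: len(probs) + 1 increasing values delimiting the bins.
    Assumption: each bin's mass sits at its center, which is an approximation.
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have exactly one more entry than probs")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probs must have positive total mass")

    p = [w / total for w in probs]
    centers = [0.5 * (lo + hi) for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    mean = sum(pi * c for pi, c in zip(p, centers))

    def central_moment(k):
        # k-th central moment of the binned distribution
        return sum(pi * (c - mean) ** k for pi, c in zip(p, centers))

    variance = central_moment(2)
    if variance == 0:
        # Skewness and kurtosis are undefined for a point mass.
        skewness = kurtosis = math.nan
    else:
        skewness = central_moment(3) / variance ** 1.5
        kurtosis = central_moment(4) / variance ** 2 - 3.0  # excess kurtosis

    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Two made-up histograms over the same bins: identical means, different shapes.
edges = [0.0, 1.0, 2.0, 3.0]
control = histogram_moments([0.25, 0.50, 0.25], edges)
perturbed = histogram_moments([0.45, 0.10, 0.45], edges)
print(control)
print(perturbed)
```

In this toy case both histograms have the same mean, so a mean-only predictor would report no effect, while the variance and kurtosis differ. That gap is what the abstract refers to as the stochasticity overlooked by mean-based methods.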

