CL · CE · LG · CB · Oct 9, 2025

Large Language Models Meet Virtual Cell: A Survey

arXiv:2510.07706v1 · h-index: 8
Originality: Synthesis-oriented
AI Analysis

The survey offers a comprehensive review for researchers in computational biology, but its contribution is incremental: it synthesizes existing work and reports no new experimental results.

This survey covers the integration of large language models (LLMs) into virtual cell modeling, proposing a unified taxonomy and reviewing core tasks, models, datasets, and challenges.

Large language models (LLMs) are transforming cellular biology by enabling the development of "virtual cells"--computational systems that represent, predict, and reason about cellular states and behaviors. This work provides a comprehensive review of LLMs for virtual cell modeling. We propose a unified taxonomy that organizes existing methods into two paradigms: LLMs as Oracles, for direct cellular modeling, and LLMs as Agents, for orchestrating complex scientific tasks. We identify three core tasks--cellular representation, perturbation prediction, and gene regulation inference--and review their associated models, datasets, evaluation benchmarks, as well as the critical challenges in scalability, generalizability, and interpretability.

