CVGN · May 25, 2025

Scalable Generation of Spatial Transcriptomics from Histology Images via Whole-Slide Flow Matching

arXiv:2506.05361v1 · 15 citations · h-index: 21 · ICML
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for researchers in spatial transcriptomics by providing a scalable method to accelerate gene expression profiling from histology images, though it appears incremental as it builds on prior prediction efforts.

The paper tackles the prediction of spatial transcriptomics from whole-slide histology images, addressing two limitations of prior work: the lack of explicit cell-cell interaction modeling and encoder memory constraints. The proposed STFlow substantially outperforms state-of-the-art baselines on the HEST-1k and STImage-1K4M benchmarks and achieves over 18% relative improvement over pathology foundation models.

Spatial transcriptomics (ST) has emerged as a powerful technology for bridging histology imaging with gene expression profiling. However, its application has been limited by low throughput and the need for specialized experimental facilities. Prior works sought to predict ST from whole-slide histology images to accelerate this process, but they suffer from two major limitations. First, they do not explicitly model cell-cell interaction as they factorize the joint distribution of whole-slide ST data and predict the gene expression of each spot independently. Second, their encoders struggle with memory constraints due to the large number of spots (often exceeding 10,000) in typical ST datasets. Herein, we propose STFlow, a flow matching generative model that considers cell-cell interaction by modeling the joint distribution of gene expression of an entire slide. It also employs an efficient slide-level encoder with local spatial attention, enabling whole-slide processing without excessive memory overhead. On the recently curated HEST-1k and STImage-1K4M benchmarks, STFlow substantially outperforms state-of-the-art baselines and achieves over 18% relative improvements over the pathology foundation models.
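
The two ideas named in the abstract, local spatial attention and flow matching over a whole slide, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`knn_indices`, `local_attention`, `flow_matching_pair`), the k-nearest-neighbour definition of "local", the Gaussian prior, and the linear interpolation path are all assumptions chosen for illustration, and STFlow's actual architecture and training objective may differ.

```python
import numpy as np

def knn_indices(coords, k):
    """Indices of the k nearest spots (including the spot itself) for each spot.

    Brute-force pairwise distances keep the toy example short; a real slide
    with >10,000 spots would use a KD-tree or similar spatial index instead.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    return np.argsort(dist2, axis=1)[:, :k]

def local_attention(features, coords, k=8):
    """Hypothetical local attention: each spot attends only to its k spatial
    neighbours, so the score matrix is (N, k) rather than (N, N)."""
    n, d = features.shape
    idx = knn_indices(coords, k)                 # (N, k)
    neigh = features[idx]                        # (N, k, d)
    scores = np.einsum("nd,nkd->nk", features, neigh) / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", weights, neigh)

def flow_matching_pair(x1, rng):
    """Build one generic flow-matching training example for a whole slide.

    x1: (N, G) gene expression for all N spots of one slide.
    Assumes a standard Gaussian prior and the linear path
    x_t = (1 - t) * x0 + t * x1, whose target velocity is x1 - x0.
    A single t is shared by all spots, so the slide is noised jointly.
    """
    x0 = rng.standard_normal(x1.shape)
    t = rng.uniform()
    xt = (1.0 - t) * x0 + t * x1
    return xt, t, x1 - x0
```

In such a setup, a slide-level network would take `xt`, `t`, the spot coordinates, and image features, mix information across neighbouring spots with something like `local_attention`, and be trained to regress the target velocity. Because every spot's noisy expression is visible to its neighbours, the model describes the joint distribution over the slide rather than predicting each spot independently.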

