Jul 15, 2025

SToFM: a Multi-scale Foundation Model for Spatial Transcriptomics

arXiv:2507.11588v2 | 10 citations | h-index: 7 | ICML
Originality: Highly original
AI Analysis

This work addresses the analysis of complex spatial transcriptomics data, proposing a novel method for a known bottleneck in the field.

The authors tackle the challenge of modeling spatial transcriptomics (ST) data by proposing SToFM, a multi-scale foundation model that integrates macro-scale tissue morphology, the micro-scale cellular microenvironment, and gene-scale expression profiles. The model achieves outstanding performance on tasks such as tissue region semantic segmentation and cell type annotation.

Spatial Transcriptomics (ST) technologies provide biologists with rich insights into single-cell biology by preserving spatial context of cells. Building foundational models for ST can significantly enhance the analysis of vast and complex data sources, unlocking new perspectives on the intricacies of biological tissues. However, modeling ST data is inherently challenging due to the need to extract multi-scale information from tissue slices containing vast numbers of cells. This process requires integrating macro-scale tissue morphology, micro-scale cellular microenvironment, and gene-scale gene expression profile. To address this challenge, we propose SToFM, a multi-scale Spatial Transcriptomics Foundation Model. SToFM first performs multi-scale information extraction on each ST slice, to construct a set of ST sub-slices that aggregate macro-, micro- and gene-scale information. Then an SE(2) Transformer is used to obtain high-quality cell representations from the sub-slices. Additionally, we construct SToCorpus-88M, the largest high-resolution spatial transcriptomics corpus for pretraining. SToFM achieves outstanding performance on a variety of downstream tasks, such as tissue region semantic segmentation and cell type annotation, demonstrating its comprehensive understanding of ST data through capturing and integrating multi-scale information.
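
The abstract mentions an SE(2) Transformer but does not say how the symmetry is enforced. SE(2) is the group of 2D rotations and translations, and one common way to respect it is to feed a model pairwise cell distances, which do not change when a tissue slice is rotated or shifted. The sketch below illustrates only that invariance property with NumPy; the function names `pairwise_distances` and `apply_se2` and the random coordinates are hypothetical and are not taken from SToFM.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance between every pair of 2D cell positions."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def apply_se2(coords, theta, shift):
    """Rotate positions by theta (radians), then translate by shift."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return coords @ rot.T + shift


# Hypothetical sub-slice: five cells with random (x, y) positions.
rng = np.random.default_rng(0)
cells = rng.uniform(0.0, 100.0, size=(5, 2))

# The same sub-slice after an arbitrary rotation and translation.
moved = apply_se2(cells, theta=0.7, shift=np.array([12.0, -3.5]))

# Distances agree up to floating-point error, so any feature built
# from them is unaffected by how the slice was placed on the plane.
invariant = np.allclose(pairwise_distances(cells), pairwise_distances(moved))
print(invariant)
```

Whether SToFM uses distance features, relative-position encodings, or another mechanism is not stated in the abstract, so this should be read as background on the symmetry rather than a description of the model.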

