Spatial Transcriptomics Analysis of Zero-shot Gene Expression Prediction
This work addresses a limitation of spatial transcriptomics analysis that matters to biomedical researchers: existing supervised models cannot predict expression for gene types absent from training. The contribution is incremental, since it builds on existing methods and adapts them to the zero-shot setting.
The paper tackles the problem of predicting gene expression from spatial transcriptomics slide images for gene types unseen during training. It proposes a zero-shot framework that uses semantic embeddings derived from a large language model and reports performance competitive with supervised methods.
Spatial transcriptomics (ST) captures gene expression within distinct regions (i.e., windows) of a tissue slide. Traditional supervised learning frameworks for modeling ST can only predict expression from slide image windows for gene types seen during training, and fail to generalize to unseen gene types. To overcome this limitation, we propose a semantic guided network (SGN), a pioneering zero-shot framework for predicting gene expression from slide image windows. Since a gene type can be described by its functionality and phenotype, we dynamically embed each gene type into a vector based on that description, and use this vector to project slide image windows to gene expression in feature space, enabling zero-shot expression prediction for unseen gene types. The functionality and phenotype of each gene type are obtained by querying a pre-trained large language model (LLM) with a carefully designed prompt. On standard benchmark datasets, we demonstrate zero-shot performance competitive with past state-of-the-art supervised learning approaches.
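The core idea, embedding a textual gene description into a vector and using that vector to map image-window features to an expression score, can be illustrated with a minimal sketch. This is not the SGN architecture: the hashed bag-of-words embedding stands in for an LLM-derived embedding, the window features are random placeholders for image-encoder outputs, and the dot-product projection, the gene names, and the descriptions are all hypothetical choices made only for illustration.

```python
import numpy as np

DIM = 16  # assumed size of the shared feature space


def embed_gene_description(text: str, dim: int = DIM) -> np.ndarray:
    """Toy stand-in for an LLM-based text embedding (hashed bag of words)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        # Deterministic bucket index derived from the token's characters.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def predict_expression(window_features: np.ndarray,
                       gene_vectors: np.ndarray) -> np.ndarray:
    """Score every (window, gene) pair with a dot product in feature space.

    window_features: (n_windows, dim); gene_vectors: (n_genes, dim).
    Returns an array of shape (n_windows, n_genes).
    """
    return window_features @ gene_vectors.T


# Placeholder descriptions; in the paper these are queried from an LLM.
descriptions = {
    "GENE_A": "regulates cell cycle progression",
    "GENE_B": "encodes a structural membrane protein",
    "GENE_UNSEEN": "involved in immune signalling",  # not needed at training time
}

gene_vectors = np.stack([embed_gene_description(d) for d in descriptions.values()])

# Random features standing in for encoded slide image windows.
rng = np.random.default_rng(0)
window_features = rng.normal(size=(4, DIM))

predictions = predict_expression(window_features, gene_vectors)
print(predictions.shape)
```

Because the gene enters the model only through its description vector, a gene type never seen during training can be scored by embedding its description and reusing the same projection, which is what makes the prediction zero-shot in this sketch.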