M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images
This work addresses the high acquisition cost of Spatial Transcriptomics (ST) data for cancer research by predicting ST expression from more accessible histopathology images. The contribution is incremental: it builds on existing regression methods and adds a new architectural adaptation.
The authors tackle the problem of predicting Spatial Transcriptomics (ST) gene expression, which is expensive to measure, from digital pathology images. They propose M2ORT, a many-to-one regression Transformer that accommodates the multi-scale hierarchical structure of pathology images and achieves state-of-the-art performance with fewer parameters and FLOPs on three public datasets.
The advancement of Spatial Transcriptomics (ST) has facilitated the spatially-aware profiling of gene expressions based on histopathology images. Although ST data offers valuable insights into the micro-environment of tumors, its acquisition cost remains high. Therefore, directly predicting the ST expressions from digital pathology images is desirable. Current methods usually adopt existing regression backbones for this task, which ignore the inherent multi-scale hierarchical data structure of digital pathology images. To address this limitation, we propose M2ORT, a many-to-one regression Transformer that can accommodate the hierarchical structure of pathology images through a decoupled multi-scale feature extractor. Unlike traditional models that are trained with one-to-one image-label pairs, M2ORT accepts multiple pathology images of different magnifications at a time to jointly predict the gene expressions at their common ST spot, with the aim of learning a many-to-one relationship through training. We have tested M2ORT on three public ST datasets, and the experimental results show that M2ORT achieves state-of-the-art performance with fewer parameters and floating-point operations (FLOPs). The code is available at: https://github.com/Dootmaan/M2ORT/.
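The many-to-one pairing described in the abstract can be illustrated with a minimal Python sketch of how a single training sample might be assembled. It is not taken from the M2ORT repository: the names `Spot`, `crop_box`, and `many_to_one_sample`, the patch size of 224 pixels, and the downsample factors (1, 2, 4) are all assumptions chosen for illustration. The sketch only computes the crop regions; the model itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in base-resolution pixels


@dataclass(frozen=True)
class Spot:
    """Hypothetical ST spot: center in base-resolution pixels plus its gene expression label."""
    x: int
    y: int
    expression: Tuple[float, ...]


def crop_box(cx: int, cy: int, patch_size: int, downsample: int) -> Box:
    """Region covered by a patch_size x patch_size patch read at the given
    downsample factor and centered on (cx, cy).

    A larger downsample factor (lower magnification) covers a wider field of view.
    """
    half = patch_size * downsample // 2
    return (cx - half, cy - half, cx + half, cy + half)


def many_to_one_sample(
    spot: Spot,
    patch_size: int = 224,                      # assumed patch size
    downsamples: Tuple[int, ...] = (1, 2, 4),   # assumed magnification levels
) -> Tuple[List[Box], Tuple[float, ...]]:
    """Pair several crops at different magnifications with ONE expression label.

    A one-to-one pipeline would return a single box per label; here every box
    shares the same center (the ST spot) and the same regression target.
    """
    boxes = [crop_box(spot.x, spot.y, patch_size, d) for d in downsamples]
    return boxes, spot.expression


if __name__ == "__main__":
    spot = Spot(x=1000, y=2000, expression=(0.5, 1.2))
    boxes, target = many_to_one_sample(spot)
    for d, box in zip((1, 2, 4), boxes):
        print(f"downsample {d}: {box}")
    print("shared target:", target)
```

In this sketch each crop would be resized to the same pixel dimensions before being passed to the model, so the inputs differ in field of view rather than in tensor shape. How M2ORT actually samples and encodes the levels should be checked against the released code.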