A Multi-scale Fused Graph Neural Network with Inter-view Contrastive Learning for Spatial Transcriptomics Data Clustering
This work addresses spatial domain clustering in spatial transcriptomics, a problem of interest to researchers in bioinformatics and computational biology. It is an incremental advance over existing methods, improving how multi-scale features are fused.
The paper tackles the identification of spatial domains in spatial transcriptomics data. It proposes stMFG, a multi-scale interactive fusion graph network that integrates spatial and gene features through layer-wise cross-view attention and contrastive learning, and reports up to a 14% improvement in ARI (Adjusted Rand Index) on certain slices of the DLPFC and breast cancer datasets.
Spatial transcriptomics enables genome-wide expression analysis within the native tissue context, yet identifying spatial domains remains challenging because of complex interactions between gene expression and spatial location. Existing methods typically process the spatial and feature views separately and fuse them only at the output level. This "encode-separately, fuse-late" paradigm limits the capture of multi-scale semantics and of cross-view interaction. To address this, we propose stMFG, a multi-scale interactive fusion graph network that introduces layer-wise cross-view attention to dynamically integrate spatial and gene features after each convolution. The model combines cross-view contrastive learning with spatial constraints to enhance discriminability while maintaining spatial continuity. On the DLPFC and breast cancer datasets, stMFG outperforms state-of-the-art methods, achieving up to a 14% ARI improvement on certain slices.
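To make the idea of fusing after each convolution concrete, the following is a minimal NumPy sketch of layer-wise cross-view attention, not the authors' implementation. Everything named here is an assumption made for illustration: the helper names (`normalize_adj`, `gcn_layer`, `cross_view_attention`, `forward`), the tanh-based attention scoring with a learnable query vector `q`, the choice to feed the fused embedding into both branches of the next layer, and the concatenation of per-layer fused embeddings as the multi-scale output. Weights are random, and the contrastive and spatial-constraint losses are omitted.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def cross_view_attention(h_spatial, h_feature, q):
    """Per-spot softmax weights over the two views (assumed scoring form)."""
    scores = np.stack([np.tanh(h_spatial) @ q, np.tanh(h_feature) @ q], axis=1)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)      # shape (n_spots, 2)
    fused = alpha[:, :1] * h_spatial + alpha[:, 1:] * h_feature
    return fused, alpha

def forward(x, adj_spatial, adj_feature, weights, queries):
    """Fuse the two views after every layer and keep each scale."""
    a_s, a_f = normalize_adj(adj_spatial), normalize_adj(adj_feature)
    h_s = h_f = x
    fused_per_layer = []
    for (w_s, w_f), q in zip(weights, queries):
        h_s = gcn_layer(a_s, h_s, w_s)   # spatial-graph branch
        h_f = gcn_layer(a_f, h_f, w_f)   # feature-graph branch
        fused, _ = cross_view_attention(h_s, h_f, q)
        fused_per_layer.append(fused)
        h_s = h_f = fused                # assumption: both branches continue from the fused state
    return np.concatenate(fused_per_layer, axis=1)

def random_symmetric_adj(rng, n, p=0.4):
    """Toy undirected graph used only for the demonstration below."""
    upper = np.triu(rng.random((n, n)) < p, 1).astype(float)
    return upper + upper.T

rng = np.random.default_rng(0)
n_spots, n_genes, dims = 6, 4, [3, 2]
x = rng.normal(size=(n_spots, n_genes))
adj_spatial = random_symmetric_adj(rng, n_spots)
adj_feature = random_symmetric_adj(rng, n_spots)
in_dims = [n_genes] + dims[:-1]
weights = [(rng.normal(size=(i, o)), rng.normal(size=(i, o))) for i, o in zip(in_dims, dims)]
queries = [rng.normal(size=o) for o in dims]
z = forward(x, adj_spatial, adj_feature, weights, queries)
print(z.shape)
```

In contrast to a fuse-late design, which would call the attention step once on the final layer's outputs, this sketch applies it inside the loop, so each layer's fused embedding contributes to the final representation that a downstream clustering step would consume.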