RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency
This work addresses a domain-specific problem in computational pathology: improving cross-modal alignment between histology images and gene expression, for the benefit of researchers and clinicians. It appears incremental, however, since it builds on existing alignment methods.
The paper tackles the challenge of aligning spatial transcriptomics data with histology images by proposing a framework that combines a ranking-based alignment loss with self-supervised knowledge distillation. It reports improved alignment and predictive performance across seven public datasets.
Spatial transcriptomics (ST) provides essential spatial context by mapping gene expression within tissue, enabling detailed study of cellular heterogeneity and tissue organization. However, aligning ST data with histology images poses challenges due to inherent spatial distortions and modality-specific variations. Existing methods largely rely on direct alignment, which often fails to capture complex cross-modal relationships. To address these limitations, we propose a novel framework that aligns gene and image features using a ranking-based alignment loss, preserving relative similarity across modalities and enabling robust multi-scale alignment. To further enhance the alignment's stability, we employ self-supervised knowledge distillation with a teacher-student network architecture, effectively mitigating disruptions from high dimensionality, sparsity, and noise in gene expression data. Extensive experiments on seven public datasets that encompass gene expression prediction, slide-level classification, and survival analysis demonstrate the efficacy of our method, showing improved alignment and predictive performance over existing methods.
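To make the idea of "preserving relative similarity across modalities" concrete, the following is a minimal sketch of one possible ranking-consistency loss. It is not the formulation used in the paper, which the abstract does not specify. The sketch assumes paired gene and image feature vectors for the same spots, cosine similarity as the similarity measure, and a hinge penalty over triplets; the function names, the `margin` parameter, and the hinge form are all illustrative assumptions.

```python
import numpy as np


def cosine_similarity_matrix(x, eps=1e-12):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)
    unit = x / norms
    return unit @ unit.T


def ranking_consistency_loss(gene_feats, image_feats, margin=0.0):
    """Hypothetical triplet-hinge ranking loss (an assumption, not the paper's loss).

    For every anchor i and every pair (j, k) such that sample j is more
    similar to i than sample k is in gene space, penalize the image space
    when it does not rank j above k by at least `margin`.
    """
    sim_gene = cosine_similarity_matrix(gene_feats)
    sim_image = cosine_similarity_matrix(image_feats)
    n = sim_gene.shape[0]

    # diff[i, j, k] = sim[i, j] - sim[i, k]
    diff_gene = sim_gene[:, :, None] - sim_gene[:, None, :]
    diff_image = sim_image[:, :, None] - sim_image[:, None, :]

    # Keep triplets where j and k differ from the anchor i and where the
    # gene-space ordering is strict (which also excludes j == k).
    idx = np.arange(n)
    not_anchor = idx[None, :] != idx[:, None]
    valid = not_anchor[:, :, None] & not_anchor[:, None, :] & (diff_gene > 0)

    if not valid.any():
        return 0.0

    # Hinge: zero when the image space agrees with the gene ordering.
    hinge = np.maximum(0.0, margin - diff_image)
    return float(hinge[valid].mean())


# Toy usage: three spots with 2-D features in each modality.
gene = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
image_agree = gene.copy()  # same similarity ordering
image_disagree = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]])  # spots 1 and 2 swapped

loss_agree = ranking_consistency_loss(gene, image_agree)
loss_disagree = ranking_consistency_loss(gene, image_disagree)
```

With `margin=0.0`, the loss is zero whenever the image features induce the same similarity ordering as the gene features, and positive when the ordering is violated. Because only the ordering is constrained, the two modalities need not produce numerically equal similarities, which is the intuition behind preferring a ranking objective over direct alignment.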