CV · Feb 24

Momentum Memory for Knowledge Distillation in Computational Pathology

arXiv:2602.21395v1 · 4 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses a critical bottleneck in the clinical translation of multimodal cancer diagnosis by enabling accurate histology-only inference, though the advance is incremental because it builds on existing knowledge distillation methods.

The paper addresses the limited availability of paired histology-genomics data in computational pathology by proposing Momentum Memory Knowledge Distillation (MoMKD). The authors report that MoMKD outperforms MIL and multimodal KD baselines on the TCGA-BRCA benchmark (HER2, PR, and ODX classification) and on an independent in-house test set, with inference from histology alone.

Multimodal learning that integrates genomics and histopathology has shown strong potential in cancer diagnosis, yet its clinical translation is hindered by the limited availability of paired histology-genomics data. Knowledge distillation (KD) offers a practical solution by transferring genomic supervision into histopathology models, enabling accurate inference using histology alone. However, existing KD methods rely on batch-local alignment, which introduces instability due to limited within-batch comparisons and ultimately degrades performance. To address these limitations, we propose Momentum Memory Knowledge Distillation (MoMKD), a cross-modal distillation framework driven by a momentum-updated memory. This memory aggregates genomic and histopathology information across batches, effectively enlarging the supervisory context available to each mini-batch. Furthermore, we decouple the gradients of the genomics and histology branches, preventing genomic signals from dominating histology feature learning during training and eliminating the modality-gap issue at inference time. Extensive experiments on the TCGA-BRCA benchmark (HER2, PR, and ODX classification tasks) and an independent in-house testing dataset demonstrate that MoMKD consistently outperforms state-of-the-art MIL and multimodal KD baselines, delivering strong performance and generalization under histology-only inference. Overall, MoMKD establishes a robust and generalizable knowledge distillation paradigm for computational pathology.
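The abstract does not state the memory's update rule, so the following is only a minimal sketch of what a momentum-updated memory could look like, assuming a generic exponential moving average over per-class batch means. The class name `MomentumMemory`, the per-label slots, and the `momentum` parameter are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


class MomentumMemory:
    """Hypothetical per-class feature memory updated by an exponential
    moving average (EMA). An illustration only, not the MoMKD implementation."""

    def __init__(self, dim: int, momentum: float = 0.9) -> None:
        if not 0.0 <= momentum < 1.0:
            raise ValueError("momentum must be in [0, 1)")
        self.dim = dim
        self.momentum = momentum
        self.slots: Dict[int, List[float]] = {}

    def update(self, features: Sequence[Sequence[float]],
               labels: Sequence[int]) -> None:
        """Fold one mini-batch into the memory, one slot per label."""
        sums: Dict[int, List[float]] = {}
        counts: Dict[int, int] = {}
        for feat, label in zip(features, labels):
            if len(feat) != self.dim:
                raise ValueError("feature has wrong dimensionality")
            acc = sums.setdefault(label, [0.0] * self.dim)
            for i, value in enumerate(feat):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1

        for label, acc in sums.items():
            batch_mean = [value / counts[label] for value in acc]
            if label not in self.slots:
                # First time this label is seen: initialise from the batch.
                self.slots[label] = batch_mean
            else:
                # EMA step: old memory decays, the new batch mean blends in.
                m = self.momentum
                self.slots[label] = [
                    m * old + (1.0 - m) * new
                    for old, new in zip(self.slots[label], batch_mean)
                ]

    def read(self, label: int) -> List[float]:
        """Return a copy of the stored vector for a label.

        In an autograd framework the memory would typically be treated as a
        constant (stop-gradient) when used as a distillation target; that
        detail is an assumption here, not something the abstract specifies."""
        return list(self.slots[label])


memory = MomentumMemory(dim=2, momentum=0.5)
memory.update([[1.0, 0.0], [3.0, 0.0]], [0, 0])  # slot 0 starts at the batch mean
memory.update([[0.0, 4.0]], [0])                 # second batch is blended in
print(memory.read(0))
```

Because each slot carries information from earlier batches, a mini-batch compared against the memory sees a wider context than its own samples, which is the intuition the abstract gives for moving beyond batch-local alignment.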

