CV, Oct 16, 2025

Fourier Transform Multiple Instance Learning for Whole Slide Image Classification

arXiv:2510.15138v2, 1 citation, h-index: 3, J med imaging
Originality: Incremental advance
AI Analysis

This work addresses a limitation of existing Multiple Instance Learning (MIL) methods in computational pathology: they model global context poorly, which hampers robust diagnostic prediction. It represents an incremental advance.

The paper tackles the problem of capturing global dependencies in Whole Slide Image classification by proposing Fourier Transform Multiple Instance Learning (FFT-MIL), which fuses frequency-domain features with spatial patch features. Across six MIL methods and three public datasets (BRACS, LUAD, and IMP), this yields average improvements of 3.51% in macro F1 and 1.51% in AUC.

Whole Slide Image (WSI) classification relies on Multiple Instance Learning (MIL) with spatial patch features, yet existing methods struggle to capture global dependencies due to the immense size of WSIs and the local nature of patch embeddings. This limitation hinders the modeling of coarse structures essential for robust diagnostic prediction. We propose Fourier Transform Multiple Instance Learning (FFT-MIL), a framework that augments MIL with a frequency-domain branch to provide compact global context. Low-frequency crops are extracted from WSIs via the Fast Fourier Transform and processed through a modular FFT-Block composed of convolutional layers and Min-Max normalization to mitigate the high variance of frequency data. The learned global frequency feature is fused with spatial patch features through lightweight integration strategies, enabling compatibility with diverse MIL architectures. FFT-MIL was evaluated across six state-of-the-art MIL methods on three public datasets (BRACS, LUAD, and IMP). Integration of the FFT-Block improved macro F1 scores by an average of 3.51% and AUC by 1.51%, demonstrating consistent gains across architectures and datasets. These results establish frequency-domain learning as an effective and efficient mechanism for capturing global dependencies in WSI classification, complementing spatial features and advancing the scalability and accuracy of MIL-based computational pathology.
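
The frequency branch described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the use of the log-magnitude spectrum, the crop size, and concatenation as the fusion strategy are all assumptions made for illustration. The abstract states only that low-frequency crops are taken via the FFT, that Min-Max normalization is applied, and that the global feature is fused with patch features. The convolutional layers of the FFT-Block are omitted here.

```python
import numpy as np


def low_frequency_crop(image: np.ndarray, crop_size: int) -> np.ndarray:
    """Central (low-frequency) crop of the centred log-magnitude spectrum.

    Assumption: a 2-D greyscale array and a log-magnitude representation;
    the paper may use a different input or spectrum encoding.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D greyscale array")
    height, width = image.shape
    if crop_size > min(height, width):
        raise ValueError("crop_size exceeds image dimensions")
    # fftshift moves the zero-frequency (DC) term to index (H // 2, W // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    top = height // 2 - crop_size // 2
    left = width // 2 - crop_size // 2
    crop = spectrum[top:top + crop_size, left:left + crop_size]
    # log1p compresses the very large dynamic range of the magnitudes.
    return np.log1p(np.abs(crop))


def min_max_normalize(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Rescale values to roughly [0, 1]; eps guards against a constant input."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo + eps)


def fuse_by_concatenation(patch_features: np.ndarray,
                          global_feature: np.ndarray) -> np.ndarray:
    """Append one global vector to every patch embedding (hypothetical fusion)."""
    num_patches = patch_features.shape[0]
    tiled = np.broadcast_to(global_feature, (num_patches, global_feature.shape[0]))
    return np.concatenate([patch_features, tiled], axis=1)


# Toy usage with random data standing in for a downsampled slide thumbnail
# and for patch embeddings from a pretrained encoder.
rng = np.random.default_rng(0)
thumbnail = rng.random((64, 64))
crop = min_max_normalize(low_frequency_crop(thumbnail, crop_size=8))
patches = rng.random((5, 16))
fused = fuse_by_concatenation(patches, crop.ravel())
```

Keeping only the centre of the shifted spectrum retains the coarse, slowly varying structure of the slide in a small fixed-size array, which is what makes the global feature compact regardless of slide size.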

