CV
Nov 10, 2025

Learning from the Right Patches: A Two-Stage Wavelet-Driven Masked Autoencoder for Histopathology Representation Learning

arXiv:2511.06958v2
h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning meaningful representations from whole-slide images in digital pathology. The advance is incremental: it adapts an existing method, masked autoencoder pretraining, with domain-specific improvements.

The paper tackles the problem of irrelevant or noisy regions in masked autoencoder pretraining for histopathology by introducing a wavelet-informed patch selection strategy. It reports competitive representation quality and downstream classification performance across multiple cancer datasets while maintaining efficiency.

Whole-slide images are central to digital pathology, yet their extreme size and scarce annotations make self-supervised learning essential. Masked Autoencoders (MAEs) with Vision Transformer backbones have recently shown strong potential for histopathology representation learning. However, conventional random patch sampling during MAE pretraining often includes irrelevant or noisy regions, limiting the model's ability to capture meaningful tissue patterns. In this paper, we present WISE-MAE, a lightweight and domain-adapted framework that brings structure and biological relevance into MAE-based learning through a wavelet-informed patch selection strategy. WISE-MAE applies a two-step coarse-to-fine process: wavelet-based screening at low magnification to locate structurally rich regions, followed by high-resolution extraction for detailed modeling. This approach mirrors the diagnostic workflow of pathologists and improves the quality of learned representations. Evaluations across multiple cancer datasets, including lung, renal, and colorectal tissues, show that WISE-MAE achieves competitive representation quality and downstream classification performance while maintaining efficiency under weak supervision.
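
The two-step coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which wavelet, scoring rule, tile size, or selection count the paper uses. The sketch assumes a single-level Haar transform, a mean detail-energy score per tile, non-overlapping tiles, and a top-k cutoff. The names `haar_detail_energy`, `select_regions`, and `to_high_res` are hypothetical, and the code uses only the standard library so that it runs without extra dependencies.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def haar_detail_energy(tile: Image) -> float:
    """Mean energy of the single-level 2D Haar detail coefficients of a tile.

    Flat regions (background, blur) score near zero; textured regions score high.
    Assumption: this score stands in for "structurally rich".
    """
    h, w = len(tile), len(tile[0])
    energy = 0.0
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            a, b = tile[i][j], tile[i][j + 1]
            c, d = tile[i + 1][j], tile[i + 1][j + 1]
            d_rows = (a + b - c - d) / 2.0  # top row vs. bottom row
            d_cols = (a - b + c - d) / 2.0  # left column vs. right column
            d_diag = (a - b - c + d) / 2.0  # diagonal detail
            energy += d_rows ** 2 + d_cols ** 2 + d_diag ** 2
    n_blocks = (h // 2) * (w // 2)
    return energy / n_blocks if n_blocks else 0.0


def select_regions(thumb: Image, tile: int, top_k: int) -> List[Tuple[int, int]]:
    """Stage 1: score non-overlapping tiles of a low-magnification thumbnail
    and return the (row, col) origins of the top_k highest-scoring tiles."""
    h, w = len(thumb), len(thumb[0])
    scored = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            block = [row[c:c + tile] for row in thumb[r:r + tile]]
            scored.append((haar_detail_energy(block), (r, c)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [origin for _, origin in scored[:top_k]]


def to_high_res(regions: List[Tuple[int, int]], scale: int) -> List[Tuple[int, int]]:
    """Stage 2: map thumbnail coordinates to high-resolution pixel coordinates,
    where patches would then be extracted for MAE pretraining."""
    return [(r * scale, c * scale) for r, c in regions]
```

In this sketch, `scale` is the assumed ratio between the high-resolution level and the thumbnail level of the slide. Reading the actual high-resolution pixels at the returned coordinates is left to whatever slide reader is in use.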
