CVJul 16, 2024

cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process

Yihang Chen, Tsai Hor Chan, Guosheng Yin, Yuming Jiang, Lequan Yu

arXiv:2407.11448v28.78 citationsh-index: 6Has Code

Originality Highly original

AI Analysis

This work addresses a domain-specific problem in medical imaging for histopathology analysis, offering a novel method to enhance accuracy and robustness.

The paper tackles the problem of biased slide-level representations and overfitting in multiple instance learning for whole slide histopathology image analysis by proposing a Bayesian nonparametric framework using cascaded Dirichlet processes, resulting in superior performance validated on five benchmarks with improved generalizability and uncertainty estimation.

Multiple instance learning (MIL) has been extensively applied to whole slide histopathology image (WSI) analysis. The existing aggregation strategy in MIL, which primarily relies on the first-order distance (e.g., mean difference) between instances, fails to accurately approximate the true feature distribution of each instance, leading to biased slide-level representations. Moreover, the scarcity of WSI observations easily leads to model overfitting, resulting in unstable testing performance and limited generalizability. To tackle these challenges, we propose a new Bayesian nonparametric framework for multiple instance learning, which adopts a cascade of Dirichlet processes (cDP) to incorporate the instance-to-bag characteristic of the WSIs. We perform feature aggregation based on the latent clusters formed by the Dirichlet process, which incorporates the covariances of the patch features and forms more representative clusters. We then perform bag-level prediction with another Dirichlet process model on the bags, which imposes a natural regularization on learning to prevent overfitting and enhance generalizability. Moreover, as a Bayesian nonparametric method, the cDP model can accurately generate posterior uncertainty, which allows for the detection of outlier samples and tumor localization. Extensive experiments on five WSI benchmarks validate the superior performance of our method, as well as its generalizability and ability to estimate uncertainties. Codes are available at https://github.com/HKU-MedAI/cDPMIL.

View on arXiv PDF Code

Similar