CV LG Aug 16, 2021

Improving Self-supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology

arXiv:2108.07183v29.426 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of domain shift and limited labeled data in digital pathology, offering an incremental improvement to SSL methods for medical image analysis.

The paper aims to improve self-supervised learning (SSL) representations so that they generalize better in digital pathology, particularly when labeled data is scarce or domain shift occurs. It proposes a hardness-aware dynamic curriculum learning (HaDCL) approach that enhances pretrained representations by presenting progressively harder examples during fine-tuning, yielding minimum AUC improvements of 1.7% on in-domain and 2.2% on out-of-domain data.

Self-supervised learning (SSL) has recently shown tremendous potential to learn generic visual representations useful for many image analysis tasks. Despite their notable success, existing SSL methods fail to generalize to downstream tasks when the number of labeled training instances is small or when the domain shift between the transfer domains is significant. In this paper, we attempt to improve self-supervised pretrained representations through the lens of curriculum learning by proposing a hardness-aware dynamic curriculum learning (HaDCL) approach. To improve the robustness and generalizability of SSL, we dynamically leverage progressively harder examples via easy-to-hard and hard-to-very-hard samples during mini-batch downstream fine-tuning. We discover that with progressive stage-wise curriculum learning, the pretrained representations are significantly enhanced and become adaptable to both in-domain and out-of-domain distribution data. We performed extensive validation on three histology benchmark datasets on both patch-wise and slide-level classification problems. Our curriculum-based fine-tuning yields a significant improvement over standard fine-tuning, with a minimum improvement in area-under-the-curve (AUC) score of 1.7% and 2.2% on in-domain and out-of-domain distribution data, respectively. Further, we empirically show that our approach is generic, adaptable to any SSL method, and imposes no additional overhead. We also outline the role of patch-based versus slide-based curriculum learning in histopathology to provide practical insights into the success of curriculum-based fine-tuning of SSL methods. Code is released at https://github.com/srinidhiPY/ICCV-CDPATH2021-ID-8
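
The abstract does not spell out how hardness is measured or scheduled, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that hardness is the per-sample loss within a mini-batch, that an early "easy-to-hard" stage uses every sample, and that a later "hard-to-very-hard" stage keeps a shrinking fraction of the highest-loss samples. The names `hard_fraction`, `select_hard_examples`, `curriculum_loss`, and the parameters `start_epoch` and `min_fraction` are hypothetical.

```python
import math


def hard_fraction(epoch, start_epoch, total_epochs, min_fraction=0.25):
    """Hypothetical schedule: fraction of each mini-batch kept for the update.

    Before start_epoch (assumed easy-to-hard stage) all samples are kept.
    Afterwards (assumed hard-to-very-hard stage) the fraction decays
    linearly to min_fraction at the last epoch.
    """
    if epoch < start_epoch:
        return 1.0
    span = max(1, total_epochs - 1 - start_epoch)
    progress = min(1.0, (epoch - start_epoch) / span)
    return 1.0 - (1.0 - min_fraction) * progress


def select_hard_examples(losses, fraction):
    """Return indices of the highest-loss samples (assumed hardness proxy)."""
    k = max(1, math.ceil(fraction * len(losses)))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:k])


def curriculum_loss(losses, epoch, start_epoch, total_epochs, min_fraction=0.25):
    """Mean loss over the samples kept by the curriculum at this epoch."""
    fraction = hard_fraction(epoch, start_epoch, total_epochs, min_fraction)
    kept = select_hard_examples(losses, fraction)
    return sum(losses[i] for i in kept) / len(kept)
```

In a real fine-tuning loop, `losses` would be the unreduced per-sample losses of the current mini-batch, and only the value returned by `curriculum_loss` would be backpropagated. The released code linked above is the authoritative reference for the actual selection rule and schedule.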
