Data Efficient Contrastive Learning in Histopathology using Active Sampling
This work addresses data efficiency for histopathology diagnostics, offering a domain-specific incremental improvement.
The paper tackled the problem of high data and time requirements in self-supervised learning for histopathology by proposing an active sampling method, which reduced sample needs by 93% and training time by 62% while maintaining performance.
Deep learning (DL)-based diagnostic systems can provide accurate and robust quantitative analysis in digital pathology. These algorithms require large amounts of annotated training data, which are impractical to obtain in pathology because of the high resolution of histopathological images. Hence, self-supervised methods have been proposed to learn features using ad-hoc pretext tasks. Self-supervised training relies on a large unlabeled dataset, which makes the learning process time-consuming. In this work, we propose a new method for actively sampling informative members from the training set using a small proxy network, decreasing the sample requirement by 93% and training time by 62% while matching the performance of the traditional self-supervised learning method. The code is available at https://github.com/Reasat/data_efficient_cl
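The abstract does not spell out how the proxy network ranks samples, so the following is only a minimal sketch of one plausible form of active sampling, not the authors' implementation. It assumes that the proxy network has already produced embeddings for two augmented views of each image, that "informative" means a high per-sample contrastive loss under the proxy, and that the retained fraction is 7% (the complement of the reported 93% reduction). The function names `per_sample_contrastive_loss` and `select_informative`, the simplified InfoNCE-style loss, the temperature, and the random stand-in embeddings are all hypothetical.

```python
import numpy as np

def per_sample_contrastive_loss(z1, z2, temperature=0.5):
    """Simplified InfoNCE-style loss per sample (assumed scoring rule).

    z1, z2: (n, d) proxy-network embeddings of two views of the same n images.
    """
    # L2-normalise so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # (n, n); the diagonal holds positive pairs
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_prob)

def select_informative(scores, keep_fraction):
    """Return indices of the highest-scoring samples (assumed selection rule)."""
    n_keep = max(1, int(round(keep_fraction * len(scores))))
    return np.argsort(scores)[::-1][:n_keep]

# Random vectors stand in for proxy-network embeddings (illustration only).
rng = np.random.default_rng(0)
view_a = rng.normal(size=(1000, 32))
view_b = view_a + 0.1 * rng.normal(size=(1000, 32))

scores = per_sample_contrastive_loss(view_a, view_b)
selected = select_informative(scores, keep_fraction=0.07)
print(len(selected))  # number of samples kept for the main training run
```

Under these assumptions, only the selected indices would be passed on to the full self-supervised training run, which is where a saving in training time would come from. Other scoring rules, such as embedding diversity or uncertainty, would fit the same structure by swapping out the scoring function.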