Distilling foundation models for robust and efficient models in digital pathology
This work addresses the problem of high computational costs for researchers and practitioners in digital pathology, offering a lightweight and robust solution, though it is incremental as it applies existing distillation techniques to a specific domain.
The authors tackled the computational inefficiency of large foundation models in digital pathology by distilling them into a smaller model, achieving nearly comparable performance with significantly reduced inference cost, including 3rd place on the HEST benchmark and 5th on the EVA benchmark.
In recent years, the advent of foundation models (FMs) for digital pathology has relied heavily on scaling both the pre-training datasets and the model size, yielding large and powerful models. While this scaling has improved performance on diverse downstream tasks, it has also increased computational cost and inference time. In this work, we explore the distillation of a large foundation model into a smaller one, reducing the number of parameters by several orders of magnitude. The resulting distilled model, H0-mini, achieves nearly comparable performance to large FMs at a significantly reduced inference cost. It is evaluated on several public benchmarks, achieving 3rd place on the HEST benchmark and 5th place on the EVA benchmark. Additionally, a robustness analysis conducted on the PLISM dataset demonstrates that our distilled model achieves excellent robustness to variations in staining and scanning conditions, significantly outperforming other state-of-the-art models. This opens new perspectives for designing lightweight and robust models for digital pathology without compromising on performance.
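As an illustration of the general idea of distillation, the following is a minimal sketch of a soft-target cross-entropy loss between a teacher and a student, written with NumPy. It is not taken from this work: the abstract does not specify the training objective, so the function names (`log_softmax`, `distillation_loss`), the temperature values, and the use of plain logits as inputs are all assumptions made for illustration.

```python
import numpy as np


def log_softmax(logits, temperature):
    """Numerically stable log-softmax over the last axis, with temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student distributions (hypothetical sketch).

    Both inputs have shape (batch, dim). The teacher is treated as a fixed
    target; only the student would receive gradients in a real training loop.
    The default temperatures are illustrative assumptions, not values from the source.
    """
    p_teacher = np.exp(log_softmax(teacher_logits, teacher_temp))
    log_p_student = log_softmax(student_logits, student_temp)
    # Mean over the batch of -sum_k p_teacher[k] * log p_student[k]
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_out = rng.normal(size=(4, 8))  # stand-in for large-model outputs
    student_out = rng.normal(size=(4, 8))  # stand-in for small-model outputs
    print(distillation_loss(student_out, teacher_out))
```

With equal temperatures, this loss is minimized when the student reproduces the teacher's output distribution, which is the sense in which a smaller model is trained to mimic a larger one.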