Patch Stitching Data Augmentation for Cancer Classification in Pathology Images
This work addresses data limitations in cancer classification for pathology images, though the contribution is incremental because it builds on existing augmentation methods.
The paper tackles data scarcity and imbalance in computational pathology by introducing a data augmentation strategy that generates new pathology images from existing ones, achieving improved classification results on two colorectal cancer datasets.
Computational pathology, which integrates computational methods and digital imaging, has been shown to be effective in advancing disease diagnosis and prognosis. In recent years, advances in machine learning and deep learning have greatly strengthened computational pathology. However, data scarcity and data imbalance remain open issues, and both can have an adverse effect on any computational method. In this paper, we introduce an efficient and effective data augmentation strategy that generates new pathology images from existing ones and thus enriches datasets without additional data collection or annotation costs. To evaluate the proposed method, we employed two colorectal cancer datasets and obtained improved classification results, suggesting that this simple approach has the potential to alleviate data scarcity and imbalance in computational pathology.
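The abstract does not describe the stitching procedure itself. As a rough illustration of the general idea named in the title, the sketch below builds one new image by tiling patches cropped at random from existing images of the same class. Every detail here is an assumption rather than the authors' method: the function name `stitch_patches`, the `grid` parameter, the uniform random cropping, and the restriction to same-class, same-size source images are all hypothetical.

```python
import numpy as np

def stitch_patches(images, grid=2, rng=None):
    """Tile randomly cropped patches from same-class images into one new image.

    Hypothetical sketch, not the paper's algorithm.
    images: list of H x W x C arrays with identical shape and class label.
    grid:   number of tiles per side (assumed parameter).
    rng:    seed or numpy Generator for reproducibility.
    """
    rng = np.random.default_rng(rng)
    h, w, c = images[0].shape
    ph, pw = h // grid, w // grid  # tile height and width
    out = np.empty((ph * grid, pw * grid, c), dtype=images[0].dtype)
    for i in range(grid):
        for j in range(grid):
            # Pick a source image, then a random crop location inside it.
            src = images[rng.integers(len(images))]
            top = rng.integers(0, h - ph + 1)
            left = rng.integers(0, w - pw + 1)
            out[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = \
                src[top:top + ph, left:left + pw]
    return out
```

Because all tiles come from images sharing one label, the stitched image can reuse that label, which is how such a scheme would avoid extra annotation cost. Calling `stitch_patches(minority_class_images, grid=2, rng=0)` repeatedly with different seeds would be one way to oversample an under-represented class.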