CV · May 17

Deep learning-based compression of giga-resolution whole slide images

Maren Høibø, Etienne Gaucher, Ingerid Reinertsen, Marit Valla, Erik Smistad

arXiv:2605.17668 · 17.3

Predicted impact: top 92% in CV · last 90 days
Originality: Incremental advance

AI Analysis

For digital pathology, this work addresses storage challenges of large WSIs by demonstrating that deep learning compression can significantly reduce file sizes while maintaining high image quality, though with higher decompression times.

This study applied deep learning-based tissue segmentation and compression to reduce whole slide image (WSI) file sizes, achieving 44-80% total size reduction compared to JPEG, with deep learning compression alone reducing size by 43-72%. On tissue patches, deep learning models saved ~35-40% per patch with SSIM >0.95, outperforming JPEG-XL (17%) and JPEG-2000 (14%).
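The SSIM figures quoted above measure structural similarity between an original patch and its decompressed version. As a rough illustration only, the sketch below computes a simplified, single-window ("global") SSIM over flat lists of grayscale pixel values. This is an assumption made for brevity: standard SSIM averages the same formula over local windows, and the source does not say which implementation or settings the authors used. The function name `global_ssim` and the sample pixel values are hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM computed over the whole image as one window.

    x, y: equal-length sequences of grayscale pixel values.
    data_range: dynamic range of the pixel values (255 for 8-bit images).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical pixel values standing in for an original and a decompressed patch.
original = [10, 50, 90, 130, 170, 210]
distorted = [12, 47, 93, 128, 172, 205]

score_same = global_ssim(original, original)   # identical inputs give 1.0
score_dist = global_ssim(original, distorted)  # slightly below 1.0
```

A score of 1.0 means the two patches are identical under this measure, so a threshold such as 0.95 bounds how much structural distortion a codec is allowed to introduce.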

Implementing digital pathology increases the number of whole slide images (WSIs) that must be stored, and their large size is a challenge. Today, WSIs are compressed with codecs such as JPEG, which still results in several gigabytes per WSI, and large amounts of space are wasted on storing glass. In this study, deep learning-based tissue segmentation for glass removal and deep learning-based compression methods were explored and compared with JPEG, JPEG-2000 and JPEG-XL. Image pyramids (N=21) with intact glass, with glass replaced by single-colored pixels, and with glass replaced by zero-byte tiles were created and compressed with JPEG, JPEG-XL and a deep learning model. Additionally, several compression models were evaluated on a tissue patch dataset and compared with JPEG, JPEG-2000 and JPEG-XL. Removing glass reduced file sizes considerably for JPEG and JPEG-XL. Deep learning-based image compression reduced the WSI size by 43-72% compared to JPEG compression, whereas deep learning-based glass removal reduced the WSI size by 0.3-33% when glass was replaced by single-colored pixels and by 6-62% when all-glass tiles were removed. Combining the two gave only a small further improvement, to a 44-80% total size reduction, which indicates that deep learning-based image compression is able to compress glass tiles efficiently, whereas JPEG is not. On the tissue patch dataset, the best deep learning-based compression models saved on average ~35-40% per patch compared to JPEG while keeping an average SSIM above 0.95, whereas JPEG-XL and JPEG-2000 saved 17% and 14%, respectively, while keeping an SSIM of 0.96. However, the deep learning models had higher decompression times than JPEG and JPEG-XL.
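The idea of replacing all-glass tiles with zero-byte tiles, and of reporting the result as a percentage size reduction, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Tile` class, the `tissue_fraction` field (standing in for the output of a tissue segmentation model), the `min_tissue` threshold, and the byte counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    data: bytes             # compressed payload of one pyramid tile (hypothetical)
    tissue_fraction: float  # share of pixels a segmentation model marked as tissue


def drop_glass_tiles(tiles, min_tissue=0.0):
    """Return tile payloads, with all-glass tiles replaced by zero-byte payloads."""
    return [t.data if t.tissue_fraction > min_tissue else b"" for t in tiles]


def size_reduction_percent(baseline_bytes, new_bytes):
    """Relative size reduction of new_bytes versus baseline_bytes, in percent."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * (baseline_bytes - new_bytes) / baseline_bytes


# Four made-up tiles: two contain tissue, two are pure glass.
tiles = [
    Tile(b"x" * 400, 0.8),
    Tile(b"x" * 100, 0.0),
    Tile(b"x" * 300, 0.2),
    Tile(b"x" * 200, 0.0),
]

baseline = sum(len(t.data) for t in tiles)        # 1000 bytes with glass kept
kept = drop_glass_tiles(tiles)
reduced = sum(len(payload) for payload in kept)   # 700 bytes after glass removal
saving = size_reduction_percent(baseline, reduced)
print(f"size reduction: {saving:.1f}%")           # prints "size reduction: 30.0%"
```

In this toy setup the saving depends entirely on how many bytes the codec spends on glass tiles, which is consistent with the abstract's observation that glass removal helps JPEG considerably but adds little on top of a codec that already compresses glass efficiently.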
