Inter-Species Cell Detection: Datasets on pulmonary hemosiderophages in equine, human and feline specimens
This work addresses the need for high-quality annotated datasets for inter-species cell detection in the diagnosis of pulmonary hemorrhage. Its contribution is incremental, since it builds on existing annotation and deep learning methods.
The researchers tackled the detection of pulmonary hemosiderophages across species by introducing a novel multi-species dataset of 74 cytology whole slide images. It contains 297,383 annotated cells classified into five grades, making it one of the largest publicly available datasets in this domain.
Pulmonary hemorrhage (P-Hem) occurs among multiple species and can have various causes. Cytology of bronchoalveolar lavage fluid (BALF) using a 5-tier scoring system of alveolar macrophages based on their hemosiderin content is considered the most sensitive diagnostic method. We introduce a novel, fully annotated multi-species P-Hem dataset which consists of 74 cytology whole slide images (WSIs) with equine, feline and human samples. To create this high-quality and high-quantity dataset, we developed an annotation pipeline combining human expertise with deep learning and data visualisation techniques. We applied a deep learning-based object detection approach, trained on 17 expertly annotated equine WSIs, to the remaining 39 equine, 12 human and 7 feline WSIs. The resulting annotations were semi-automatically screened for errors on multiple types of specialised annotation maps and finally reviewed by a trained pathologist. Our dataset contains a total of 297,383 hemosiderophages classified into five grades. It is one of the largest publicly available WSI datasets with respect to the number of annotations, the scanned area and the number of species covered.
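To make the five-grade labelling concrete, the following minimal Python sketch tallies per-cell grades for one slide and summarises them as a mean grade. It rests on assumptions not stated in the abstract: the grades are taken to be integers 0 to 4, the input is a plain list of per-cell grades, and the function names `grade_histogram` and `mean_grade` are hypothetical. The mean grade is only an illustrative summary and is not claimed to be the score used by the authors.

```python
from collections import Counter

# Assumption: the five grades are encoded as the integers 0..4.
GRADES = range(5)


def grade_histogram(cell_grades):
    """Count how many annotated cells fall into each of the five grades."""
    counts = Counter(cell_grades)
    unknown = set(counts) - set(GRADES)
    if unknown:
        raise ValueError(f"unexpected grade labels: {sorted(unknown)}")
    return [counts.get(g, 0) for g in GRADES]


def mean_grade(cell_grades):
    """Mean grade over all annotated cells of one slide (illustrative only)."""
    hist = grade_histogram(cell_grades)
    total = sum(hist)
    if total == 0:
        raise ValueError("no annotated cells")
    return sum(g * n for g, n in zip(GRADES, hist)) / total


if __name__ == "__main__":
    # Hypothetical grades for seven cells on one slide.
    example = [0, 0, 1, 2, 2, 2, 4]
    print(grade_histogram(example))
    print(round(mean_grade(example), 3))
```

A per-slide histogram of this kind can also be compared across species or annotators to spot slides whose grade distribution looks unusual.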