Inference of captions from histopathological patches
This work addresses the need for more efficient diagnostic workflows in computational histopathology. Its contribution is incremental: it pairs existing methods with a new dataset.
The study addresses the automatic generation of diagnostic reports from histopathological images. It compiles a dataset of 262K captioned patches from stomach adenocarcinoma specimens and trains a baseline attention-based model on it, with promising results.
Computational histopathology has made significant strides in the past few years and is slowly getting closer to clinical adoption. One area that would benefit is the automatic generation of diagnostic reports from H\&E-stained whole slide images, which would further increase the efficiency of pathologists' routine diagnostic workflows. In this study, we compiled a dataset (PatchGastricADC22) of histopathological captions for stomach adenocarcinoma endoscopic biopsy specimens. We extracted the captions from diagnostic reports and paired them with patches extracted from the associated whole slide images. The dataset contains a variety of gastric adenocarcinoma subtypes. We trained a baseline attention-based model to predict the captions from features extracted from the patches and obtained promising results. We make the captioned dataset of 262K patches publicly available.