Diagnostic Captioning: A Survey
This is an incremental survey that organizes existing knowledge for researchers and practitioners in medical AI, without introducing new methods or results.
The paper surveys Diagnostic Captioning (DC), the automatic generation of diagnostic text from medical images to assist physicians and reduce errors, and summarizes datasets, evaluation measures, and current systems.
Diagnostic Captioning (DC) concerns the automatic generation of a diagnostic text from a set of medical images of a patient collected during an examination. DC can assist inexperienced physicians, reducing clinical errors. It can also help experienced physicians produce diagnostic reports faster. Following the advances of deep learning, especially in generic image captioning, DC has recently attracted more attention, leading to several systems and datasets. This article is an extensive overview of DC. It presents relevant datasets, evaluation measures, and up-to-date systems. It also highlights shortcomings that hinder DC's progress and proposes future directions.