PathVQA: 30000+ Questions for Medical Visual Question Answering
This dataset addresses the scarcity of medical VQA resources for AI researchers in pathology and enables the development of AI systems for diagnostic tasks. The contribution is incremental in that it focuses on dataset creation rather than on novel AI methods.
The authors tackle the challenge of creating a medical visual question answering dataset for pathology by building a semi-automated pipeline that extracts images and captions from textbooks and generates question-answer pairs from the captions. The resulting dataset, PathVQA, contains 32,799 open-ended questions on 4,998 pathology images and is, to the authors' knowledge, the first dataset for pathology VQA.
Is it possible to develop an "AI Pathologist" that can pass the board certification examination of the American Board of Pathology? To achieve this goal, the first step is to create a visual question answering (VQA) dataset in which the AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Our work makes the first attempt to build such a dataset. Unlike creating general-domain VQA datasets, where images are widely accessible and many crowdsourcing workers are available and capable of generating question-answer pairs, developing a medical VQA dataset is much more challenging. First, due to privacy concerns, pathology images are usually not publicly available. Second, only well-trained pathologists can understand pathology images, but they rarely have time to help create datasets for AI research. To address these challenges, we resort to pathology textbooks and online digital libraries. We develop a semi-automated pipeline that extracts pathology images and captions from textbooks and generates question-answer pairs from the captions using natural language processing. We collect 32,799 open-ended questions from 4,998 pathology images, and each question is manually checked to ensure correctness. To the best of our knowledge, this is the first dataset for pathology VQA. Our dataset will be released publicly to promote research in medical VQA.
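As a rough illustration of the caption-to-question step, the sketch below turns a figure caption into question-answer pairs with a single hand-written rule. It is not the authors' pipeline: the `QAPair` record, the `generate_qa` function, the "X showing Y" pattern, and the example caption are all hypothetical, and a real system would rely on a fuller natural language processing toolkit. The `verified` flag stands in for the manual check applied to each question.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class QAPair:
    """One generated question-answer pair tied to an image (hypothetical schema)."""
    image_id: str
    question: str
    answer: str
    verified: bool = False  # to be set to True only after manual review


# Hypothetical caption pattern: "<subject> showing <finding>."
_SHOWING = re.compile(
    r"^(?P<subject>.+?)\s+showing\s+(?P<finding>.+?)\.?$",
    re.IGNORECASE,
)


def generate_qa(image_id: str, caption: str) -> List[QAPair]:
    """Generate an open-ended and a yes/no question from one caption.

    Returns an empty list when the caption does not fit the pattern,
    so that unmatched captions can be routed to other rules or skipped.
    """
    match = _SHOWING.match(caption.strip())
    if match is None:
        return []
    subject = match.group("subject")
    finding = match.group("finding")
    # Lowercase the first letter so the subject reads naturally mid-sentence.
    subject_lc = subject[0].lower() + subject[1:]
    return [
        QAPair(image_id, f"What does the {subject_lc} show?", finding),
        QAPair(image_id, f"Does the {subject_lc} show {finding}?", "yes"),
    ]


if __name__ == "__main__":
    # Invented caption, used only to exercise the rule.
    for pair in generate_qa("img_0001", "Liver biopsy showing extensive necrosis."):
        print(pair)
```

Every pair starts with `verified=False`, which keeps the automatic generation step separate from the manual correctness check.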