Leveraging Medical Visual Question Answering with Supporting Facts
This work addresses medical visual question answering for radiology, but its contribution is incremental: it applies existing methods such as transfer learning and multi-task learning to a specific competition.
The paper tackles the ImageCLEF 2019 VQA-Med competition, which posed complex medical visual question answering tasks over diverse radiology images with a small, imbalanced dataset. The authors developed a Supporting Facts Network (SFN) that cross-utilizes information from upstream tasks, yielding an 18-point improvement in F-1 score on the validation set and a seventh-place ranking in the competition.
In this working notes paper, we describe the participation of the IBM Research AI (Almaden) team in the ImageCLEF 2019 VQA-Med competition. The challenge consists of four question-answering tasks based on radiology images. The diversity of imaging modalities, organs and disease types, combined with a small, imbalanced training set, made this a highly complex problem. To overcome these difficulties, we implemented a modular pipeline architecture that utilized transfer learning and multi-task learning. Our findings led to the development of a novel model called Supporting Facts Network (SFN). The main idea behind SFN is to cross-utilize information from upstream tasks to improve the accuracy on harder downstream ones. This approach significantly improved the scores achieved on the validation set (an 18-point improvement in F-1 score). Finally, we submitted four runs to the competition and were ranked seventh.
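
The idea of cross-utilizing upstream predictions can be illustrated with a minimal sketch. This is not the SFN architecture from the paper, whose details are not given in the abstract. It only shows the general pattern under stated assumptions: a few easier upstream classifiers produce probability vectors ("supporting facts"), and these are concatenated with the shared features before the harder downstream classifier runs. The class name `SupportingFactsSketch`, the task names, the feature dimension, and the random untrained weights are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_weights(rng, n_out, n_in):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


class SupportingFactsSketch:
    """Hypothetical toy model: upstream predictions feed the downstream task."""

    def __init__(self, feature_dim, upstream_classes, downstream_classes, seed=0):
        rng = random.Random(seed)
        # One linear classifier per (assumed) upstream task.
        self.upstream = {
            name: random_weights(rng, n, feature_dim)
            for name, n in upstream_classes.items()
        }
        # The downstream classifier sees the features plus all upstream outputs.
        fact_dim = sum(upstream_classes.values())
        self.downstream = random_weights(
            rng, downstream_classes, feature_dim + fact_dim
        )

    def forward(self, features):
        # Step 1: predict the supporting facts from the shared features.
        facts = {
            name: softmax(linear(w, features))
            for name, w in self.upstream.items()
        }
        # Step 2: concatenate features with the upstream probability vectors.
        augmented = list(features)
        for name in self.upstream:
            augmented.extend(facts[name])
        # Step 3: answer the harder question using the augmented input.
        answer = softmax(linear(self.downstream, augmented))
        return facts, answer


model = SupportingFactsSketch(
    feature_dim=4,
    upstream_classes={"modality": 3, "plane": 2, "organ": 5},  # assumed tasks
    downstream_classes=6,
)
facts, answer = model.forward([0.2, -0.1, 0.7, 0.0])
```

In this sketch the downstream classifier receives a 14-dimensional input (4 features plus 3 + 2 + 5 upstream probabilities), so any signal the upstream heads capture is directly available to the harder task. In a trained multi-task setup the heads would typically share an encoder and be optimized jointly, which this toy example does not attempt.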