INFORM-CT: INtegrating LLMs and VLMs FOR Incidental Findings Management in Abdominal CT
For radiologists, INFORM-CT automates the time-consuming and variable task of managing incidental findings, potentially improving clinical workflow and consistency.
The paper introduces INFORM-CT, a framework combining LLMs and VLMs in a plan-and-execute agentic approach to automate detection, classification, and reporting of incidental findings in abdominal CT scans. On a benchmark for three organs, it outperforms pure VLM-based methods in accuracy and efficiency.
Incidental findings in CT scans, though often benign, can have significant clinical implications and should be reported following established guidelines. Traditional manual inspection by radiologists is time-consuming and variable. This paper proposes a novel framework that leverages large language models (LLMs) and foundational vision-language models (VLMs) in a plan-and-execute agentic approach to improve the efficiency and precision of incidental findings detection, classification, and reporting for abdominal CT scans. Given medical guidelines for abdominal organs, the process of managing incidental findings is automated through a planner-executor framework. The planner, based on an LLM, generates Python scripts using predefined base functions, while the executor runs these scripts to perform the necessary checks and detections via VLMs, segmentation models, and image processing subroutines. We demonstrate the effectiveness of our approach through experiments on an abdominal CT benchmark for three organs, in a fully automatic end-to-end manner. Our results show that the proposed framework outperforms existing pure VLM-based approaches in terms of accuracy and efficiency.
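The planner-executor split described above can be illustrated with a minimal sketch. It is not the paper's implementation. Every name here (`segment_organ`, `detect_lesions`, `measure_diameter_mm`, `plan`, `execute`) is hypothetical, and the base functions are stubs that read from a toy dictionary instead of calling segmentation models or VLMs. The planner is a fixed template standing in for an LLM, and the 10 mm threshold is a made-up placeholder, not a value from any guideline.

```python
from typing import Any, Callable, Dict, List

# Hypothetical base functions. In a real system these would wrap
# segmentation models, VLM queries, and image-processing routines.
def segment_organ(scan: Dict[str, Any], organ: str) -> Dict[str, Any]:
    """Stub: return the pre-computed region for an organ."""
    return scan["organs"][organ]

def detect_lesions(region: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Stub: return the lesions recorded for a region."""
    return region.get("lesions", [])

def measure_diameter_mm(lesion: Dict[str, Any]) -> float:
    """Stub: return the stored lesion diameter in millimetres."""
    return float(lesion["diameter_mm"])

BASE_FUNCTIONS: Dict[str, Callable] = {
    "segment_organ": segment_organ,
    "detect_lesions": detect_lesions,
    "measure_diameter_mm": measure_diameter_mm,
}

def plan(organ: str, threshold_mm: float) -> str:
    """Stand-in for the LLM planner: emit a script built only from base functions."""
    return (
        f"region = segment_organ(scan, {organ!r})\n"
        "findings = []\n"
        "for lesion in detect_lesions(region):\n"
        "    size = measure_diameter_mm(lesion)\n"
        f"    action = 'follow-up' if size >= {threshold_mm!r} else 'no action'\n"
        f"    findings.append({{'organ': {organ!r}, 'size_mm': size, 'action': action}})\n"
    )

def execute(script: str, scan: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Executor: run the generated script with only the base functions in scope.

    Emptying __builtins__ narrows what the script can name, but it is not a
    real security sandbox.
    """
    namespace: Dict[str, Any] = {"__builtins__": {}, "scan": scan, **BASE_FUNCTIONS}
    exec(script, namespace)
    return namespace["findings"]

# Toy input: two kidney lesions with placeholder sizes.
toy_scan = {"organs": {"kidney": {"lesions": [{"diameter_mm": 4.0}, {"diameter_mm": 12.5}]}}}
report = execute(plan("kidney", threshold_mm=10.0), toy_scan)
```

Keeping the planner's output restricted to a fixed set of base functions means the executor only ever dispatches to known, testable routines, which is one plausible reason for the design the abstract describes.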