Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs
This work addresses a gap in evaluation methods for radiology report generation models by providing tools to assess clinical applicability. The contribution is incremental, since it builds on existing advances in AI.
The paper tackles the problem of evaluating radiology report generation models by introducing ReXKG, a system that constructs knowledge graphs from reports, and by proposing three metrics for comparing those graphs. The comparison reveals the capabilities and limitations of AI models relative to human-written reports.
Recent advancements in artificial intelligence have significantly improved the automatic generation of radiology reports. However, existing evaluation methods fail to reveal the models' understanding of radiological images and their capacity to achieve human-level granularity in descriptions. To bridge this gap, we introduce a system, named ReXKG, which extracts structured information from processed reports to construct a comprehensive radiology knowledge graph. We then propose three metrics to evaluate the similarity of nodes (ReXKG-NSC), distribution of edges (ReXKG-AMS), and coverage of subgraphs (ReXKG-SCS) across various knowledge graphs. We conduct an in-depth comparative analysis of AI-generated and human-written radiology reports, assessing the performance of both specialist and generalist models. Our study provides a deeper understanding of the capabilities and limitations of current AI models in radiology report generation, offering valuable insights for improving model performance and clinical applicability.
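To make the idea of graph-level comparison concrete, the following is a minimal sketch, not the ReXKG implementation. The abstract does not define ReXKG-NSC, ReXKG-AMS, or ReXKG-SCS, so the three functions below are simple stand-ins chosen for illustration: Jaccard overlap of node sets, cosine similarity of relation-type counts, and the average fraction of each reference node's outgoing edges found in the candidate graph. The triples, relation names, and function names are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical (head, relation, tail) triples; not taken from the paper.
human_triples = [
    ("effusion", "located_at", "pleural space"),
    ("effusion", "modified_by", "small"),
    ("opacity", "located_at", "left lower lobe"),
    ("opacity", "suggestive_of", "pneumonia"),
]
model_triples = [
    ("effusion", "located_at", "pleural space"),
    ("opacity", "located_at", "left lower lobe"),
]


def nodes(triples):
    """Collect every entity that appears as a head or a tail."""
    return {h for h, _, _ in triples} | {t for _, _, t in triples}


def node_overlap(a, b):
    """Jaccard overlap of node sets (illustrative proxy for node similarity)."""
    na, nb = nodes(a), nodes(b)
    union = na | nb
    return len(na & nb) / len(union) if union else 1.0


def edge_type_cosine(a, b):
    """Cosine similarity of relation-type counts (proxy for edge distribution)."""
    ca = Counter(r for _, r, _ in a)
    cb = Counter(r for _, r, _ in b)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def neighborhood_coverage(reference, candidate):
    """Mean fraction of each reference head's outgoing edges present in the
    candidate graph (proxy for subgraph coverage)."""
    candidate_set = set(candidate)
    by_head = {}
    for triple in reference:
        by_head.setdefault(triple[0], []).append(triple)
    if not by_head:
        return 1.0
    fractions = [
        sum(t in candidate_set for t in edges) / len(edges)
        for edges in by_head.values()
    ]
    return sum(fractions) / len(fractions)


if __name__ == "__main__":
    print(node_overlap(human_triples, model_triples))
    print(edge_type_cosine(human_triples, model_triples))
    print(neighborhood_coverage(human_triples, model_triples))
```

In this toy case the model graph keeps the location edges but drops the modifier and the diagnostic suggestion, so node overlap and coverage fall below 1 even though every model triple is correct. That is the kind of granularity gap a graph-based comparison is meant to surface.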