RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction
This work enables automated tracking of disease progression as documented in radiology reports, though it represents an incremental improvement over existing information extraction methods.
The authors tackled the problem of extracting disease progression information from radiology reports by creating RadGraph2, a hierarchical dataset, and developing HGIE, a modified DyGIE++ model that outperformed previous models in entity and relation extraction tasks.
We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and perform better at relation extraction compared to those trained on the original RadGraph dataset. Our work provides the foundation for developing automated systems that can track disease progression over time and for developing information extraction models that leverage the natural hierarchy of labels in the medical domain.
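
To make the idea of using a label hierarchy during training more concrete, here is a minimal sketch of one possible hierarchy-aware loss. It is an illustration under stated assumptions, not the HGIE implementation: the label names, the two-level hierarchy in `PARENT_OF`, the `hierarchical_loss` function, and the `parent_weight` parameter are all hypothetical and are not taken from the RadGraph2 schema or the DyGIE++ codebase. The sketch adds a parent-level cross-entropy term to the usual leaf-level term, so a prediction that lands on a sibling of the gold label is penalized less than one that lands under a different parent.

```python
import math

# Hypothetical two-level label hierarchy (illustrative names only,
# not the actual RadGraph2 entity types).
PARENT_OF = {
    "change_worsened": "change",
    "change_improved": "change",
    "observation_present": "observation",
    "observation_absent": "observation",
}


def softmax(scores):
    """Convert a dict of raw label scores into a dict of probabilities."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def hierarchical_loss(leaf_scores, gold_leaf, parent_weight=1.0):
    """Leaf-level cross-entropy plus a weighted parent-level cross-entropy.

    The probability of a parent label is the sum of the probabilities
    of its child (leaf) labels.
    """
    leaf_probs = softmax(leaf_scores)

    # Aggregate leaf probabilities up to their parents.
    parent_probs = {}
    for leaf, prob in leaf_probs.items():
        parent = PARENT_OF[leaf]
        parent_probs[parent] = parent_probs.get(parent, 0.0) + prob

    leaf_loss = -math.log(leaf_probs[gold_leaf])
    parent_loss = -math.log(parent_probs[PARENT_OF[gold_leaf]])
    return leaf_loss + parent_weight * parent_loss


# A model that confidently predicts "change_improved".
scores = {label: 0.0 for label in PARENT_OF}
scores["change_improved"] = 5.0

# Gold label shares the predicted parent ("change"): smaller loss.
sibling_error = hierarchical_loss(scores, "change_worsened")
# Gold label sits under a different parent ("observation"): larger loss.
cross_parent_error = hierarchical_loss(scores, "observation_present")
```

In this sketch the leaf-level term is identical for both errors, so the difference between `sibling_error` and `cross_parent_error` comes entirely from the parent-level term. Setting `parent_weight=0.0` recovers a flat cross-entropy loss that ignores the hierarchy.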