LG, CV, ML | Dec 27, 2018

Classification of radiology reports by modality and anatomy: A comparative study

arXiv:1812.10818v1
Originality: Synthesis-oriented
AI Analysis

This work addresses data labeling in radiology, a time-consuming task on which accurate model predictions in research settings depend. The contribution is incremental: it applies existing methods to a specific domain.

The study automates the classification of radiology reports by modality and anatomy. Its binary logistic regression classifier reached average precision values above 0.9 on datasets that were unseen during training.

Data labeling is currently a time-consuming task that often requires expert knowledge. In research settings, the availability of correctly labeled data is crucial to ensure that model predictions are accurate and useful. We propose relatively simple machine learning-based models that achieve high performance metrics in the binary and multiclass classification of radiology reports. We compare the performance of these algorithms to that of a data-driven approach based on NLP and find that the logistic regression classifier outperforms all other models in both the binary and multiclass classification tasks. We then use the binary logistic regression classifier to predict chest X-ray (CXR) versus non-chest X-ray (non-CXR) labels for reports from different datasets that were unseen during every training phase of every model. Even on these unseen report collections, the binary logistic regression classifier achieves average precision values above 0.9. Based on the regression coefficient values, we also identify frequent tokens in CXR and non-CXR reports that are features with possibly high predictive power.
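
To make the pipeline in the abstract concrete, here is a minimal sketch of a binary CXR/non-CXR classifier: bag-of-words features, a logistic regression model, average precision on held-out reports, and a ranking of tokens by coefficient. Several things here are assumptions and not taken from the paper: the toy reports are hypothetical, the whitespace tokenizer and raw token counts stand in for whatever preprocessing the authors used, and the model is trained with a small hand-written stochastic gradient descent loop so the example has no dependencies. In practice a library implementation would be the more usual choice.

```python
import math
from collections import Counter

# Hypothetical toy reports, NOT from the paper's datasets. Label 1 = CXR, 0 = non-CXR.
TRAIN = [
    ("chest pa and lateral no focal consolidation heart size normal", 1),
    ("portable chest lungs clear no pleural effusion or pneumothorax", 1),
    ("chest radiograph cardiomediastinal silhouette normal lungs clear", 1),
    ("ct abdomen with contrast liver and spleen unremarkable", 0),
    ("mri brain without contrast no acute infarct or hemorrhage", 0),
    ("ultrasound abdomen gallbladder without stones kidneys unremarkable", 0),
]
HELD_OUT = [
    ("chest two views lungs clear no effusion", 1),
    ("ct head without contrast no hemorrhage", 0),
    ("portable chest heart size normal", 1),
    ("mri abdomen liver unremarkable", 0),
]

def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()

def featurize(text, vocab):
    # Sparse bag-of-words vector: feature index -> token count.
    counts = Counter(t for t in tokenize(text) if t in vocab)
    return {vocab[t]: float(c) for t, c in counts.items()}

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def score(x, weights, bias):
    # Predicted probability that a report is a CXR report.
    return sigmoid(bias + sum(weights[i] * v for i, v in x.items()))

def train_logreg(rows, labels, n_features, lr=0.5, epochs=300, l2=0.01):
    # Plain SGD on the logistic loss with L2 shrinkage on the active weights.
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            grad = score(x, weights, bias) - y
            for i, v in x.items():
                weights[i] -= lr * (grad * v + l2 * weights[i])
            bias -= lr * grad
    return weights, bias

def average_precision(labels, scores):
    # Mean of precision@k taken at the rank k of each positive example.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Build the vocabulary from the training reports only.
vocab = {}
for text, _ in TRAIN:
    for token in tokenize(text):
        vocab.setdefault(token, len(vocab))

weights, bias = train_logreg(
    [featurize(t, vocab) for t, _ in TRAIN], [y for _, y in TRAIN], len(vocab)
)

held_out_scores = [score(featurize(t, vocab), weights, bias) for t, _ in HELD_OUT]
ap = average_precision([y for _, y in HELD_OUT], held_out_scores)
print(f"average precision on held-out reports: {ap:.3f}")

# Tokens with the largest positive / negative coefficients.
by_weight = sorted(vocab, key=lambda t: weights[vocab[t]])
print("tokens pushing toward non-CXR:", by_weight[:3])
print("tokens pushing toward CXR:", by_weight[-3:])
```

Reading the sorted coefficients mirrors the last sentence of the abstract: tokens with large positive weights push a report toward the CXR label, and tokens with large negative weights push it toward non-CXR. On a corpus this small the ranking is only illustrative.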

