Generative Models for Reproducible Coronary Calcium Scoring
This work addresses unreliable coronary heart disease risk categorization, particularly in non-ECG-synchronized CT scans, by improving the interscan reproducibility of calcium scoring. The contribution is incremental in that it builds on existing generative models.
The paper tackles the limited interscan reproducibility of coronary artery calcium (CAC) scoring in CT by proposing a method based on a generative adversarial network (CycleGAN) that decomposes a CT image into an image without CAC and an image showing only CAC, without requiring a fixed intensity threshold. It achieves a relative interscan difference in CAC mass of 47%, compared to 89% for manual clinical scoring.
Purpose: The coronary artery calcium (CAC) score, i.e., the amount of CAC quantified in CT, is a strong and independent predictor of coronary heart disease (CHD) events. However, CAC scoring suffers from limited interscan reproducibility, mainly because the clinical definition requires a fixed intensity threshold for segmenting calcifications. This limitation is especially pronounced in non-ECG-synchronized CT, where lesions are more strongly affected by cardiac motion and partial volume effects. We therefore propose a CAC quantification method that does not require a threshold for segmentation of CAC.
Approach: Our method uses a generative adversarial network to decompose a CT with CAC into an image without CAC and an image showing only CAC. The method, based on a CycleGAN, was trained on 626 low-dose chest CTs and 514 radiotherapy treatment planning CTs. Interscan reproducibility was compared to clinical calcium scoring in radiotherapy treatment planning CTs of 1,662 patients, each having two scans.
Results: The proposed method achieved a lower relative interscan difference in CAC mass: 47%, compared to 89% for manual clinical calcium scoring. The intraclass correlation coefficient of Agatston scores was 0.96 for the proposed method, compared to 0.91 for automatic clinical calcium scoring.
Conclusions: The increased interscan reproducibility achieved by our method may lead to more reliable CHD risk categorization and more accurate CHD event prediction.
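To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that CAC mass can be obtained by summing all intensities of the CAC-only image (so no intensity threshold is applied) and that the relative interscan difference is the absolute difference between a patient's two scans divided by their mean. Neither definition is stated in the abstract. The function names, the calibration factor, and the sample values are hypothetical.

```python
from statistics import mean


def cac_mass_mg(cac_only_image_hu, voxel_volume_mm3, calibration_mg_per_hu_mm3):
    """Threshold-free mass estimate from a CAC-only image (2D list of HU values).

    Assumption: every voxel of the CAC-only image contributes in proportion
    to its intensity, so no fixed intensity threshold is needed.
    The calibration factor is a hypothetical scanner-dependent constant.
    """
    total_hu = sum(sum(row) for row in cac_only_image_hu)
    return total_hu * voxel_volume_mm3 * calibration_mg_per_hu_mm3


def relative_interscan_difference(mass_scan1, mass_scan2):
    """Percent difference between two scans of one patient, relative to their mean.

    Assumption: this is one common definition; the abstract does not give the formula.
    """
    avg = (mass_scan1 + mass_scan2) / 2.0
    if avg == 0:
        return 0.0  # no CAC found in either scan: treated as perfect agreement
    return abs(mass_scan1 - mass_scan2) / avg * 100.0


def cohort_relative_difference(mass_pairs):
    """Mean relative interscan difference over (scan 1, scan 2) mass pairs."""
    return mean(relative_interscan_difference(a, b) for a, b in mass_pairs)


if __name__ == "__main__":
    # Hypothetical CAC-only image with two calcified voxels.
    image = [[0.0, 120.0], [300.0, 0.0]]
    print(cac_mass_mg(image, voxel_volume_mm3=0.5, calibration_mg_per_hu_mm3=0.001))

    # Hypothetical CAC masses (mg) for three patients, two scans each.
    pairs = [(10.0, 30.0), (50.0, 50.0), (0.0, 0.0)]
    print(cohort_relative_difference(pairs))
```

Under these assumptions, a lower cohort value means the two scans of the same patient agree more closely, which is the sense in which 47% improves on 89%.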