CV · Dec 13, 2020

Fully-Automated Liver Tumor Localization and Characterization from Multi-Phase MR Volumes Using Key-Slice ROI Parsing: A Physician-Inspired Approach

arXiv:2012.06964v3 · 4 citations
AI Analysis

This work addresses the critical challenge of robustly localizing diagnosable regions of interest in 3D MR volumes for liver tumor identification, providing a fully-automated computer-aided diagnosis solution for radiologists.

This paper presents a key-slice parser (KSP) that localizes diagnosable regions of interest (ROI) in multi-phase MR volumes for liver tumor identification. The KSP achieves high reliability, with 87% of patients having an average 3D overlap of >= 40% with ground truth, outperforming the best tested detector (79%). When combined with a classifier, it achieves an F1 score of 0.801 for hepatocellular carcinoma (HCC) vs. others, matching top human physicians.

Using radiological scans to identify liver tumors is crucial for proper patient treatment. This is highly challenging: top radiologists achieve F1 scores of only roughly 80% (hepatocellular carcinoma (HCC) vs. others), with only moderate inter-rater agreement, even when using multi-phase magnetic resonance (MR) imagery. Thus, there is great impetus for computer-aided diagnosis (CAD) solutions. A critical challenge is to robustly parse a 3D MR volume to localize diagnosable regions of interest (ROIs), especially for edge cases. In this paper, we break down this problem using a key-slice parser (KSP), which emulates physician workflows by first identifying key slices and then localizing their corresponding key ROIs. To achieve robustness, the KSP also uses curve-parsing and detection confidence re-weighting. We evaluate our approach on the largest multi-phase MR liver lesion test dataset to date (430 biopsy-confirmed patients). Experiments demonstrate that our KSP can localize diagnosable ROIs with high reliability: 87% of patients have an average 3D overlap of >= 40% with the ground truth, compared to only 79% using the best tested detector. When coupled with a classifier, we achieve an HCC vs. others F1 score of 0.801, providing fully-automated CAD performance comparable to top human physicians.
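The two headline numbers above (share of patients with an average 3D overlap of at least 40%, and the F1 score) can be illustrated with a minimal sketch. It assumes that "3D overlap" means intersection-over-union between axis-aligned 3D boxes, which the abstract does not state. The function names, the box layout `(z0, y0, x0, z1, y1, x1)`, and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def box_iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes given as (z0, y0, x0, z1, y1, x1).

    Assumption: overlap is measured as intersection-over-union.
    """
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        inter *= max(0.0, hi - lo)  # zero if the boxes do not overlap on this axis

    def volume(r):
        return (r[3] - r[0]) * (r[4] - r[1]) * (r[5] - r[2])

    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def fraction_reliable(per_patient_overlaps, threshold=0.4):
    """Share of patients whose mean overlap across their ROIs meets the threshold."""
    means = [mean(v) for v in per_patient_overlaps.values()]
    return sum(m >= threshold for m in means) / len(means)


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class (e.g. HCC)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up per-patient overlaps, for illustration only.
    overlaps = {"p1": [0.5, 0.7], "p2": [0.2, 0.3], "p3": [0.4], "p4": [0.9, 0.1]}
    print(fraction_reliable(overlaps))
    print(f1_score(tp=8, fp=2, fn=2))
```

Averaging per patient before thresholding means a patient with several lesions counts once, which matches the patient-level phrasing of the reported result.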
