CV | Dec 5, 2022

Med-Query: Steerable Parsing of 9-DoF Medical Anatomies with Query Embedding

Heng Guo, Jianfeng Zhang, Ke Yan, Le Lu, Minfeng Xu

arXiv:2212.02014v34.84 citations | h-index: 57 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for robust and efficient anatomy parsing in clinical applications, offering incremental improvements over existing methods.

The paper tackles automatic instance-level parsing of 3D medical anatomies in CT scans, a task made difficult by pathologies and limited field-of-view. It proposes a steerable framework for detection, identification, and segmentation, and reports a 97.0% identification rate and a 90.9% Dice score for rib parsing.

Automatic parsing of human anatomies at the instance-level from 3D computed tomography (CT) is a prerequisite step for many clinical applications. The presence of pathologies, broken structures or limited field-of-view (FOV) can all make anatomy parsing algorithms vulnerable. In this work, we explore how to leverage and implement the successful detection-then-segmentation paradigm for 3D medical data, and propose a steerable, robust, and efficient computing framework for detection, identification, and segmentation of anatomies in CT scans. Considering the complicated shapes, sizes, and orientations of anatomies, without loss of generality, we present a nine degrees of freedom (9-DoF) pose estimation solution in full 3D space using a novel single-stage, non-hierarchical representation. Our whole framework is executed in a steerable manner where any anatomy of interest can be directly retrieved to further boost inference efficiency. We have validated our method on three medical imaging parsing tasks: ribs, spine, and abdominal organs. For rib parsing, CT scans have been annotated at the rib instance-level for quantitative evaluation, similarly for spine vertebrae and abdominal organs. Extensive experiments on 9-DoF box detection and rib instance segmentation demonstrate the high efficiency and effectiveness of our framework (with the identification rate of 97.0% and the segmentation Dice score of 90.9%), compared favorably against several strong baselines (e.g., CenterNet, FCOS, and nnU-Net). For spine parsing and abdominal multi-organ segmentation, our method achieves competitive results on par with state-of-the-art methods on the public CTSpine1K dataset and FLARE22 competition, respectively. Our annotations, code, and models are available at: https://github.com/alibaba-damo-academy/Med_Query.
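To make the 9-DoF box mentioned in the abstract concrete, here is a minimal sketch of how nine parameters (three for the box center, three for its size, three for its rotation) can describe an oriented 3D box. It is an illustration only, not the authors' implementation: the function names, the Euler-angle order (Z, then Y, then X), and the use of radians are all assumptions, and the paper's own convention may differ.

```python
import math
from itertools import product


def rotation_matrix(alpha, beta, gamma):
    """Return R = Rz(gamma) * Ry(beta) * Rx(alpha) as a 3x3 nested list.

    Angles are in radians. The Z-Y-X composition order is an assumption
    made for this sketch, not a convention taken from the paper.
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]


def box_corners(center, size, angles):
    """Compute the 8 corners of an oriented box from 9 parameters.

    center: (x, y, z) position of the box center
    size:   (w, h, d) full edge lengths along the box's local axes
    angles: (alpha, beta, gamma) rotations about the x, y, z axes
    """
    rot = rotation_matrix(*angles)
    corners = []
    # Each corner sits at +/- half the edge length along every local axis.
    for signs in product((-0.5, 0.5), repeat=3):
        local = [s * d for s, d in zip(signs, size)]
        # Rotate the local offset, then translate by the box center.
        world = tuple(
            center[i] + sum(rot[i][j] * local[j] for j in range(3))
            for i in range(3)
        )
        corners.append(world)
    return corners


if __name__ == "__main__":
    # Hypothetical box: values are illustrative, not taken from any dataset.
    for corner in box_corners((10.0, 20.0, 30.0), (4.0, 2.0, 6.0), (0.0, 0.0, math.pi / 6)):
        print(tuple(round(v, 3) for v in corner))
```

An axis-aligned box is the special case where all three angles are zero, which leaves only six degrees of freedom. The three rotation angles are what let a box follow an obliquely oriented structure such as a rib.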
