IV CV May 7, 2025

Text2CT: Towards 3D CT Volume Generation from Free-text Descriptions Using Diffusion Model

Pengfei Guo, Can Zhao, Dong Yang, Yufan He, Vishwesh Nath, Ziyue Xu, Pedro R. A. S. Bassi, Zongwei Zhou, Benjamin D. Simon, Stephanie Anne Harmon, Baris Turkbey, Daguang Xu

arXiv:2505.04522v1 23.914 citations h-index: 89

Originality: Incremental advance

AI Analysis

This work addresses a transformative opportunity in diagnostics and research by enabling 3D CT generation from free-text, though it appears incremental as it builds on existing diffusion models with a novel prompt formulation.

The paper tackles the problem of generating 3D CT volumes from free-text descriptions, introducing Text2CT, a diffusion model-based approach that achieves state-of-the-art results in preserving anatomical fidelity and capturing intricate structures from diverse textual inputs.

Generating 3D CT volumes from descriptive free-text inputs presents a transformative opportunity in diagnostics and research. In this paper, we introduce Text2CT, a novel approach for synthesizing 3D CT volumes from textual descriptions using a diffusion model. Unlike previous methods that rely on fixed-format text input, Text2CT employs a novel prompt formulation that enables generation from diverse, free-text descriptions. The proposed framework encodes medical text into latent representations and decodes them into high-resolution 3D CT scans, effectively bridging the gap between semantic text inputs and detailed volumetric representations in a unified 3D framework. Our method demonstrates superior performance in preserving anatomical fidelity and capturing intricate structures as described in the input text. Extensive evaluations show that our approach achieves state-of-the-art results, offering promising potential applications in diagnostics and data augmentation.
