IV | CV | May 21, 2025

X-GRM: Large Gaussian Reconstruction Model for Sparse-view X-rays to Computed Tomography

arXiv:2505.15235v2 | 4 citations | h-index: 18 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific challenge in medical imaging: improving CT reconstruction from sparse-view data. It is an incremental advance because it builds on existing methods, adding new architectural and representation components.

The paper tackles the problem of reconstructing 3D CT volumes from sparse-view 2D X-ray projections. It introduces X-GRM, a large feedforward model with a novel Voxel-based Gaussian Splatting representation, and reports high-quality reconstructions for both in-domain and out-of-domain inputs.

Computed Tomography serves as an indispensable tool in clinical workflows, providing non-invasive visualization of internal anatomical structures. Existing CT reconstruction works are limited by small-capacity model architectures and inflexible volume representations. In this work, we present X-GRM (X-ray Gaussian Reconstruction Model), a large feedforward model for reconstructing 3D CT volumes from sparse-view 2D X-ray projections. X-GRM employs a scalable transformer-based architecture to encode sparse-view X-ray inputs, where tokens from different views are integrated efficiently. These tokens are then decoded into a novel volume representation, named Voxel-based Gaussian Splatting (VoxGS), which enables efficient CT volume extraction and differentiable X-ray rendering. This combination of a high-capacity model and a flexible volume representation empowers our model to produce high-quality reconstructions from various testing inputs, including in-domain and out-of-domain X-ray projections. Our code is available at: https://github.com/CUHK-AIM-Group/X-GRM.
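The abstract does not spell out how VoxGS is parameterized, so the following is only an illustrative sketch of the two operations it names: extracting a CT volume from a set of 3D Gaussians and rendering an X-ray from it. It assumes isotropic Gaussians on a unit cube and a parallel-beam projection approximated by summing voxels along one axis. The function names `voxelize_gaussians` and `project_parallel`, and all parameter values, are hypothetical and not taken from the X-GRM code. Plain NumPy is used for clarity, so this version is not differentiable.

```python
import numpy as np


def voxelize_gaussians(centers, sigmas, densities, grid_size):
    """Sum isotropic 3D Gaussians onto a cubic voxel grid over [0, 1]^3.

    Assumption: isotropic Gaussians; the real VoxGS parameterization may differ.
    """
    # Voxel-centre coordinates along one axis of the unit cube.
    axis = (np.arange(grid_size) + 0.5) / grid_size
    zz, yy, xx = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([zz, yy, xx], axis=-1)  # shape (D, H, W, 3)

    volume = np.zeros((grid_size,) * 3)
    for center, sigma, density in zip(centers, sigmas, densities):
        sq_dist = np.sum((grid - center) ** 2, axis=-1)
        volume += density * np.exp(-0.5 * sq_dist / sigma**2)
    return volume


def project_parallel(volume, axis=0):
    """Approximate parallel-beam line integrals by summing along one axis."""
    voxel_size = 1.0 / volume.shape[axis]
    return volume.sum(axis=axis) * voxel_size


# Two hypothetical Gaussians standing in for decoded model outputs.
centers = np.array([[0.5, 0.5, 0.5], [0.25, 0.75, 0.5]])
sigmas = np.array([0.1, 0.05])
densities = np.array([1.0, 0.5])

volume = voxelize_gaussians(centers, sigmas, densities, grid_size=32)
projection = project_parallel(volume, axis=0)
```

Here `volume` plays the role of the extracted CT grid and `projection` the role of a rendered X-ray. Writing the same arithmetic in an autodiff framework would make both steps differentiable with respect to the Gaussian parameters, which is the property the abstract highlights.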

Code Implementations: 1 repo
