CVSep 14, 2022
SCULPTOR: Skeleton-Consistent Face Creation Using a Learned Parametric GeneratorZesong Qiu, Yuwei Li, Dongming He et al.
Recent years have seen growing interest in 3D human faces modelling due to its wide applications in digital human, character generation and animation. Existing approaches overwhelmingly emphasized on modeling the exterior shapes, textures and skin properties of faces, ignoring the inherent correlation between inner skeletal structures and appearance. In this paper, we present SCULPTOR, 3D face creations with Skeleton Consistency Using a Learned Parametric facial generaTOR, aiming to facilitate easy creation of both anatomically correct and visually convincing face models via a hybrid parametric-physical representation. At the core of SCULPTOR is LUCY, the first large-scale shape-skeleton face dataset in collaboration with plastic surgeons. Named after the fossils of one of the oldest known human ancestors, our LUCY dataset contains high-quality Computed Tomography (CT) scans of the complete human head before and after orthognathic surgeries, critical for evaluating surgery results. LUCY consists of 144 scans of 72 subjects (31 male and 41 female) where each subject has two CT scans taken pre- and post-orthognathic operations. Based on our LUCY dataset, we learn a novel skeleton consistent parametric facial generator, SCULPTOR, which can create the unique and nuanced facial features that help define a character and at the same time maintain physiological soundness. Our SCULPTOR jointly models the skull, face geometry and face appearance under a unified data-driven framework, by separating the depiction of a 3D face into shape blend shape, pose blend shape and facial expression blend shape. SCULPTOR preserves both anatomic correctness and visual realism in facial generation tasks compared with existing methods. Finally, we showcase the robustness and effectiveness of SCULPTOR in various fancy applications unseen before.
CVMay 11, 2024
Solving Energy-Independent Density for CT Metal Artifact Reduction via Neural RepresentationQing Wu, Xu Guo, Lixuan Chen et al.
X-ray CT often suffers from shadowing and streaking artifacts in the presence of metallic materials, which severely degrade imaging quality. Physically, the linear attenuation coefficients (LACs) of metals vary significantly with X-ray energy, causing a nonlinear beam hardening effect (BHE) in CT measurements. Reconstructing CT images from metal-corrupted measurements consequently becomes a challenging nonlinear inverse problem. Existing state-of-the-art (SOTA) metal artifact reduction (MAR) algorithms rely on supervised learning with numerous paired CT samples. While promising, these supervised methods often assume that the unknown LACs are energy-independent, ignoring the energy-induced BHE, which results in limited generalization. Moreover, the requirement for large datasets also limits their applications in real-world scenarios. In this work, we propose Density neural representation (Diner), a novel unsupervised MAR method. Our key innovation lies in formulating MAR as an energy-independent density reconstruction problem that strictly adheres to the photon-tissue absorption physical model. This model is inherently nonlinear and complex, making it a rarely considered approach in inverse imaging problems. By introducing the water-equivalent tissues approximation and a new polychromatic model to characterize the nonlinear CT acquisition process, we directly learn the neural representation of the density map from raw measurements without using external training data. This energy-independent density reconstruction framework fundamentally resolves the nonlinear BHE, enabling superior MAR performance across a wide range of scanning scenarios. Extensive experiments on both simulated and real-world datasets demonstrate the superiority of our unsupervised Diner over popular supervised methods in terms of MAR performance and robustness.
CVOct 16, 2025
Towards Generalist Intelligence in Dentistry: Vision Foundation Models for Oral and Maxillofacial RadiologyXinrui Huang, Fan Xiao, Dongming He et al.
Oral and maxillofacial radiology plays a vital role in dental healthcare, but radiographic image interpretation is limited by a shortage of trained professionals. While AI approaches have shown promise, existing dental AI systems are restricted by their single-modality focus, task-specific design, and reliance on costly labeled data, hindering their generalization across diverse clinical scenarios. To address these challenges, we introduce DentVFM, the first family of vision foundation models (VFMs) designed for dentistry. DentVFM generates task-agnostic visual representations for a wide range of dental applications and uses self-supervised learning on DentVista, a large curated dental imaging dataset with approximately 1.6 million multi-modal radiographic images from various medical centers. DentVFM includes 2D and 3D variants based on the Vision Transformer (ViT) architecture. To address gaps in dental intelligence assessment and benchmarks, we introduce DentBench, a comprehensive benchmark covering eight dental subspecialties, more diseases, imaging modalities, and a wide geographical distribution. DentVFM shows impressive generalist intelligence, demonstrating robust generalization to diverse dental tasks, such as disease diagnosis, treatment analysis, biomarker identification, and anatomical landmark detection and segmentation. Experimental results indicate DentVFM significantly outperforms supervised, self-supervised, and weakly supervised baselines, offering superior generalization, label efficiency, and scalability. Additionally, DentVFM enables cross-modality diagnostics, providing more reliable results than experienced dentists in situations where conventional imaging is unavailable. DentVFM sets a new paradigm for dental AI, offering a scalable, adaptable, and label-efficient model to improve intelligent dental healthcare and address critical gaps in global oral healthcare.
IVJul 7, 2025
SPIDER: Structure-Preferential Implicit Deep Network for Biplanar X-ray ReconstructionTianqi Yu, Xuanyu Tian, Jiawen Yang et al.
Biplanar X-ray imaging is widely used in health screening, postoperative rehabilitation evaluation of orthopedic diseases, and injury surgery due to its rapid acquisition, low radiation dose, and straightforward setup. However, 3D volume reconstruction from only two orthogonal projections represents a profoundly ill-posed inverse problem, owing to the intrinsic lack of depth information and irreducible ambiguities in soft-tissue visualization. Some existing methods can reconstruct skeletal structures and Computed Tomography (CT) volumes, they often yield incomplete bone geometry, imprecise tissue boundaries, and a lack of anatomical realism, thereby limiting their clinical utility in scenarios such as surgical planning and postoperative assessment. In this study, we introduce SPIDER, a novel supervised framework designed to reconstruct CT volumes from biplanar X-ray images. SPIDER incorporates tissue structure as prior (e.g., anatomical segmentation) into an implicit neural representation decoder in the form of joint supervision through a unified encoder-decoder architecture. This design enables the model to jointly learn image intensities and anatomical structures in a pixel-aligned fashion. To address the challenges posed by sparse input and structural ambiguity, SPIDER directly embeds anatomical constraints into the reconstruction process, thereby enhancing structural continuity and reducing soft-tissue artifacts. We conduct comprehensive experiments on clinical head CT datasets and show that SPIDER generates anatomically accurate reconstructions from only two projections. Furthermore, our approach demonstrates strong potential in downstream segmentation tasks, underscoring its utility in personalized treatment planning and image-guided surgical navigation.