Yan Xi

IV
h-index12
7papers
560citations
Novelty52%
AI Score39

7 Papers

IVNov 15, 2022
DIGEST: Deeply supervIsed knowledGE tranSfer neTwork learning for brain tumor segmentation with incomplete multi-modal MRI scans

Haoran Li, Cheng Li, Weijian Huang et al.

Brain tumor segmentation based on multi-modal magnetic resonance imaging (MRI) plays a pivotal role in assisting brain cancer diagnosis, treatment, and postoperative evaluations. Despite the achieved inspiring performance by existing automatic segmentation methods, multi-modal MRI data are still unavailable in real-world clinical applications due to quite a few uncontrollable factors (e.g. different imaging protocols, data corruption, and patient condition limitations), which lead to a large performance drop during practical applications. In this work, we propose a Deeply supervIsed knowledGE tranSfer neTwork (DIGEST), which achieves accurate brain tumor segmentation under different modality-missing scenarios. Specifically, a knowledge transfer learning frame is constructed, enabling a student model to learn modality-shared semantic information from a teacher model pretrained with the complete multi-modal MRI data. To simulate all the possible modality-missing conditions under the given multi-modal data, we generate incomplete multi-modal MRI samples based on Bernoulli sampling. Finally, a deeply supervised knowledge transfer loss is designed to ensure the consistency of the teacher-student structure at different decoding stages, which helps the extraction of inherent and effective modality representations. Experiments on the BraTS 2020 dataset demonstrate that our method achieves promising results for the incomplete multi-modal MR image segmentation task.

IVFeb 5, 2024
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

Jiarun Liu, Hao Yang, Hong-Yu Zhou et al.

Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, it is challenging for existing methods to model long-range global information, where convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from high quadratic complexity of their attention mechanism. Recently, Mamba-based models have gained great attention for their impressive ability in long sequence modeling. Several studies have demonstrated that these models can outperform popular vision models in various tasks, offering higher accuracy, lower memory consumption, and less computational burden. However, existing Mamba-based models are mostly trained from scratch and do not explore the power of pretraining, which has been proven to be quite effective for data-efficient medical image analysis. This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks, leveraging the advantages of ImageNet-based pretraining. Our experimental results reveal the vital role of ImageNet-based training in enhancing the performance of Mamba-based models. Swin-UMamba demonstrates superior performance with a large margin compared to CNNs, ViTs, and latest Mamba-based models. Notably, on AbdomenMRI, Encoscopy, and Microscopy datasets, Swin-UMamba outperforms its closest counterpart U-Mamba_Enc by an average score of 2.72%.

IVMar 25, 2024Code
RSTAR4D: Rotational Streak Artifact Reduction in 4D CBCT using a Separable 4D CNN

Ziheng Deng, Hua Chen, Yongzheng Zhou et al.

Four-dimensional cone-beam computed tomography (4D CBCT) provides respiration-resolved images and can be used for image-guided radiation therapy. However, the ability to reveal respiratory motion comes at the cost of image artifacts. As raw projection data are sorted into multiple respiratory phases, the cone-beam projections become much sparser and the reconstructed 4D CBCT images will be covered by severe streak artifacts. Although several deep learning-based methods have been proposed to address this issue, most algorithms employ 2D network models as backbones, neglecting the intrinsic structural priors within 4D CBCT images. In this paper, we first explore the origin and appearance of streak artifacts in 4D CBCT images. We find that streak artifacts exhibit a unique rotational motion along with the patient's respiration, distinguishable from diaphragm-driven respiratory motion in the spatiotemporal domain. Therefore, we propose a novel 4D neural network model, RSTAR4D-Net, designed to address Rotational STreak Artifact Reduction by integrating the spatial and temporal information within 4D CBCT images. Specifically, we overcome the computational and training difficulties of a 4D neural network. The specially designed model adopts an efficient implementation of 4D convolutions to reduce computational costs and thus can process the whole 4D image in one pass. Additionally, a Tetris training strategy pertinent to the separable 4D convolutions is proposed to effectively train the model using limited 4D training samples. Extensive experiments substantiate the effectiveness of our proposed method, and the RSTAR4D-Net shows superior performance compared to other methods. The source code and dynamic demos are available at https://github.com/ivy9092111111/RSTAR.

CVAug 22, 2025
4D Virtual Imaging Platform for Dynamic Joint Assessment via Uni-Plane X-ray and 2D-3D Registration

Hao Tang, Rongxi Yi, Lei Li et al.

Conventional computed tomography (CT) lacks the ability to capture dynamic, weight-bearing joint motion. Functional evaluation, particularly after surgical intervention, requires four-dimensional (4D) imaging, but current methods are limited by excessive radiation exposure or incomplete spatial information from 2D techniques. We propose an integrated 4D joint analysis platform that combines: (1) a dual robotic arm cone-beam CT (CBCT) system with a programmable, gantry-free trajectory optimized for upright scanning; (2) a hybrid imaging pipeline that fuses static 3D CBCT with dynamic 2D X-rays using deep learning-based preprocessing, 3D-2D projection, and iterative optimization; and (3) a clinically validated framework for quantitative kinematic assessment. In simulation studies, the method achieved sub-voxel accuracy (0.235 mm) with a 99.18 percent success rate, outperforming conventional and state-of-the-art registration approaches. Clinical evaluation further demonstrated accurate quantification of tibial plateau motion and medial-lateral variance in post-total knee arthroplasty (TKA) patients. This 4D CBCT platform enables fast, accurate, and low-dose dynamic joint imaging, offering new opportunities for biomechanical research, precision diagnostics, and personalized orthopedic care.

IVFeb 3, 2022
PARCEL: Physics-based Unsupervised Contrastive Representation Learning for Multi-coil MR Imaging

Shanshan Wang, Ruoyou Wu, Cheng Li et al.

With the successful application of deep learning to magnetic resonance (MR) imaging, parallel imaging techniques based on neural networks have attracted wide attention. However, in the absence of high-quality, fully sampled datasets for training, the performance of these methods is limited. And the interpretability of models is not strong enough. To tackle this issue, this paper proposes a Physics-bAsed unsupeRvised Contrastive rEpresentation Learning (PARCEL) method to speed up parallel MR imaging. Specifically, PARCEL has a parallel framework to contrastively learn two branches of model-based unrolling networks from augmented undersampled multi-coil k-space data. A sophisticated co-training loss with three essential components has been designed to guide the two networks in capturing the inherent features and representations for MR images. And the final MR image is reconstructed with the trained contrastive networks. PARCEL was evaluated on two vivo datasets and compared to five state-of-the-art methods. The results show that PARCEL is able to learn essential representations for accurate MR reconstruction without relying on fully sampled datasets.

IVJan 8, 2022
Expert Knowledge-guided Geometric Representation Learning for Magnetic Resonance Imaging-based Glioma Grading

Yeqi Wang, Longfei Li, Cheng Li et al.

Radiomics and deep learning have shown high popularity in automatic glioma grading. Radiomics can extract hand-crafted features that quantitatively describe the expert knowledge of glioma grades, and deep learning is powerful in extracting a large number of high-throughput features that facilitate the final classification. However, the performance of existing methods can still be improved as their complementary strengths have not been sufficiently investigated and integrated. Furthermore, lesion maps are usually needed for the final prediction at the testing phase, which is very troublesome. In this paper, we propose an expert knowledge-guided geometric representation learning (ENROL) framework . Geometric manifolds of hand-crafted features and learned features are constructed to mine the implicit relationship between deep learning and radiomics, and therefore to dig mutual consent and essential representation for the glioma grades. With a specially designed manifold discrepancy measurement, the grading model can exploit the input image data and expert knowledge more effectively in the training phase and get rid of the requirement of lesion segmentation maps at the testing phase. The proposed framework is flexible regarding deep learning architectures to be utilized. Three different architectures have been evaluated and five models have been compared, which show that our framework can always generate promising results.

MED-PHMar 20, 2019
Convolutional Sparse Coding for Compressed Sensing CT Reconstruction

Peng Bao, Wenjun Xia, Kang Yang et al.

Over the past few years, dictionary learning (DL)-based methods have been successfully used in various image reconstruction problems. However, traditional DL-based computed tomography (CT) reconstruction methods are patch-based and ignore the consistency of pixels in overlapped patches. In addition, the features learned by these methods always contain shifted versions of the same features. In recent years, convolutional sparse coding (CSC) has been developed to address these problems. In this paper, inspired by several successful applications of CSC in the field of signal processing, we explore the potential of CSC in sparse-view CT reconstruction. By directly working on the whole image, without the necessity of dividing the image into overlapped patches in DL-based methods, the proposed methods can maintain more details and avoid artifacts caused by patch aggregation. With predetermined filters, an alternating scheme is developed to optimize the objective function. Extensive experiments with simulated and real CT data were performed to validate the effectiveness of the proposed methods. Qualitative and quantitative results demonstrate that the proposed methods achieve better performance than several existing state-of-the-art methods.