Zongyu Li

h-index3

3papers

13citations

Novelty47%

AI Score33

Ranked #121,617 of 194,257 authors (top 63%)#1,907 in IV (top 43%)

3 Papers

10.3IVSep 12, 2024Code

MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation

Aaron Cao, Zongyu Li, Jordan Jomsky et al.

Widely used traditional pipelines for subcortical brain segmentation are often inefficient and slow, particularly when processing large datasets. Furthermore, deep learning models face challenges due to the high resolution of MRI images and the large number of anatomical classes involved. To address these limitations, we developed a 3D patch-based hybrid CNN-Mamba model that leverages Mamba's selective scan algorithm, thereby enhancing segmentation accuracy and efficiency for 3D inputs. This retrospective study utilized 1784 T1-weighted MRI scans from a diverse, multi-site dataset of healthy individuals. The dataset was divided into training, validation, and testing sets with a 1076/345/363 split. The scans were obtained from 1.5T and 3T MRI machines. Our model's performance was validated against several benchmarks, including other CNN-Mamba, CNN-Transformer, and pure CNN networks, using FreeSurfer-generated ground truths. We employed the Dice Similarity Coefficient (DSC), Volume Similarity (VS), and Average Symmetric Surface Distance (ASSD) as evaluation metrics. Statistical significance was determined using the Wilcoxon signed-rank test with a threshold of P < 0.05. The proposed model achieved the highest overall performance across all metrics (DSC 0.88383; VS 0.97076; ASSD 0.33604), significantly outperforming all non-Mamba-based models (P < 0.001). While the model did not show significant improvement in DSC or VS compared to another Mamba-based model (P-values of 0.114 and 0.425), it demonstrated a significant enhancement in ASSD (P < 0.001) with approximately 20% fewer parameters. In conclusion, our proposed hybrid CNN-Mamba architecture offers an efficient and accurate approach for 3D subcortical brain segmentation, demonstrating potential advantages over existing methods. Code is available at: https://github.com/aaroncao06/MedSegMamba.

3.9CVFeb 28, 2023Code

Towards Surgical Context Inference and Translation to Gestures

Kay Hutchinson, Zongyu Li, Ian Reyes et al.

Manual labeling of gestures in robot-assisted surgery is labor intensive, prone to errors, and requires expertise or training. We propose a method for automated and explainable generation of gesture transcripts that leverages the abundance of data for image segmentation. Surgical context is detected using segmentation masks by examining the distances and intersections between the tools and objects. Next, context labels are translated into gesture transcripts using knowledge-based Finite State Machine (FSM) and data-driven Long Short Term Memory (LSTM) models. We evaluate the performance of each stage of our method by comparing the results with the ground truth segmentation masks, the consensus context labels, and the gesture labels in the JIGSAWS dataset. Our results show that our segmentation models achieve state-of-the-art performance in recognizing needle and thread in Suturing and we can automatically detect important surgical states with high agreement with crowd-sourced labels (e.g., contact between graspers and objects in Suturing). We also find that the FSM models are more robust to poor segmentation and labeling performance than LSTMs. Our proposed method can significantly shorten the gesture labeling process (~2.8 times).

3.6IVDec 1, 2024

Enhancing Brain Age Estimation with a Multimodal 3D CNN Approach Combining Structural MRI and AI-Synthesized Cerebral Blood Volume Data

Jordan Jomsky, Zongyu Li, Yiren Zhang et al.

The increasing global aging population necessitates improved methods to assess brain aging and its related neurodegenerative changes. Brain Age Gap Estimation (BrainAGE) offers a neuroimaging biomarker for understanding these changes by predicting brain age from MRI scans. Current approaches primarily use T1-weighted magnetic resonance imaging (T1w MRI) data, capturing only structural brain information. To address this limitation, AI-generated Cerebral Blood Volume (AICBV) data, synthesized from non-contrast MRI scans, offers functional insights by revealing subtle blood-tissue contrasts otherwise undetectable in standard imaging. We integrated AICBV with T1w MRI to predict brain age, combining both structural and functional metrics. We developed a deep learning model using a VGG-based architecture for both modalities and combined their predictions using linear regression. Our model achieved a mean absolute error (MAE) of 3.95 years and an $R^2$ of 0.943 on the test set ($n = 288$), outperforming existing models trained on similar data. We have further created gradient-based class activation maps (Grad-CAM) to visualize the regions of the brain that most influenced the model's predictions, providing interpretable insights into the structural and functional contributors to brain aging.