Zaiyang Guo

CV
h-index: 6
3 papers
5 citations
Novelty: 33%
AI Score: 36

3 Papers

43.5 · CV · May 22
EchoVQA: Enabling Conversational Assistance for Point-of-Care Cardiac Ultrasound

Filippos Bellos, Yutong Li, Jessie N Dong et al.

Point-of-care transthoracic echocardiography (TTE) enables cardiac assessment in virtually any clinical setting, yet its diagnostic utility remains constrained by the expertise required for image acquisition and interpretation. Visual question answering (VQA) offers a promising paradigm for bridging this expertise gap through interactive clinical assistance, but existing echocardiography VQA datasets are limited in scale, restricted to high-quality images, and cover only a few views. We introduce EchoVQA, the first large-scale VQA dataset for echocardiography, comprising 14,299 images and 74,819 question-answer pairs. The dataset integrates public sources (EchoNet-Dynamic, CAMUS) with our own point-of-care acquisitions from two handheld probes (Lumify, Clarius), spanning diverse views and including both high-quality and suboptimal images. Uniquely, EchoVQA includes acquisition guidance questions to help users optimize transducer positioning toward a diagnostic apical 4-chamber view for left ventricular ejection fraction estimation, a challenging task for novice operators in point-of-care settings. We further develop a parameter-efficient method based on multimodal learnable prompts that achieves state-of-the-art performance on most benchmarks, including EchoVQA, with significantly fewer trainable parameters than existing state-of-the-art approaches.

8.6 · CV · Mar 28
Follow Your Heart: Landmark-Guided Transducer Pose Scoring for Point-of-Care Echocardiography

Zaiyang Guo, Jessie N. Dong, Filippos Bellos et al.

Point-of-care transthoracic echocardiography (TTE) makes it possible to assess a patient's cardiac function in almost any setting. A critical step in the TTE exam is acquisition of the apical 4-chamber (A4CH) view, which is used to evaluate clinically impactful measurements such as left ventricular ejection fraction (LVEF). However, optimizing transducer pose for high-quality image acquisition and subsequent measurement is a challenging task, particularly for novice users. In this work, we present a multi-task network that provides feedback cues for A4CH view acquisition and automatically estimates LVEF in high-quality A4CH images. The network cascades a transducer pose scoring module and an uncertainty-aware LV landmark detector with automated LVEF estimation. One strength of the approach is that network training and inference do not require cumbersome or costly setups for transducer position tracking. We evaluate performance on point-of-care TTE data acquired with a spatially dense "sweep" protocol around the optimal A4CH view. The results demonstrate the network's ability to determine when the transducer pose is on target, close to target, or far from target based on the images alone, while generating visual landmark cues that guide anatomical interpretation and orientation. In conclusion, we demonstrate a promising strategy to provide guidance for A4CH view acquisition, which may be useful when deploying point-of-care TTE in resource-limited settings.
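
For readers unfamiliar with LVEF, the sketch below shows only the standard definition of ejection fraction as the percentage of end-diastolic volume ejected per beat. It is not the network's estimation method, which the abstract does not detail; the function name and the example volumes are hypothetical.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Return ejection fraction in percent from LV volumes in millilitres.

    edv_ml: end-diastolic volume (ventricle at its fullest).
    esv_ml: end-systolic volume (ventricle at its emptiest).
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    # Stroke volume (EDV - ESV) expressed as a fraction of EDV.
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Hypothetical volumes chosen only to illustrate the arithmetic.
lvef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
print(f"LVEF = {lvef:.1f}%")
```

In practice the volumes themselves would have to be derived from the image, for example from detected LV landmarks, which is the harder part of the problem.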

IV · Feb 13, 2025
Towards Patient-Specific Surgical Planning for Bicuspid Aortic Valve Repair: Fully Automated Segmentation of the Aortic Valve in 4D CT

Zaiyang Guo, Ningjun J Dong, Harold Litt et al.

The bicuspid aortic valve (BAV) is the most prevalent congenital heart defect and may require surgery for complications such as stenosis, regurgitation, and aortopathy. BAV repair surgery is effective but challenging due to the heterogeneity of BAV morphology. Multiple imaging modalities can be employed to assist the quantitative assessment of BAVs for surgical planning. Contrast-enhanced 4D computed tomography (CT) produces volumetric temporal sequences with excellent contrast and spatial resolution. Segmentation of the aortic cusps and root in these images is an essential step in creating patient-specific models for visualization and quantification. While deep learning-based methods are capable of fully automated segmentation, no BAV-specific model exists. Among valve segmentation studies, there has been limited quantitative assessment of the clinical usability of the segmentation results. In this work, we developed a fully automated multi-label BAV segmentation pipeline based on nnU-Net. The predicted segmentations were used to carry out surgically relevant morphological measurements, including geometric cusp height, commissural angle, and annulus diameter, and the results were compared against manual segmentation. Automated segmentation achieved average Dice scores of over 0.7 and symmetric mean distance below 0.7 mm for all three aortic cusps and the root wall. Clinically relevant benchmarks showed good consistency between manual and predicted segmentations. Overall, fully automated BAV segmentation of 3D frames in 4D CT can produce clinically usable measurements for surgical risk stratification, but the temporal consistency of segmentations needs to be improved.
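
As a reminder of what the reported Dice scores measure, here is a minimal sketch of the Dice overlap between a predicted and a manual binary mask. It assumes masks flattened to sequences of 0/1 values; the function name and toy masks are hypothetical, and this is not the evaluation code used in the paper.

```python
def dice_score(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as perfect agreement by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 4 foreground voxels each, 3 of them shared.
manual = [0, 1, 1, 1, 0, 0, 1, 0]
predicted = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_score(predicted, manual))  # 2 * 3 / (4 + 4) = 0.75
```

For a multi-label segmentation such as three cusps plus the root wall, the same computation would be applied per label and then averaged.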