Generative Regression for Left Ventricular Ejection Fraction Estimation from Echocardiography Video
This work addresses the challenge of accurate and interpretable LVEF estimation for medical diagnosis, particularly in noisy or pathological cases, and argues for a shift from deterministic to probabilistic methods.
The paper tackles the problem of estimating Left Ventricular Ejection Fraction (LVEF) from echocardiography videos, which is ill-posed due to noise and ambiguity. It proposes a generative regression approach called MCSDR that models the posterior distribution of LVEF, and it reports state-of-the-art performance on the EchoNet-Dynamic, EchoNet-Pediatric, and CAMUS datasets.
Estimating Left Ventricular Ejection Fraction (LVEF) from echocardiograms constitutes an ill-posed inverse problem. Inherent noise, artifacts, and limited viewing angles introduce ambiguity, where a single video sequence may map not to a unique ground truth, but rather to a distribution of plausible physiological values. Prevailing deep learning approaches typically formulate this task as a standard regression problem that minimizes the Mean Squared Error (MSE). However, this paradigm compels the model to learn the conditional expectation, which may yield misleading predictions when the underlying posterior distribution is multimodal or heavy-tailed -- a common phenomenon in pathological scenarios. In this paper, we investigate the paradigm shift from deterministic regression toward generative regression. We propose the Multimodal Conditional Score-based Diffusion model for Regression (MCSDR), a probabilistic framework designed to model the continuous posterior distribution of LVEF conditioned on echocardiogram videos and patient demographic attribute priors. Extensive experiments conducted on the EchoNet-Dynamic, EchoNet-Pediatric, and CAMUS datasets demonstrate that MCSDR achieves state-of-the-art performance. Notably, qualitative analysis reveals that the generation trajectories of our model exhibit distinct behaviors in cases characterized by high noise or significant physiological variability, thereby offering a novel layer of interpretability for AI-aided diagnosis.
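The claim that MSE regression learns the conditional expectation, and that this can mislead when the posterior is multimodal, can be illustrated with a small numerical sketch. This is not the MCSDR model: the two-mode "posterior" over LVEF below, its parameters, and the helper name `mse_optimal_constant` are all hypothetical, chosen only to show that the MSE-optimal point estimate can fall in a region where almost no plausible values lie.

```python
import numpy as np


def mse_optimal_constant(samples, grid):
    """Return the candidate in `grid` with the lowest mean squared error
    against `samples` (a brute-force search, for illustration only)."""
    # Shape (len(grid), len(samples)): squared error of each candidate
    # prediction against each sample.
    losses = ((samples[None, :] - grid[:, None]) ** 2).mean(axis=1)
    return grid[np.argmin(losses)]


rng = np.random.default_rng(0)

# Hypothetical bimodal posterior over LVEF (%) for one ambiguous video:
# one mode near 35 (reduced function), one near 60 (normal function).
samples = np.concatenate([
    rng.normal(35.0, 2.0, 2000),
    rng.normal(60.0, 2.0, 2000),
])

# Candidate point predictions on a 0.1-point grid.
grid = np.linspace(20.0, 80.0, 601)
best = mse_optimal_constant(samples, grid)

# Fraction of plausible values lying within 5 points of the MSE-optimal
# prediction; a small value means the prediction sits between the modes.
mass_near_best = np.mean(np.abs(samples - best) < 5.0)

print(f"MSE-optimal prediction: {best:.1f}")
print(f"Sample mean:            {samples.mean():.1f}")
print(f"Mass within 5 points:   {mass_near_best:.4f}")
```

Under these assumptions the MSE-optimal prediction coincides with the sample mean (up to the grid resolution) and lands between the two modes, which is the failure case that motivates modeling the full posterior instead of a single point estimate.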