47.9 | IV | May 30
Echo-POSED: Geometric Self-Distillation for Echocardiography Guidance
Elias Stenhede, Edvart Grüner Bjerke, Joanna Sulkowska et al.
We introduce Echo-POSED, a self-supervised framework for real-time transthoracic echocardiography (TTE) guidance that recommends probe adjustments directly from 2D ultrasound images, without the need for expert-labelled views or tracked probe trajectories. Instead, it trains on 2D views sliced from routinely acquired 3D echocardiography volumes, enforcing equivariance to probe motions while remaining invariant to cardiac phase, yielding a pose representation on $\mathrm{SO}(3)\times\mathrm{SO}(3)$. Across a held-out split and public external 3D-TTE datasets (including vendor shift), Echo-POSED maintains geometric consistency under virtual perturbations and enables intra- and inter-patient guidance simulations, achieving a combined mean angular error of 8.2 degrees between the guided and target views in intra-patient simulations with cardiac motion.
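For readers unfamiliar with angular errors on rotations, the sketch below shows one common way to measure the angle between two orientations in $\mathrm{SO}(3)$: the geodesic angle of the relative rotation. This is an illustrative assumption only; the abstract does not say how the "combined" error over the two $\mathrm{SO}(3)$ factors is defined, and the function names `rotation_z` and `geodesic_angle_deg` are hypothetical.

```python
import math

def rotation_z(deg):
    """Return a 3x3 rotation matrix about the z-axis by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def geodesic_angle_deg(r_a, r_b):
    """Angle (degrees) of the relative rotation r_a^T r_b.

    Assumption: this is a generic SO(3) distance, not necessarily
    the metric used by the Echo-POSED authors.
    """
    # trace(r_a^T r_b) equals the sum of elementwise products.
    trace = sum(r_a[i][j] * r_b[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

# Two views that differ by an 8.2 degree rotation about the same axis.
guided = rotation_z(10.0)
target = rotation_z(18.2)
error_deg = geodesic_angle_deg(guided, target)
```

Averaging such per-pair angles over many guided and target views would give a mean angular error in degrees.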
35.3 | CV | May 6 | Code
EchoXFlow: A Beamspace Echocardiography Dataset for Cardiac Motion, Flow, and Function
Elias Stenhede, Joanna Sulkowska, Eivind Bjørkan Orstad et al.
We introduce EchoXFlow, a clinical echocardiography dataset for learning from ultrasound in its native acquisition geometry rather than from scan-converted Cartesian videos. Existing public datasets offer limited opportunities to study cross-modal relationships between cardiac anatomy, myocardial motion, and blood flow, as Doppler is typically absent or fused as RGB overlays, and acquisitions are released after lossy vendor display processing. EchoXFlow comprises 37,125 recordings from 666 routine-care examinations, preserving the timing, geometry, and modality relationships needed for physically grounded echo learning. Each recording is retained as separable modality-specific streams: temporally resolved 1D, 2D, and 3D data alongside multiple Doppler modalities, paired with a synchronized ECG. Clinical annotations range from guideline-based measurements to dense 2D myocardial contours and 3D left-ventricular endocardial meshes. With its associated open-source tooling, EchoXFlow enables cross-modal, acquisition-aware learning tasks that cannot be formulated from conventional scan-converted videos alone, and serves as a testbed for 4D vision and physically grounded multi-modal learning more broadly.
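To make the distinction between beamspace and scan-converted data concrete, here is a minimal sketch of nearest-neighbour scan conversion for an idealised 2D sector geometry. It is not EchoXFlow's tooling: the function names, the single-apex sector model, and the array layout (`beams[i][j]` holding the sample at `angles[i]`, `depths[j]`) are all assumptions made for illustration.

```python
import math

def beam_to_cartesian(depth_m, angle_rad):
    """Map a beamspace sample (depth along beam, steering angle) to (x, z)."""
    return depth_m * math.sin(angle_rad), depth_m * math.cos(angle_rad)

def cartesian_to_beam(x_m, z_m):
    """Inverse map: Cartesian (x, z) to (depth, steering angle)."""
    return math.hypot(x_m, z_m), math.atan2(x_m, z_m)

def nearest_index(values, target):
    """Index of the entry in `values` closest to `target`."""
    return min(range(len(values)), key=lambda k: abs(values[k] - target))

def scan_convert(beams, depths, angles, xs, zs, fill=0.0):
    """Resample beamspace data onto a Cartesian grid (nearest neighbour).

    Assumes `depths` and `angles` are sorted in ascending order.
    Pixels outside the imaged sector are set to `fill`.
    """
    image = []
    for z in zs:
        row = []
        for x in xs:
            r, th = cartesian_to_beam(x, z)
            inside = depths[0] <= r <= depths[-1] and angles[0] <= th <= angles[-1]
            if inside:
                row.append(beams[nearest_index(angles, th)][nearest_index(depths, r)])
            else:
                row.append(fill)
        image.append(row)
    return image

# Toy sector: 3 beams, 2 depth samples per beam.
angles = [-math.pi / 4, 0.0, math.pi / 4]
depths = [0.05, 0.10]
beams = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
image = scan_convert(beams, depths, angles, xs=[0.0, 0.2], zs=[0.10])
```

The resampling step discards the per-beam sampling grid, which is one reason a dataset kept in acquisition geometry retains information that a scan-converted video does not.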