IVMay 30
Echo-POSED: Geometric Self-Distillation for Echocardiography GuidanceElias Stenhede, Edvart Grüner Bjerke, Joanna Sulkowska et al.
We introduce Echo-POSED, a self-supervised framework for real-time transthoracic echocardiography (TTE) guidance that recommends probe adjustments directly from 2D ultrasound images, without the need for expert-labelled views or tracked probe trajectories. Instead, it trains on 2D views sliced from routinely acquired 3D echocardiography volumes, enforcing equivariance to probe motions while remaining invariant to cardiac phase, yielding a pose representation on $\mathrm{SO}(3)\times\mathrm{SO}(3)$. Across a held-out split and public external 3D--TTE datasets (including vendor shift), Echo-POSED maintains geometric consistency under virtual perturbations and enables intra- and inter-patient guidance simulations, achieving a combined mean angular error of 8.2 degrees between the guided and target views in intra-patient simulations with cardiac motion.
CVApr 10Code
Biomarker-Based Pretraining for Chagas Disease Screening in ElectrocardiogramsElias Stenhede, Arian Ranjbar
Chagas disease screening via ECGs is limited by scarce and noisy labels in existing datasets. We propose a biomarker-based pretraining approach, where an ECG feature extractor is first trained to predict percentile-binned blood biomarkers from the MIMIC-IV-ECG dataset. The pretrained model is then fine-tuned on Brazilian datasets for Chagas detection. Our 5-model ensemble, developed by the Ahus AIM team, achieved a challenge score of 0.269 on the hidden test set, ranking 5th in Detection of Chagas Disease from the ECG: The George B. Moody PhysioNet Challenge 2025. Source code and the model are shared on GitHub: github.com/Ahus-AIM/physionet-challenge-2025
AIMay 23
Associations between echocardiographic traits and AI-ECG predictions of heart failureElias Stenhede, Eivind Bjørkan Orstad, Torbjørn Omland et al.
Artificial intelligence-enabled electrocardiography (AI-ECG) can detect heart failure (HF), including disease not captured by left ventricular ejection fraction (LVEF), but the cardiac phenotypes underlying model predictions remain unclear. We therefore investigated whether AI-ECG-predicted HF risk aligns with established echocardiographic measures of myocardial dysfunction, remodelling, and filling pressures. We retrospectively analysed ECG and echocardiography data from 8147 patients who underwent both examinations within three days at Akershus University Hospital between 1 January 2023 and 1 June 2025. A previously validated AI-ECG model for HF detection was applied to all ECGs. Spearman's rank correlation $ρ$ quantified associations between echocardiographic parameters and AI-ECG risk. Subgroup analyses were performed by sex and left ventricular ejection fraction (LVEF). External validation included 36,286 ECG-echocardiography pairs from Columbia University Irving Medical Center. Global longitudinal strain (GLS) showed the strongest correlation ($ρ$=0.57), followed by mitral annular plane systolic excursion (MAPSE) ($ρ$=-0.49) and LVEF ($ρ$=-0.45). In patients with LVEF>50%, correlations remained substantial for GLS, MAPSE, and diastolic-related parameters. Volumetric left ventricular indices correlated less strongly in women, whereas diastolic indices showed stronger correlations in women than in men. Physiological validation showed that AI-ECG HF risk predictions align primarily with measures of systolic function, particularly global longitudinal strain, while also capturing diastolic-related abnormalities in patients with preserved LVEF. This approach may improve clinical interpretability and identify opportunities for model refinement.
CVMay 6Code
EchoXFlow: A Beamspace Echocardiography Dataset for Cardiac Motion, Flow, and FunctionElias Stenhede, Joanna Sulkowska, Eivind Bjørkan Orstad et al.
We introduce EchoXFlow, a clinical echocardiography dataset for learning from ultrasound in its native acquisition geometry rather than from scan-converted Cartesian videos. Existing public datasets offer limited opportunities to study cross-modal relationships between cardiac anatomy, myocardial motion, and blood flow, as Doppler is typically absent or fused as RGB overlays, and acquisitions are released after lossy vendor display processing. EchoXFlow comprises 37125 recordings from 666 routine-care examinations, preserving the timing, geometry, and modality relationships needed for physically grounded echo learning. Each recording is retained as separable modality-specific streams: temporally resolved 1D, 2D, and 3D data alongside multiple Doppler modalities, paired with a synchronized ECG. Clinical annotations span guideline-based measurements to dense 2D myocardial contours and 3D left-ventricular endocardial meshes. With its associated open-source tooling, EchoXFlow enables cross-modal, acquisition-aware learning tasks that cannot be formulated from conventional scan-converted videos alone, and serves as a testbed for 4D vision and physically grounded multi-modal learning more broadly.
CVNov 3, 2025
Lite ENSAM: a lightweight cancer segmentation model for 3D Computed TomographyAgnar Martin Bjørnstad, Elias Stenhede, Arian Ranjbar
Accurate tumor size measurement is a cornerstone of evaluating cancer treatment response. The most widely adopted standard for this purpose is the Response Evaluation Criteria in Solid Tumors (RECIST) v1.1, which relies on measuring the longest tumor diameter in a single plane. However, volumetric measurements have been shown to provide a more reliable assessment of treatment effect. Their clinical adoption has been limited, though, due to the labor-intensive nature of manual volumetric annotation. In this paper, we present Lite ENSAM, a lightweight adaptation of the ENSAM architecture designed for efficient volumetric tumor segmentation from CT scans annotated with RECIST annotations. Lite ENSAM was submitted to the MICCAI FLARE 2025 Task 1: Pan-cancer Segmentation in CT Scans, Subtask 2, where it achieved a Dice Similarity Coefficient (DSC) of 60.7% and a Normalized Surface Dice (NSD) of 63.6% on the hidden test set, and an average total RAM time of 50.6 GBs and an average inference time of 14.4 s on CPU on the public validation dataset.
CVOct 22, 2025Code
Digitizing Paper ECGs at Scale: An Open-Source Algorithm for Clinical ResearchElias Stenhede, Agnar Martin Bjørnstad, Arian Ranjbar
Millions of clinical ECGs exist only as paper scans, making them unusable for modern automated diagnostics. We introduce a fully automated, modular framework that converts scanned or photographed ECGs into digital signals, suitable for both clinical and research applications. The framework is validated on 37,191 ECG images with 1,596 collected at Akershus University Hospital, where the algorithm obtains a mean signal-to-noise ratio of 19.65 dB on scanned papers with common artifacts. It is further evaluated on the Emory Paper Digitization ECG Dataset, comprising 35,595 images, including images with perspective distortion, wrinkles, and stains. The model improves on the state-of-the-art in all subcategories. The full software is released as open-source, promoting reproducibility and further development. We hope the software will contribute to unlocking retrospective ECG archives and democratize access to AI-driven diagnostics.
CVSep 19, 2025
ENSAM: an efficient foundation model for interactive segmentation of 3D medical imagesElias Stenhede, Agnar Martin Bjørnstad, Arian Ranjbar
We present ENSAM (Equivariant, Normalized, Segment Anything Model), a lightweight and promptable model for universal 3D medical image segmentation. ENSAM combines a SegResNet-based encoder with a prompt encoder and mask decoder in a U-Net-style architecture, using latent cross-attention, relative positional encoding, normalized attention, and the Muon optimizer for training. ENSAM is designed to achieve good performance under limited data and computational budgets, and is trained from scratch on under 5,000 volumes from multiple modalities (CT, MRI, PET, ultrasound, microscopy) on a single 32 GB GPU in 6 hours. As part of the CVPR 2025 Foundation Models for Interactive 3D Biomedical Image Segmentation Challenge, ENSAM was evaluated on hidden test set with multimodal 3D medical images, obtaining a DSC AUC of 2.404, NSD AUC of 2.266, final DSC of 0.627, and final NSD of 0.597, outperforming two previously published baseline models (VISTA3D, SAM-Med3D) and matching the third (SegVol), surpassing its performance in final DSC but trailing behind in the other three metrics. In the coreset track of the challenge, ENSAM ranks 5th of 10 overall and best among the approaches not utilizing pretrained weights. Ablation studies confirm that our use of relative positional encodings and the Muon optimizer each substantially speed up convergence and improve segmentation quality.