21.2IVMay 8Code
A Paired Point-of-Care Ultrasound Dataset for Image Quality Enhancement and Benchmarking via a cGAN BaselineLennard M. van Karnenbeek, Hilde G. A. van der Pol, Mark Wijkhuizen et al.
Purpose: We aim to enhance the image quality of point-of-care ultrasound (POCUS) devices using deep learning and a novel paired dataset of POCUS and high-end ultrasound images. Approach: We collected the first accurately paired dataset using a custom-built automated gantry system of low-end POCUS and high-end ultrasound images. A conditional generative adversarial network (cGAN) was utilized based on the pix2pix architecture, with a U-Net generator that incorporates both L1 and structural similarity index (SSIM) losses to improve perceptual quality. Pretraining on a simulation dataset further boosts performance. Evaluation was performed on 1064 paired ex vivo tissue and phantom ultrasound image sets. Results: Our approach improves the SSIM from 0.29 to 0.54 and PSNR from 19.16 dB to 22.41 dB. No-reference metrics also indicate substantial enhancement, with the Natural Image Quality Evaluator (NIQE) and Perception-based Image Quality Evaluator (PIQE) scores dropping from 7.95 to 4.44 and 31.12 to 19.99, respectively. Conclusions: This work presents the first publicly available accurately paired dataset of low-end POCUS to high end ultrasound images. Additionally, our results demonstrate the potential of the proposed framework to overcome hardware limitations of handheld POCUS, enhancing its diagnostic value in low-resource and point-of-care settings. The POCUS-IQ Dataset is publicly available at https://github.com/NKI-MedTech-AI/POCUS-IQ.
CVAug 15, 2025
Unifying Scale-Aware Depth Prediction and Perceptual Priors for Monocular Endoscope Pose Estimation and Tissue ReconstructionMuzammil Khan, Enzo Kerkhof, Matteo Fusaglia et al.
Accurate endoscope pose estimation and 3D tissue surface reconstruction significantly enhances monocular minimally invasive surgical procedures by enabling accurate navigation and improved spatial awareness. However, monocular endoscope pose estimation and tissue reconstruction face persistent challenges, including depth ambiguity, physiological tissue deformation, inconsistent endoscope motion, limited texture fidelity, and a restricted field of view. To overcome these limitations, a unified framework for monocular endoscopic tissue reconstruction that integrates scale-aware depth prediction with temporally-constrained perceptual refinement is presented. This framework incorporates a novel MAPIS-Depth module, which leverages Depth Pro for robust initialisation and Depth Anything for efficient per-frame depth prediction, in conjunction with L-BFGS-B optimisation, to generate pseudo-metric depth estimates. These estimates are temporally refined by computing pixel correspondences using RAFT and adaptively blending flow-warped frames based on LPIPS perceptual similarity, thereby reducing artefacts arising from physiological tissue deformation and motion. To ensure accurate registration of the synthesised pseudo-RGBD frames from MAPIS-Depth, a novel WEMA-RTDL module is integrated, optimising both rotation and translation. Finally, truncated signed distance function-based volumetric fusion and marching cubes are applied to extract a comprehensive 3D surface mesh. Evaluations on HEVD and SCARED, with ablation and comparative analyses, demonstrate the framework's robustness and superiority over state-of-the-art methods.