Wouter Willaert

CV
h-index: 27
5 papers
6 citations
Novelty: 31%
AI Score: 25

5 Papers

CV · Oct 18, 2023
Towards Abdominal 3-D Scene Rendering from Laparoscopy Surgical Videos using NeRFs

Khoa Tuan Nguyen, Francesca Tozzi, Nikdokht Rashidian et al.

Given that a conventional laparoscope only provides a two-dimensional (2-D) view, the detection and diagnosis of medical ailments can be challenging. To overcome the visual constraints associated with laparoscopy, the use of laparoscopic images and videos to reconstruct the three-dimensional (3-D) anatomical structure of the abdomen has proven to be a promising approach. Neural Radiance Fields (NeRFs) have recently gained attention thanks to their ability to generate photorealistic images from a 3-D static scene, thus facilitating a more comprehensive exploration of the abdomen through the synthesis of new views. This distinguishes NeRFs from alternative methods such as Simultaneous Localization and Mapping (SLAM) and depth estimation. In this paper, we present a comprehensive examination of NeRFs in the context of laparoscopy surgical videos, with the goal of rendering abdominal scenes in 3-D. Although our experimental results are promising, the proposed approach encounters substantial challenges, which require further exploration in future research.

CV · Apr 28, 2025
Boosting 3D Liver Shape Datasets with Diffusion Models and Implicit Neural Representations

Khoa Tuan Nguyen, Francesca Tozzi, Wouter Willaert et al.

While the availability of open 3D medical shape datasets is increasing, offering substantial benefits to the research community, we have found that many of these datasets are, unfortunately, disorganized and contain artifacts. These issues limit the development and training of robust models, particularly for accurate 3D reconstruction tasks. In this paper, we examine the current state of available 3D liver shape datasets and propose a solution using diffusion models combined with implicit neural representations (INRs) to augment and expand existing datasets. Our approach utilizes the generative capabilities of diffusion models to create realistic, diverse 3D liver shapes, capturing a wide range of anatomical variations and addressing the problem of data scarcity. Experimental results indicate that our method enhances dataset diversity, providing a scalable solution to improve the accuracy and reliability of 3D liver reconstruction and generation in medical applications. Finally, we suggest that diffusion models can also be applied to other downstream tasks in 3D medical imaging.

CV · Feb 28, 2025
Revisiting the Evaluation Bias Introduced by Frame Sampling Strategies in Surgical Video Segmentation Using SAM2

Utku Ozbulak, Seyed Amir Mousavi, Francesca Tozzi et al.

Real-time video segmentation is a promising opportunity for AI-assisted surgery, offering intraoperative guidance by identifying tools and anatomical structures. Despite growing interest in surgical video segmentation, annotation protocols vary widely across datasets -- some provide dense, frame-by-frame labels, while others rely on sparse annotations sampled at low frame rates such as 1 FPS. In this study, we investigate how such inconsistencies in annotation density and frame rate sampling influence the evaluation of zero-shot segmentation models, using SAM2 as a case study for cholecystectomy procedures. Surprisingly, we find that under conventional sparse evaluation settings, lower frame rates can appear to outperform higher ones due to a smoothing effect that conceals temporal inconsistencies. However, when assessed under real-time streaming conditions, higher frame rates yield superior segmentation stability, particularly for dynamic objects like surgical graspers. To understand how these differences align with human perception, we conducted a survey among surgeons, nurses, and machine learning engineers and found that participants consistently preferred high-FPS segmentation overlays, reinforcing the importance of evaluating every frame in real-time applications rather than relying on sparse sampling strategies. Our findings highlight the risk of evaluation bias that is introduced by inconsistent dataset protocols and bring attention to the need for temporally fair benchmarking in surgical video AI.
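
The smoothing effect described in this abstract can be illustrated with a toy calculation. The sketch below is not the paper's evaluation code: the per-frame IoU values are invented, and the helper names `mean_score` and `worst_frame_drop` are hypothetical. It only shows how scoring one annotated frame per second can miss brief mask failures that a frame-by-frame score would capture.

```python
def mean_score(scores, stride=1):
    """Average per-frame scores, keeping only every `stride`-th frame."""
    sampled = scores[::stride]
    return sum(sampled) / len(sampled)


def worst_frame_drop(scores):
    """Largest decrease in score between two consecutive frames."""
    return max((a - b for a, b in zip(scores, scores[1:])), default=0.0)


# Hypothetical per-frame IoU for one second of 30 FPS video:
# the mask is stable except for three frames where it briefly collapses.
per_frame_iou = [0.9] * 30
for i in (7, 8, 19):
    per_frame_iou[i] = 0.2

dense = mean_score(per_frame_iou)              # every frame is scored
sparse = mean_score(per_frame_iou, stride=30)  # one annotated frame per second
print(f"dense mean IoU:  {dense:.3f}")
print(f"sparse mean IoU: {sparse:.3f}")
print(f"largest frame-to-frame drop: {worst_frame_drop(per_frame_iou):.3f}")
```

In this made-up example the sparse score lands only on a stable frame, so it comes out higher than the dense score and the frame-to-frame collapses are invisible to it.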

LG · Aug 4, 2025
Toward Using Machine Learning as a Shape Quality Metric for Liver Point Cloud Generation

Khoa Tuan Nguyen, Gaeun Oh, Ho-min Park et al.

While 3D medical shape generative models such as diffusion models have shown promise in synthesizing diverse and anatomically plausible structures, the absence of ground truth makes quality evaluation challenging. Existing evaluation metrics commonly measure distributional distances between training and generated sets, while the medical field requires assessing quality at the individual level for each generated shape, which demands labor-intensive expert review. In this paper, we investigate the use of classical machine learning (ML) methods and PointNet as an alternative, interpretable approach for assessing the quality of generated liver shapes. We sample point clouds from the surfaces of the generated liver shapes, extract handcrafted geometric features, and train a group of supervised ML and PointNet models to classify liver shapes as good or bad. These trained models are then used as proxy discriminators to assess the quality of synthetic liver shapes produced by generative models. Our results show that ML-based shape classifiers provide not only interpretable feedback but also complementary insights compared to expert evaluation. This suggests that ML classifiers can serve as lightweight, task-relevant quality metrics in 3D organ shape generation, supporting more transparent and clinically aligned evaluation protocols in medical shape modeling.
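
As a rough illustration of the "handcrafted geometric features" step, the sketch below computes a few simple descriptors from a point cloud using only the Python standard library. The abstract does not list the features actually used, so the specific descriptors here (bounding-box extents, elongation, radius statistics) and the function name `geometric_features` are assumptions chosen for illustration. A feature dictionary like this could then be fed to a supervised classifier, which is not shown.

```python
import math


def geometric_features(points):
    """Return a few simple shape descriptors for a list of (x, y, z) points."""
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    # Axis-aligned bounding-box size along x, y and z.
    extents = tuple(
        max(p[k] for p in points) - min(p[k] for p in points) for k in range(3)
    )
    # Distance of every point to the centroid.
    radii = [math.dist(p, centroid) for p in points]
    mean_radius = sum(radii) / n
    return {
        "extent_x": extents[0],
        "extent_y": extents[1],
        "extent_z": extents[2],
        "elongation": max(extents) / min(extents) if min(extents) > 0 else math.inf,
        "mean_radius": mean_radius,
        "radius_std": math.sqrt(sum((r - mean_radius) ** 2 for r in radii) / n),
    }


# Toy point cloud: the eight corners of a 2 x 1 x 1 box (not real liver data).
box = [(x, y, z) for x in (0.0, 2.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
features = geometric_features(box)
print(features)
```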

CV · Mar 4, 2025
One Patient's Annotation is Another One's Initialization: Towards Zero-Shot Surgical Video Segmentation with Cross-Patient Initialization

Seyed Amir Mousavi, Utku Ozbulak, Francesca Tozzi et al.

Video object segmentation is an emerging technology that is well-suited for real-time surgical video segmentation, offering valuable clinical assistance in the operating room by ensuring consistent frame tracking. However, its adoption is limited by the need for manual intervention to select the tracked object, making it impractical in surgical settings. In this work, we tackle this challenge with an innovative solution: using previously annotated frames from other patients as the tracking frames. We find that this unconventional approach can match or even surpass the performance of using patients' own tracking frames, enabling more autonomous and efficient AI-assisted surgical workflows. Furthermore, we analyze the benefits and limitations of this approach, highlighting its potential to enhance segmentation accuracy while reducing the need for manual input. Our findings provide insights into key factors influencing performance, offering a foundation for future research on optimizing cross-patient frame selection for real-time surgical video analysis.