Xavier Intes

CV
h-index32
8papers
82citations
Novelty38%
AI Score40

8 Papers

CVMar 17, 2022
Video-based Formative and Summative Assessment of Surgical Tasks using Deep Learning

Erim Yanik, Uwe Kruger, Xavier Intes et al.

To ensure satisfactory clinical outcomes, surgical skill assessment must be objective, time-efficient, and preferentially automated - none of which is currently achievable. Video-based assessment (VBA) is being deployed in intraoperative and simulation settings to evaluate technical skill execution. However, VBA remains manually- and time-intensive and prone to subjective interpretation and poor inter-rater reliability. Herein, we propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution based on video feeds and low-stakes formative assessment to guide surgical skill acquisition. Formative assessment is generated using heatmaps of visual features that correlate with surgical performance. Hence, the DL model paves the way to the quantitative and reproducible evaluation of surgical tasks from videos with the potential for broad dissemination in surgical training, certification, and credentialing.

CVDec 16, 2022
One-shot skill assessment in high-stakes domains with limited data via meta learning

Erim Yanik, Steven Schwaitzberg, Gene Yang et al.

Deep Learning (DL) has achieved robust competency assessment in various high-stakes fields. However, the applicability of DL models is often hampered by their substantial data requirements and confinement to specific training domains. This prevents them from transitioning to new tasks where data is scarce. Therefore, domain adaptation emerges as a critical element for the practical implementation of DL in real-world scenarios. Herein, we introduce A-VBANet, a novel meta-learning model capable of delivering domain-agnostic skill assessment via one-shot learning. Our methodology has been tested by assessing surgical skills on five laparoscopic and robotic simulators and real-life laparoscopic cholecystectomy. Our model successfully adapted with accuracies up to 99.5% in one-shot and 99.9% in few-shot settings for simulated tasks and 89.7% for laparoscopic cholecystectomy. This study marks the first instance of a domain-agnostic methodology for skill assessment in critical fields setting a precedent for the broad application of DL across diverse real-life domains with limited data.

MLDec 28, 2025
JADAI: Jointly Amortizing Adaptive Design and Bayesian Inference

Niels Bracher, Lars Kühmichel, Desi R. Ivanova et al.

We consider problems of parameter estimation where design variables can be actively optimized to maximize information gain. To this end, we introduce JADAI, a framework that jointly amortizes Bayesian adaptive design and inference by training a policy, a history network, and an inference network end-to-end. The networks minimize a generic loss that aggregates incremental reductions in posterior error along experimental sequences. Inference networks are instantiated with diffusion-based posterior estimators that can approximate high-dimensional and multimodal posteriors at every experimental step. Across standard adaptive design benchmarks, JADAI achieves superior or competitive performance.

50.2LGApr 26
Inverting Foundation Models of Brain Function with Simulation-Based Inference

Niels Bracher, Xavier Intes, Stefan T. Radev

Foundation models of brain activity promise a new frontier for in silico neuroscience by emulating neural responses to complex stimuli across tasks and modalities. A natural next step is to ask whether these models can also be used in reverse. Can we recover a stimulus or its properties from synthetic brain activity? We study this question in a proof-of-concept setting using TRIBEv2. We pair the brain emulator with large language models (LLMs) that generate news headlines from linguistic parameters such as valence, arousal, and dominance. We then use simulation-based inference to learn a probabilistic mapping from brain maps to latent stimulus parameters. Our results show that these parameters can be recovered from predicted brain maps, validating the quality of neural encodings. They also show that LLMs can serve as controllable stimulus generators for simulated experiments. Together, these findings provide a step toward decoding and inverse design with foundation brain models.

IVNov 25, 2024
Enhancing Fluorescence Lifetime Parameter Estimation Accuracy with Differential Transformer Based Deep Learning Model Incorporating Pixelwise Instrument Response Function

Ismail Erbas, Vikas Pandey, Navid Ibtehaj Nizam et al.

Fluorescence Lifetime Imaging (FLI) is a critical molecular imaging modality that provides unique information about the tissue microenvironment, which is invaluable for biomedical applications. FLI operates by acquiring and analyzing photon time-of-arrival histograms to extract quantitative parameters associated with temporal fluorescence decay. These histograms are influenced by the intrinsic properties of the fluorophore, instrument parameters, time-of-flight distributions associated with pixel-wise variations in the topographic and optical characteristics of the sample. Recent advancements in Deep Learning (DL) have enabled improved fluorescence lifetime parameter estimation. However, existing models are primarily designed for planar surface samples, limiting their applicability in translational scenarios involving complex surface profiles, such as \textit{in-vivo} whole-animal or imaged guided surgical applications. To address this limitation, we present MFliNet (Macroscopic FLI Network), a novel DL architecture that integrates the Instrument Response Function (IRF) as an additional input alongside experimental photon time-of-arrival histograms. Leveraging the capabilities of a Differential Transformer encoder-decoder architecture, MFliNet effectively focuses on critical input features, such as variations in photon time-of-arrival distributions. We evaluate MFliNet using rigorously designed tissue-mimicking phantoms and preclinical in-vivo cancer xenograft models. Our results demonstrate the model's robustness and suitability for complex macroscopic FLI applications, offering new opportunities for advanced biomedical imaging in diverse and challenging settings.

CVMay 23, 2025
EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media

Ismail Erbas, Ferhat Demirkiran, Karthik Swaminathan et al.

Fluorescence LiDAR (FLiDAR), a Light Detection and Ranging (LiDAR) technology employed for distance and depth estimation across medical, automotive, and other fields, encounters significant computational challenges in scattering media. The complex nature of the acquired FLiDAR signal, particularly in such environments, makes isolating photon time-of-flight (related to target depth) and intrinsic fluorescence lifetime exceptionally difficult, thus limiting the effectiveness of current analytical and computational methodologies. To overcome this limitation, we present a Physics-Guided Mixture-of-Experts (MoE) framework tailored for specialized modeling of diverse temporal components. In contrast to the conventional MoE approaches our expert models are informed by underlying physics, such as the radiative transport equation governing photon propagation in scattering media. Central to our approach is EvidenceMoE, which integrates Evidence-Based Dirichlet Critics (EDCs). These critic models assess the reliability of each expert's output by providing per-expert quality scores and corrective feedback. A Decider Network then leverages this information to fuse expert predictions into a robust final estimate adaptively. We validate our method using realistically simulated Fluorescence LiDAR (FLiDAR) data for non-invasive cancer cell depth detection generated from photon transport models in tissue. Our framework demonstrates strong performance, achieving a normalized root mean squared error (NRMSE) of 0.030 for depth estimation and 0.074 for fluorescence lifetime.

AIApr 16, 2024
Cognitive-Motor Integration in Assessing Bimanual Motor Skills

Erim Yanik, Xavier Intes, Suvranu De

Accurate assessment of bimanual motor skills is essential across various professions, yet, traditional methods often rely on subjective assessments or focus solely on motor actions, overlooking the integral role of cognitive processes. This study introduces a novel approach by leveraging deep neural networks (DNNs) to analyze and integrate both cognitive decision-making and motor execution. We tested this methodology by assessing laparoscopic surgery skills within the Fundamentals of Laparoscopic Surgery program, which is a prerequisite for general surgery certification. Utilizing video capture of motor actions and non-invasive functional near-infrared spectroscopy (fNIRS) for measuring neural activations, our approach precisely classifies subjects by expertise level and predicts FLS behavioral performance scores, significantly surpassing traditional single-modality assessments.

CVMar 3, 2021
Deep Neural Networks for the Assessment of Surgical Skills: A Systematic Review

Erim Yanik, Xavier Intes, Uwe Kruger et al.

Surgical training in medical school residency programs has followed the apprenticeship model. The learning and assessment process is inherently subjective and time-consuming. Thus, there is a need for objective methods to assess surgical skills. Here, we use the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to systematically survey the literature on the use of Deep Neural Networks for automated and objective surgical skill assessment, with a focus on kinematic data as putative markers of surgical competency. There is considerable recent interest in deep neural networks (DNN) due to the availability of powerful algorithms, multiple datasets, some of which are publicly available, as well as efficient computational hardware to train and host them. We have reviewed 530 papers, of which we selected 25 for this systematic review. Based on this review, we concluded that DNNs are powerful tools for automated, objective surgical skill assessment using both kinematic and video data. The field would benefit from large, publicly available, annotated datasets that are representative of the surgical trainee and expert demographics and multimodal data beyond kinematics and videos.