CV · IV · Nov 1, 2025

Challenging DINOv3 Foundation Model under Low Inter-Class Variability: A Case Study on Fetal Brain Ultrasound

Edoardo Conti, Riccardo Rosati, Lorenzo Federici, Adriano Mancini, Maria Chiara Fiorentin

arXiv:2511.01915v1 · 1 citation · h-index: 3

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of reliable biometric assessment in fetal brain ultrasound imaging, where subtle anatomical differences are critical for clinical diagnosis. The contribution is incremental, since it applies an existing method to a specific domain problem.

This study evaluated the DINOv3 foundation model's ability to distinguish anatomically similar fetal brain ultrasound planes (transthalamic, transventricular, transcerebellar) under low inter-class variability, finding that domain-specific pretraining on fetal ultrasound data improved weighted F1-scores by up to 20% compared to natural-image initialization.

Purpose: This study provides the first comprehensive evaluation of foundation models in fetal ultrasound (US) imaging under low inter-class variability conditions. While recent vision foundation models such as DINOv3 have shown remarkable transferability across medical domains, their ability to discriminate anatomically similar structures has not been systematically investigated. We address this gap by focusing on fetal brain standard planes, namely transthalamic (TT), transventricular (TV), and transcerebellar (TC), which exhibit highly overlapping anatomical features and pose a critical challenge for reliable biometric assessment.

Methods: To ensure a fair and reproducible evaluation, all publicly available fetal ultrasound datasets were curated and aggregated into a unified multicenter benchmark, FetalUS-188K, comprising more than 188,000 annotated images from heterogeneous acquisition settings. DINOv3 was pretrained in a self-supervised manner to learn ultrasound-aware representations. The learned features were then evaluated through standardized adaptation protocols, including linear probing with frozen backbone and full fine-tuning, under two initialization schemes: (i) pretraining on FetalUS-188K and (ii) initialization from natural-image DINOv3 weights.

Results: Models pretrained on fetal ultrasound data consistently outperformed those initialized on natural images, with weighted F1-score improvements of up to 20 percent. Domain-adaptive pretraining enabled the network to preserve subtle echogenic and structural cues crucial for distinguishing intermediate planes such as TV.

Conclusion: Results demonstrate that generic foundation models fail to generalize under low inter-class variability, whereas domain-specific pretraining is essential to achieve robust and clinically reliable representations in fetal brain ultrasound imaging.
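The results above are reported as weighted F1-scores. As a minimal sketch of how that metric is commonly defined (per-class F1 averaged with weights proportional to each class's number of true samples), the snippet below computes it for the three plane labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Assumption: this follows the common "weighted" averaging convention,
    where each class's F1 is weighted by its share of the true labels.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))

    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += (count / total) * f1
    return score


# Hypothetical toy labels for the three planes (not from the paper).
y_true = ["TT", "TT", "TV", "TV", "TC", "TC"]
y_pred = ["TT", "TV", "TV", "TV", "TC", "TT"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, a model that confuses a small intermediate class such as TV is penalized in proportion to how often that class occurs, which is worth keeping in mind when comparing weighted scores across datasets with different class balances.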
