CV AIApr 1

Improving Generalization of Deep Learning for Brain Metastases Segmentation Across Institutions

Yuchen Yang, Shuangyang Zhong, Haijun Yu, Langcuomu Suo, Hongbin Han, Florian Putz, Yixing Huang

arXiv:2604.003975.6h-index: 2

AI Analysis

This addresses a critical bottleneck for clinical deployment of AI-assisted segmentation in medical imaging by enabling cross-institutional generalization without target-domain labels, though it is incremental as it builds on existing domain adaptation and segmentation methods.

The paper tackled the problem of poor generalization of deep learning models for brain metastases segmentation across institutions due to data heterogeneity, and the proposed VAE-MMD domain adaptation framework improved mean F1 by 11.1%, mean surface Dice by 7.93%, and reduced mean Hausdorff distance by 65.5% compared to baseline.

Background: Deep learning has demonstrated significant potential for automated brain metastases (BM) segmentation; however, models trained at a singular institution often exhibit suboptimal performance at various sites due to disparities in scanner hardware, imaging protocols, and patient demographics. The goal of this work is to create a domain adaptation framework that will allow for BM segmentation to be used across multiple institutions. Methods: We propose a VAE-MMD preprocessing pipeline that combines variational autoencoders (VAE) with maximum mean discrepancy (MMD) loss, incorporating skip connections and self-attention mechanisms alongside nnU-Net segmentation. The method was tested on 740 patients from four public databases: Stanford, UCSF, UCLM, and PKG, evaluated by domain classifier's accuracy, sensitivity, precision, F1/F2 scores, surface Dice (sDice), and 95th percentile Hausdorff distance (HD95). Results: VAE-MMD reduced domain classifier accuracy from 0.91 to 0.50, indicating successful feature alignment across institutions. Reconstructed volumes attained a PSNR greater than 36 dB, maintaining anatomical accuracy. The combined method raised the mean F1 by 11.1% (0.700 to 0.778), the mean sDice by 7.93% (0.7121 to 0.7686), and reduced the mean HD95 by 65.5% (11.33 to 3.91 mm) across all four centers compared to the baseline nnU-Net. Conclusions: VAE-MMD effectively diminishes cross-institutional data heterogeneity and enhances BM segmentation generalization across volumetric, detection, and boundary-level metrics without necessitating target-domain labels, thereby overcoming a significant obstacle to the clinical implementation of AI-assisted segmentation.

View on arXiv PDF

Similar