CV | Jul 31, 2025

Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis

arXiv:2507.23652v114.46 citations | h-index: 3 | Has Code | MICCAI

Originality: Incremental advance

AI Analysis

This work addresses privacy and labeling constraints in medical imaging by enabling efficient, privacy-preserving image generation for segmentation tasks. The advance is incremental because it builds on existing diffusion models.

The paper tackles the problem of precise lesion-mask alignment in medical image synthesis by proposing Adaptively Distilled ControlNet, which accelerates training and improves segmentation model performance, achieving mDice/mIoU gains of 2.4%/4.2% on KiTS19 and 2.6%/3.5% on Polyps.
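For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Dice and IoU can be computed for binary masks and averaged over a set of cases. The function names `dice_and_iou` and `mean_dice_iou` are hypothetical, masks are assumed to be flat sequences of 0/1 values, and the averaging is assumed to be per image; the paper's exact evaluation protocol is not described here.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


def mean_dice_iou(pairs):
    """Average Dice and IoU over an iterable of (pred, target) mask pairs."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    mean_dice = sum(d for d, _ in scores) / n
    mean_iou = sum(i for _, i in scores) / n
    return mean_dice, mean_iou
```

Dice is never lower than IoU on the same prediction, which is why the two numbers are usually reported together rather than interchangeably.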

Medical image annotation is constrained by privacy concerns and labor-intensive labeling, significantly limiting the performance and generalization of segmentation models. While mask-controllable diffusion models excel in synthesis, they struggle with precise lesion-mask alignment. We propose Adaptively Distilled ControlNet, a task-agnostic framework that accelerates training and optimization through dual-model distillation. Specifically, during training, a teacher model, conditioned on mask-image pairs, regularizes a mask-only student model via predicted noise alignment in parameter space, further enhanced by adaptive regularization based on lesion-background ratios. During sampling, only the student model is used, enabling privacy-preserving medical image generation. Comprehensive evaluations on two distinct medical datasets demonstrate state-of-the-art performance: TransUNet improves mDice/mIoU by 2.4%/4.2% on KiTS19, while SANet achieves 2.6%/3.5% gains on Polyps, highlighting its effectiveness and superiority. Code is available at GitHub.
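The training objective described in the abstract can be illustrated with a rough sketch: a standard denoising term for the student, plus a term that pulls the student's predicted noise toward the teacher's, weighted by a factor derived from the lesion-to-background ratio of the mask. The abstract does not give the exact formula, so everything specific below is an assumption made for illustration: the names `lesion_ratio`, `adaptive_weight`, and `distillation_loss`, the squared-error form of both terms, and the weighting rule `base_weight * (1 - ratio)`. It is not the authors' implementation.

```python
import numpy as np


def lesion_ratio(mask):
    """Fraction of pixels in a binary mask that belong to the lesion."""
    return float(np.asarray(mask, dtype=bool).mean())


def adaptive_weight(mask, base_weight=1.0):
    """Hypothetical weighting rule: smaller lesions get stronger regularization."""
    return base_weight * (1.0 - lesion_ratio(mask))


def distillation_loss(noise, student_pred, teacher_pred, mask, base_weight=1.0):
    """Denoising loss for the student plus weighted noise alignment to the teacher.

    noise        : the true noise added to the image at this diffusion step
    student_pred : noise predicted by the mask-only student
    teacher_pred : noise predicted by the teacher (conditioned on mask and image)
    mask         : binary lesion mask used to derive the adaptive weight
    """
    noise = np.asarray(noise, dtype=float)
    student_pred = np.asarray(student_pred, dtype=float)
    teacher_pred = np.asarray(teacher_pred, dtype=float)

    denoise_term = np.mean((student_pred - noise) ** 2)
    align_term = np.mean((student_pred - teacher_pred) ** 2)  # assumed MSE alignment
    weight = adaptive_weight(mask, base_weight)
    return float(denoise_term + weight * align_term)
```

In this sketch the teacher appears only in the loss, which matches the abstract's statement that sampling uses the student alone: once training ends, nothing in the generation path needs the paired real image.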
