Hybrid Diffusion Model for Breast Ultrasound Image Augmentation
For medical imaging researchers, this work addresses the low fidelity of synthetic images in ultrasound data augmentation, though the contribution is incremental because it combines existing techniques.
The paper proposes a hybrid diffusion-based augmentation framework for breast ultrasound images that combines text-to-image generation with image-to-image refinement, LoRA, and textual inversion. Relative to a Stable Diffusion v1.5 baseline, the method reduces the Fréchet Inception Distance (FID) from 45.97 to 33.29 (lower is better), improving visual fidelity while keeping downstream classification performance comparable.
We propose a hybrid diffusion-based augmentation framework to address the low fidelity of synthetic images in breast ultrasound (BUS) data augmentation. Unlike conventional diffusion-based augmentations, our approach improves visual fidelity and preserves ultrasound texture by combining text-to-image generation with image-to-image (img2img) refinement, together with fine-tuning via low-rank adaptation (LoRA) and textual inversion (TI). On the open-source breast ultrasound image dataset (BUSI) from Kaggle, our method generated realistic, class-consistent images. Compared to the Stable Diffusion v1.5 baseline, incorporating TI and img2img refinement reduced the Fréchet Inception Distance (FID) from 45.97 to 33.29, a substantial gain in fidelity, while maintaining comparable downstream classification performance. Overall, the proposed framework mitigates the low-fidelity limitations of synthetic ultrasound images and improves the quality of augmentation for robust diagnostic modeling.
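To make the reported metric concrete, the sketch below computes a simplified Fréchet distance between two sets of feature vectors. It is an illustration only, not the authors' evaluation code. In practice FID is computed on Inception-v3 pooled features using full covariance matrices, which requires a matrix square root. Here each covariance is assumed to be diagonal, so the distance reduces to the squared difference of the means plus, per dimension, the sum of the two variances minus twice the square root of their product. The function names are hypothetical.

```python
import math
from statistics import fmean, variance


def diagonal_gaussian_stats(features):
    """Per-dimension mean and sample variance of equal-length feature vectors."""
    columns = list(zip(*features))
    means = [fmean(col) for col in columns]
    variances = [variance(col) for col in columns]  # needs >= 2 vectors
    return means, variances


def frechet_distance_diagonal(real_features, synthetic_features):
    """Squared Frechet distance between two Gaussians fitted to the inputs.

    ASSUMPTION: covariances are treated as diagonal. Standard FID uses full
    covariance matrices of Inception-v3 features instead.
    """
    mu_r, var_r = diagonal_gaussian_stats(real_features)
    mu_s, var_s = diagonal_gaussian_stats(synthetic_features)
    # Squared Euclidean distance between the two mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_s))
    # Diagonal form of trace(C_r + C_s - 2 * sqrt(C_r C_s)).
    cov_term = sum(a + b - 2.0 * math.sqrt(a * b) for a, b in zip(var_r, var_s))
    return mean_term + cov_term


def relative_reduction(before, after):
    """Fractional decrease of a lower-is-better score such as FID."""
    return (before - after) / before


if __name__ == "__main__":
    # Toy 2-D features standing in for real and synthetic image embeddings.
    real = [[0.0, 0.0], [2.0, 2.0]]
    synthetic = [[1.0, 0.0], [3.0, 2.0]]
    print(frechet_distance_diagonal(real, synthetic))
    # Relative change implied by the FID values reported in the abstract.
    print(round(100 * relative_reduction(45.97, 33.29), 1), "% lower FID")
```

Applied to the reported values, the drop from 45.97 to 33.29 corresponds to a relative FID reduction of roughly 27.6%. Because FID depends on sample size and on the feature extractor, scores are only comparable when computed under the same protocol.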