CV · Apr 20, 2025

Causal Disentanglement for Robust Long-tail Medical Image Generation

Weizhi Nie, Zichun Zhang, Weijie Wang, Bruno Lepri, Anan Liu, Nicu Sebe

arXiv:2504.14450v2 · 2 citations · h-index: 31

Originality: Incremental advance

AI Analysis

This work addresses data scarcity and interpretability issues in medical imaging, particularly for long-tailed distributions, though it appears incremental by combining existing techniques like causal disentanglement and diffusion models.

The paper tackles the challenge of generating high-quality, diverse medical images from limited data with imbalanced class distributions. It proposes a framework that uses causal disentanglement to separate pathological and structural features, and text-guided modeling to regulate counterfactual image generation. The claimed benefits are improved structural stability and clinical relevance, with better performance on long-tailed categories through initial noise optimization.

Counterfactual medical image generation effectively addresses data scarcity and enhances the interpretability of medical images. However, because medical images have complex and diverse pathological features and medical data has an imbalanced class distribution, generating high-quality and diverse medical images from limited data is challenging. It is also necessary to fully leverage the information in limited data, such as anatomical structure, to generate more structurally stable medical images while avoiding distortion or inconsistency. To enhance the clinical relevance of generated data and improve the interpretability of the model, we propose a novel medical image generation framework that generates independent pathological and structural features based on causal disentanglement and uses text-guided modeling of pathological features to regulate the generation of counterfactual images. First, we achieve feature separation through causal disentanglement and analyze the interactions between features. Here, we introduce group supervision to ensure the independence of pathological and identity features. Second, we leverage a diffusion model guided by pathological findings to model pathological features, enabling the generation of diverse counterfactual images. Meanwhile, we enhance accuracy by leveraging a large language model to extract lesion severity and location from medical reports. Additionally, we improve the performance of the latent diffusion model on long-tailed categories through initial noise optimization.
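
The abstract does not say how the initial noise is optimized, so the following is only a generic sketch of the idea: nudge a sampled noise vector toward a region that a scoring function rates as more likely to yield a tail-class sample, while keeping its norm fixed so it stays close to what a Gaussian sampler would produce. The names `optimize_initial_noise`, `tail_score`, and `tail_score_grad`, the logistic scorer, and the 16-dimensional latent are all hypothetical stand-ins, not the authors' method.

```python
import numpy as np

def optimize_initial_noise(z0, score_grad, steps=50, lr=0.1):
    """Gradient ascent on the initial noise, rescaled to the starting norm each step."""
    z = z0.copy()
    target_norm = np.linalg.norm(z0)
    for _ in range(steps):
        z = z + lr * score_grad(z)                 # move toward a higher score
        z = z * (target_norm / np.linalg.norm(z))  # stay on the original norm shell
    return z

# Hypothetical stand-in for a tail-class scorer: a logistic model on the latent.
rng = np.random.default_rng(0)
w = rng.normal(size=16)  # assumed direction that favors the tail class

def tail_score(z):
    return 1.0 / (1.0 + np.exp(-(w @ z)))

def tail_score_grad(z):
    s = tail_score(z)
    return s * (1.0 - s) * w  # derivative of the sigmoid of w @ z with respect to z

z0 = rng.normal(size=16)  # the initial noise a sampler would normally start from
z_opt = optimize_initial_noise(z0, tail_score_grad)
print(tail_score(z0), tail_score(z_opt))
```

In an actual latent diffusion pipeline the scorer would presumably depend on the denoised output rather than on the raw noise, and the gradient would come from automatic differentiation; the toy scorer here only illustrates the optimization loop.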

