Controllable Lung Nodule Synthesis via Histogram-Regularized Latent Diffusion Models
This work provides a method for generating realistic lung nodules that can help address the scarcity of diverse, annotated pulmonary nodule datasets for automated diagnosis systems, with particular benefit for underrepresented nodule subtypes.
This paper introduces a controllable latent diffusion model that synthesizes pulmonary nodules within full 3D CT volumes. It addresses over-smoothed texture profiles and the underrepresentation of distinct nodule subtypes by incorporating a histogram-based regularization term that constrains voxel intensity distributions during the generative process. The model achieves strong visual realism and improves performance in downstream clinical tasks, especially for underrepresented nodule subtypes.
While automated diagnosis systems have achieved remarkable success in computed tomography (CT)-based lung cancer screening, their development remains limited by the scarcity of diverse, annotated pulmonary nodule datasets. Diffusion-based generative models offer a promising strategy for data synthesis; however, many existing conditional approaches primarily optimize spatial reconstruction losses, which encourage voxel-wise similarity but may inadequately constrain lesion-level intensity distributions. As a result, these methods may produce over-smoothed texture profiles and underrepresent the distinct attenuation characteristics of different nodule subtypes, including solid, part-solid, and ground-glass nodules. To address this challenge, we propose a controllable latent diffusion model that synthesizes pulmonary nodules within full 3D CT volumes while accurately modeling nodule-specific intensity distributions. Specifically, rather than relying solely on spatial losses, we introduce a histogram-based regularization term that constrains voxel intensity distributions during the generative process. The model combines subtype, spatial mask, and Hounsfield unit (HU) histogram conditioning with the differentiable feature-space histogram regularization term to better align lesion-level intensity distributions, improving the visual plausibility and subtype consistency of synthesized nodules. Extensive experiments on lung CT data demonstrate that our framework achieves strong visual realism, validated through both quantitative metrics and a visual Turing test. Furthermore, when used for data augmentation, the generated nodules improve performance in downstream clinical tasks, particularly for underrepresented nodule subtypes, and show a potential benefit for subtype-informed malignancy classification.
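To make the idea of a differentiable histogram constraint concrete, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that the abstract does not confirm: the histogram is computed directly on HU values inside the nodule mask (the paper's term operates in feature space), soft binning uses a Gaussian kernel, and the mismatch is measured with an L1 distance. The function names, bin range, bandwidth, and toy HU means are all hypothetical. NumPy is used for portability; the same operations written in an autodiff framework would yield gradients with respect to the voxel values.

```python
import numpy as np


def soft_histogram(values, bin_centers, bandwidth):
    """Kernel-smoothed, normalized histogram of `values`.

    Every value contributes to every bin with a Gaussian weight, so the
    result varies smoothly with the inputs (unlike hard binning).
    """
    values = np.asarray(values, dtype=np.float64).reshape(-1, 1)        # (N, 1)
    centers = np.asarray(bin_centers, dtype=np.float64).reshape(1, -1)  # (1, B)
    weights = np.exp(-0.5 * ((values - centers) / bandwidth) ** 2)      # (N, B)
    hist = weights.sum(axis=0)
    return hist / (hist.sum() + 1e-12)


def histogram_loss(volume_hu, mask, target_hist, bin_centers, bandwidth=25.0):
    """L1 distance between the soft HU histogram inside `mask` and a target."""
    lesion_voxels = volume_hu[mask > 0]
    pred_hist = soft_histogram(lesion_voxels, bin_centers, bandwidth)
    return np.abs(pred_hist - target_hist).sum()


# Toy usage. The HU means below are illustrative assumptions, not paper values.
rng = np.random.default_rng(0)
bin_centers = np.linspace(-1000.0, 400.0, 57)  # hypothetical bins, 25 HU apart
mask = np.zeros((16, 16, 16), dtype=bool)
mask[4:12, 4:12, 4:12] = True                  # cubic stand-in for a nodule mask

ggn_like = rng.normal(-600.0, 80.0, size=mask.shape)   # low-attenuation lesion
solid_like = rng.normal(0.0, 80.0, size=mask.shape)    # high-attenuation lesion

target = soft_histogram(ggn_like[mask], bin_centers, 25.0)
loss_same = histogram_loss(ggn_like, mask, target, bin_centers)
loss_diff = histogram_loss(solid_like, mask, target, bin_centers)
print(f"matching lesion: {loss_same:.4f}, mismatched lesion: {loss_diff:.4f}")
```

In this sketch the loss is zero when the lesion's intensity distribution matches the target histogram and approaches its maximum of 2 when the two distributions barely overlap. A voxel-wise reconstruction loss does not isolate this lesion-level signal.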