Towards Overcoming Data Scarcity in Nuclear Energy: A Study on Critical Heat Flux with Physics-consistent Conditional Diffusion Model
This addresses data scarcity for nuclear energy researchers by providing a method to augment limited experimental datasets, though it is incremental as it applies an existing generative model to a specific domain.
The study tackled data scarcity in nuclear energy by developing a physics-consistent conditional diffusion model to generate synthetic critical heat flux (CHF) data, showing it effectively captures empirical distributions and maintains physical consistency with acceptable uncertainty.
Deep generative modeling provides a powerful pathway to overcome data scarcity in energy-related applications where experimental data are often limited, costly, or difficult to obtain. By learning the underlying probability distribution of the training dataset, deep generative models, such as the diffusion model (DM), can generate high-fidelity synthetic samples that statistically resemble the training data. Such synthetic data generation can significantly enrich the size and diversity of the available training data, and more importantly, improve the robustness of downstream machine learning models in predictive tasks. The objective of this paper is to investigate the effectiveness of DM for overcoming data scarcity in nuclear energy applications. By leveraging a public dataset on critical heat flux (CHF) that cover a wide range of commercial nuclear reactor operational conditions, we developed a DM that can generate an arbitrary amount of synthetic samples for augmenting of the CHF dataset. Since a vanilla DM can only generate samples randomly, we also developed a conditional DM capable of generating targeted CHF data under user-specified thermal-hydraulic conditions. The performance of the DM was evaluated based on their ability to capture empirical feature distributions and pair-wise correlations, as well as to maintain physical consistency. The results showed that both the DM and conditional DM can successfully generate realistic and physics-consistent CHF data. Furthermore, uncertainty quantification was performed to establish confidence in the generated data. The results demonstrated that the conditional DM is highly effective in augmenting CHF data while maintaining acceptable levels of uncertainty.