Three-dimensional Conditional Diffusion Models for Cosmological 21 cm Lightcone Emulation

arXiv:2605.29016


AI Analysis

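This work provides a simulation-level baseline for 3D 21 cm lightcone emulation and addresses two obstacles specific to the 3D setting: memory limits that force very small micro-batches, and a highly skewed, long-tailed voxel distribution. The gains are incremental, and measurable biases remain in the two-point and higher-order statistics of the generated samples.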

The paper develops 3D conditional diffusion models for 21 cm lightcone emulation, finding that Yeo-Johnson preprocessing with moderate amplitude compression yields the best trade-off, though generated samples still show biases in summary statistics.

We investigate conditional diffusion modeling for three-dimensional 21 cm lightcone emulation, focusing on cubes with a sky-plane size of $64\times64$ and a line-of-sight depth up to 1024 cells. Relative to earlier 2D studies, the 3D setting is substantially harder because memory limits enforce very small micro-batches while the underlying voxel distribution is highly skewed and long tailed. We perform controlled comparisons across preprocessing choices, dynamic-range compression settings, architecture depth, and training duration using $25{,}600$ training lightcones and validation ensembles at fixed parameter points. For validation, each reference parameter point contains 800 21cmFAST realizations with independent initial conditions, and we use 800 samples per model and per reference set for the reported ensemble comparisons. We evaluate generated lightcones with complementary diagnostics in both image and summary-statistic spaces: brightness-temperature slices, the global signal, the power spectrum, and reduced scattering coefficients. Across the tested configurations, preprocessing is the dominant factor governing stable training and the resulting physical fidelity. Among the configurations explored here, Yeo-Johnson preprocessing combined with moderate amplitude compression gives the most consistently favorable trade-off, with the strongest quantitative support coming from rankings based on the standard-deviation-normalized mean absolute error ($\mathrm{MAE}_{\rm std}$) of the global signal and qualitatively compatible behavior in the complementary diagnostics. At the same time, visually plausible 3D samples still retain measurable biases in two-point and higher-order statistics. We therefore view the present work as a simulation-level baseline for three-dimensional 21 cm emulation and for future studies that incorporate more realistic observational effects.
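As a rough illustration of two ingredients named in the abstract, the sketch below implements the standard Yeo-Johnson power transform (with its inverse) and one plausible reading of $\mathrm{MAE}_{\rm std}$ for the global signal. Several things here are assumptions rather than details from the paper: the parameter value `lam = 0.5`, the array layout `(n_samples, nx, ny, nz)` with the line of sight last, the function names, and above all the definition of `mae_std`, taken here as the absolute difference of ensemble-mean global signals divided by the reference ensemble's standard deviation and averaged along the line of sight. The amplitude-compression step is not sketched because the abstract does not specify it.

```python
import numpy as np


def yeo_johnson(x, lam):
    """Standard Yeo-Johnson power transform with a fixed parameter lam."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    if abs(lam) > 1e-12:
        out[pos] = ((x[pos] + 1.0) ** lam - 1.0) / lam
    else:
        out[pos] = np.log1p(x[pos])
    if abs(lam - 2.0) > 1e-12:
        out[~pos] = -((1.0 - x[~pos]) ** (2.0 - lam) - 1.0) / (2.0 - lam)
    else:
        out[~pos] = -np.log1p(-x[~pos])
    return out


def inverse_yeo_johnson(y, lam):
    """Invert yeo_johnson; the transform preserves sign, so branch on y."""
    y = np.asarray(y, dtype=np.float64)
    out = np.empty_like(y)
    pos = y >= 0
    if abs(lam) > 1e-12:
        out[pos] = (lam * y[pos] + 1.0) ** (1.0 / lam) - 1.0
    else:
        out[pos] = np.expm1(y[pos])
    if abs(lam - 2.0) > 1e-12:
        out[~pos] = 1.0 - (1.0 - (2.0 - lam) * y[~pos]) ** (1.0 / (2.0 - lam))
    else:
        out[~pos] = 1.0 - np.exp(-y[~pos])
    return out


def global_signal(lightcones):
    """Sky-averaged brightness temperature: (n, nx, ny, nz) -> (n, nz)."""
    return np.asarray(lightcones).mean(axis=(1, 2))


def mae_std(generated, reference):
    """ASSUMED definition: |mean_gen - mean_ref| / std_ref, averaged over z."""
    g_gen = global_signal(generated)
    g_ref = global_signal(reference)
    bias = np.abs(g_gen.mean(axis=0) - g_ref.mean(axis=0))
    return float(np.mean(bias / g_ref.std(axis=0)))


# Toy stand-ins for lightcone ensembles (random numbers, not 21cmFAST output).
rng = np.random.default_rng(0)
reference = rng.normal(-20.0, 30.0, size=(8, 4, 4, 16))
generated = reference + rng.normal(0.0, 1.0, size=reference.shape)

lam = 0.5  # illustrative value only; the abstract does not state one
roundtrip = inverse_yeo_johnson(yeo_johnson(reference, lam), lam)
score = mae_std(generated, reference)
```

In a setup like this, the forward transform would be applied to the training cubes before the diffusion model sees them, and the inverse would map generated samples back to brightness temperature before any diagnostic such as `mae_std` is computed.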
