Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets
This work addresses the challenge of robust deep learning for cosmological parameter estimation across multiple simulations, a step toward applying such models to real cosmic survey data. It is incremental, however, in that it adapts existing domain adaptation methods to a specific domain.
The paper tackles the problem that models trained on one cosmological simulation suite perform poorly on another because of differences in subgrid physics and numerical approximations. It shows that Domain Adaptive Graph Neural Networks (DA-GNNs) achieve up to 28% better relative error and nearly an order of magnitude better χ² on cross-dataset tasks.
Deep learning models have been shown to outperform methods that rely on summary statistics, like the power spectrum, in extracting information from complex cosmological data sets. However, due to differences in the subgrid physics implementation and numerical approximations across different simulation suites, models trained on data from one cosmological simulation show a drop in performance when tested on another. Similarly, models trained on any of the simulations would also likely experience a drop in performance when applied to observational data. Training on data from two different suites of the CAMELS hydrodynamic cosmological simulations, we examine the generalization capabilities of Domain Adaptive Graph Neural Networks (DA-GNNs). By utilizing GNNs, we capitalize on their capacity to capture structured scale-free cosmological information from galaxy distributions. Moreover, by including unsupervised domain adaptation via Maximum Mean Discrepancy (MMD), we enable our models to extract domain-invariant features. We demonstrate that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks (up to $28\%$ better relative error and up to almost an order of magnitude better $\chi^2$). Using data visualizations, we show how domain adaptation improves the alignment of the data in latent space. These results show that DA-GNNs are a promising method for extracting domain-independent cosmological information, a vital step toward robust deep learning for real cosmic survey data.
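
To make the MMD term mentioned in the abstract concrete, the following is a minimal sketch of a squared-MMD estimate between two sets of latent feature vectors. It is not the authors' implementation: the Gaussian kernel, the single fixed bandwidth, the biased estimator, and all function names (rbf_kernel, mean_kernel, mmd_squared, sample_latents) are assumptions chosen for illustration, and the "latent vectors" are synthetic Gaussian samples rather than GNN outputs.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)],
    with each expectation replaced by a mean over all sample pairs.
    """
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


def sample_latents(n, dim, shift, rng):
    """Hypothetical stand-in for latent features: Gaussian samples with mean `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = sample_latents(50, 4, 0.0, rng)   # e.g. features from one simulation suite
aligned = sample_latents(50, 4, 0.0, rng)  # same distribution as source
shifted = sample_latents(50, 4, 2.0, rng)  # distribution offset from source

mmd_aligned = mmd_squared(source, aligned)
mmd_shifted = mmd_squared(source, shifted)
print(f"MMD^2 (aligned domains): {mmd_aligned:.4f}")
print(f"MMD^2 (shifted domains): {mmd_shifted:.4f}")
```

In a domain-adaptive setup, a term of this kind would typically be added to the supervised regression loss so that the encoder is penalized when latent features from the two simulation suites follow different distributions. The relative weighting of the two terms is a hyperparameter that this sketch does not cover.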