Reproducing Bayesian Posterior Distributions for Exoplanet Atmospheric Parameter Retrievals with a Machine Learning Surrogate Model
This work addresses the computational cost of atmospheric parameter retrieval for exoplanets; its contribution is incremental, since it applies existing machine-learning methods to a specific domain.
The authors reproduce Bayesian posterior distributions for exoplanet atmospheric parameters with a machine-learning surrogate model, which was among the winning solutions in the 2023 Ariel Machine Learning Data Challenge.
We describe a machine-learning surrogate model that reproduces the Bayesian posterior distributions of exoplanet atmospheric parameters that typical retrieval software such as TauREx derives from transmission spectra of transiting planets. The model is trained on ground-truth distributions for seven parameters: the planet radius, the atmospheric temperature, and the mixing ratios of five common absorbers: $\mathrm{H_2O}$, $\mathrm{CH_4}$, $\mathrm{NH_3}$, $\mathrm{CO}$ and $\mathrm{CO_2}$. Model performance is improved by domain-inspired preprocessing of the features and by semi-supervised learning, which leverages the large amount of unlabelled training data available. The model was among the winning solutions in the 2023 Ariel Machine Learning Data Challenge.
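
To make the two ingredients named above concrete, the following is a minimal sketch and not the challenge solution itself. It assumes a preprocessing step based on the standard transit relation depth $= (R_p/R_s)^2$, and it stands in a weighted ridge regressor with one round of pseudo-labelling for the semi-supervised stage. All function names, the pseudo-label weight, and the synthetic data are hypothetical, and the sketch predicts point estimates of the seven parameters rather than full posterior distributions.

```python
import numpy as np

def depth_to_features(depth, r_star):
    """Hypothetical preprocessing: convert transit depths to radius-like features.

    Uses depth = (R_p / R_s)**2, so the apparent radius is R_s * sqrt(depth).
    depth has shape (n_samples, n_bins); r_star has shape (n_samples,).
    """
    radius = r_star[:, None] * np.sqrt(depth)          # apparent radius per wavelength bin
    mean_radius = radius.mean(axis=1, keepdims=True)   # overall planet size
    # Mean radius plus the relative spectral modulation around it.
    return np.hstack([mean_radius, radius / mean_radius - 1.0])

def fit_ridge(X, y, weights, lam=1e-3):
    """Weighted ridge regression with a bias column (assumed stand-in model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W = weights[:, None]
    A = Xb.T @ (W * Xb) + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ (W * y))

def predict(coef, X):
    """Apply the fitted coefficients, adding the same bias column."""
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ coef

def self_train(X_lab, y_lab, X_unlab, pseudo_weight=0.3):
    """One round of pseudo-labelling: fit, label the unlabelled set, refit on both."""
    coef = fit_ridge(X_lab, y_lab, np.ones(len(X_lab)))
    y_pseudo = predict(coef, X_unlab)                  # pseudo-labels from the first fit
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.vstack([y_lab, y_pseudo])
    w_all = np.concatenate([np.ones(len(X_lab)),
                            np.full(len(X_unlab), pseudo_weight)])
    return fit_ridge(X_all, y_all, w_all)

# Synthetic stand-in data (arbitrary sizes, not the challenge data set).
rng = np.random.default_rng(0)
n_lab, n_unlab, n_bins, n_params = 50, 200, 8, 7
depth = 0.01 * (1.0 + 0.05 * rng.random((n_lab + n_unlab, n_bins)))
r_star = rng.uniform(0.5, 1.5, size=n_lab + n_unlab)
X = depth_to_features(depth, r_star)
y_lab = X[:n_lab] @ rng.standard_normal((X.shape[1], n_params))  # toy targets

coef = self_train(X[:n_lab], y_lab, X[n_lab:])
y_hat = predict(coef, X[n_lab:])   # shape (n_unlab, n_params)
```

With a linear regressor the refit on pseudo-labels changes the model only slightly; the sketch shows the data flow, and any benefit in practice would depend on a more expressive learner and on how the pseudo-labels are filtered or weighted.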