GA, CO, IM, LG · Dec 10, 2020

A machine learning approach to galaxy properties: joint redshift-stellar mass probability distributions with Random Forest

arXiv:2012.05928v2 · 2 citations
AI Analysis

This work provides a fast and accurate method for deriving galaxy property PDFs, which is significant for cosmological and galaxy evolution studies, particularly for large surveys like DES.

This paper demonstrates that the Random Forest (RF) machine learning algorithm can derive accurate joint redshift-stellar mass probability distribution functions (PDFs) for galaxies, even when only a few photometric bands are available. The RF-based method outperforms a basic set-up of the template-fitting code BAGPIPES on all of the authors' predefined performance metrics, and it can compute joint PDFs for a million galaxies in just under 6 minutes on consumer hardware.

We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.
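The validation step named in the abstract can be illustrated with a minimal sketch. It assumes each galaxy's joint PDF is represented by a finite set of (redshift, stellar mass) samples, and it uses the usual empirical definitions: the marginal PIT is the fraction of samples at or below the true value, and the copula PIT is the Kendall distribution function of the sampled PDF evaluated at the joint CDF of the true point. The function names, the synthetic Gaussian "PDFs", and all numerical settings are hypothetical stand-ins chosen for illustration; this is not GALPRO's API and not the authors' implementation.

```python
import math
import random

def marginal_pit(samples, truth):
    """Fraction of 1-D PDF samples at or below the true value."""
    return sum(s <= truth for s in samples) / len(samples)

def joint_cdf(samples, point):
    """Empirical joint CDF H(z, m): fraction of samples with both
    coordinates at or below `point`."""
    z0, m0 = point
    return sum(z <= z0 and m <= m0 for z, m in samples) / len(samples)

def copula_pit(samples, truth):
    """Empirical copula PIT, K_H(H(truth)), where K_H is the Kendall
    distribution function estimated from the PDF samples themselves."""
    h_truth = joint_cdf(samples, truth)
    h_samples = [joint_cdf(samples, s) for s in samples]
    return sum(h <= h_truth for h in h_samples) / len(h_samples)

# Synthetic stand-in for ML output (an assumption, not real survey data):
# each "predicted PDF" is a correlated 2-D Gaussian cloud, and the "truth"
# is drawn from the same distribution, so the PDFs are calibrated by design.
rng = random.Random(0)

def draw(mu_z, mu_m, rho=0.6):
    """Draw one (redshift, log stellar mass) pair with correlation rho."""
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    z = mu_z + 0.1 * a
    m = mu_m + 0.3 * (rho * a + math.sqrt(1 - rho**2) * b)
    return (z, m)

pits_z, pits_m, cop_pits = [], [], []
for _ in range(200):  # 200 mock galaxies
    mu_z, mu_m = rng.uniform(0.2, 1.2), rng.uniform(9.0, 11.0)
    samples = [draw(mu_z, mu_m) for _ in range(100)]  # 100 PDF samples
    truth = draw(mu_z, mu_m)
    pits_z.append(marginal_pit([z for z, _ in samples], truth[0]))
    pits_m.append(marginal_pit([m for _, m in samples], truth[1]))
    cop_pits.append(copula_pit(samples, truth))

def mean(values):
    return sum(values) / len(values)

print("mean redshift PIT:", round(mean(pits_z), 3))
print("mean stellar-mass PIT:", round(mean(pits_m), 3))
print("mean copula PIT:", round(mean(cop_pits), 3))
```

For well-calibrated PDFs, each of the three sets of values should be approximately uniform on [0, 1], so histograms of them should look flat and their means should sit near 0.5. Systematic departures from flatness (a slope, a U-shape, or a central peak) point to biased, overconfident, or underconfident PDFs respectively.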
