GA | EP | IM | SR | LG | Feb 14, 2023

Parameters for > 300 million Gaia stars: Bayesian inference vs. machine learning

arXiv:2302.06995v1 | 1 citation | h-index: 71
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of efficiently estimating stellar parameters from large-scale astronomical survey data. It appears incremental, since it applies existing machine-learning techniques to new data.

The authors tackled the challenge of estimating stellar parameters from the vast and complex Gaia DR3 dataset, showing that simple machine learning methods like neural networks or tree-based algorithms can achieve competitive results compared to traditional Bayesian isochrone fitting, even down to faint magnitudes.

The Gaia Data Release 3 (DR3), published in June 2022, delivers a diverse set of astrometric, photometric, and spectroscopic measurements for more than a billion stars. The wealth and complexity of the data make traditional approaches for estimating stellar parameters for the full Gaia dataset almost prohibitive. We have explored different supervised learning methods for extracting basic stellar parameters as well as distances and line-of-sight extinctions, given spectro-photo-astrometric data (including the new Gaia XP spectra). For training we use an enhanced high-quality dataset compiled from Gaia DR3 and ground-based spectroscopic survey data covering the whole sky and all Galactic components. We show that even with a simple neural-network architecture or tree-based algorithm (and in the absence of Gaia XP spectra), we succeed in predicting competitive results (compared to Bayesian isochrone fitting) down to faint magnitudes. In the near future, we will present a new Gaia DR3 stellar-parameter catalogue obtained using XGBoost, currently the best-performing machine-learning algorithm for tabular data.
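To make the "tree-based algorithm on tabular data" idea concrete, here is a minimal sketch of gradient boosting with depth-1 regression trees (stumps), written in plain Python. It is not the authors' pipeline and it is not XGBoost itself, which adds regularisation, deeper trees, and many optimisations. The feature names (`colour`, `abs_mag`), the target (`teff`), and the linear relation used to generate the data are invented purely for illustration and carry no physical meaning.

```python
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    total_sq = sum(r * r for r in residuals)
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            i = order[k - 1]
            left_sum += residuals[i]
            if X[order[k]][j] == X[i][j]:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Sum of squared errors when each side predicts its own mean.
            sse = total_sq - left_sum ** 2 / k - right_sum ** 2 / (n - k)
            if best is None or sse < best[0]:
                threshold = 0.5 * (X[i][j] + X[order[k]][j])
                best = (sse, j, threshold, left_sum / k, right_sum / (n - k))
    if best is None:  # every feature is constant: fall back to the mean
        mean = total / n
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if x[feature] <= threshold else right_mean


def fit_boosted(X, y, n_rounds=150, learning_rate=0.1):
    """Each round fits a stump to the current residuals (squared-error boosting)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, x)
                for p, x in zip(pred, X)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def rmse(predictions, y):
    return (sum((p - t) ** 2 for p, t in zip(predictions, y)) / len(y)) ** 0.5


def make_toy_data(n, seed):
    """Synthetic, non-physical data: teff depends linearly on colour plus noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        colour = rng.uniform(0.0, 2.0)    # hypothetical photometric colour
        abs_mag = rng.uniform(-2.0, 10.0)  # hypothetical, uninformative here
        X.append([colour, abs_mag])
        y.append(8000.0 - 2500.0 * colour + rng.gauss(0.0, 100.0))
    return X, y


X_train, y_train = make_toy_data(300, seed=0)
X_test, y_test = make_toy_data(200, seed=1)

model = fit_boosted(X_train, y_train)
test_rmse = rmse([predict(model, x) for x in X_test], y_test)

# Baseline: always predict the training-set mean.
train_mean = sum(y_train) / len(y_train)
baseline_rmse = rmse([train_mean] * len(y_test), y_test)

print(f"boosted stumps RMSE: {test_rmse:.1f}")
print(f"mean baseline RMSE:  {baseline_rmse:.1f}")
```

The same fit-to-residuals loop is the core of library implementations such as XGBoost. In practice one would use such a library with real catalogue columns and labels from spectroscopic surveys, as the abstract describes.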

