IM SR LG May 21

Spectra as Language: Large Language Models for Scalable Stellar Parameter and Abundance Inference

arXiv:2605.22162
AI Analysis

For astrophysicists analyzing large spectroscopic surveys, this provides a scalable and accurate method for stellar parameter inference, addressing limitations of traditional approaches.

This work proposes a two-stage large language model framework for inferring stellar parameters and abundances from spectra, achieving accurate estimates of effective temperature, surface gravity, metallicity, and ~20 chemical elements, with scaling-law improvements as data increases.

Stellar spectra encode key information on the physical properties and chemical compositions of stars. Accurate stellar parameter determination is essential for addressing major questions such as galaxy and stellar evolution. Large-scale spectroscopic surveys have accumulated unprecedented spectral data. Traditional feature extraction or model-fitting approaches struggle with high-dimensional, massive datasets, limited generalization, and computational inefficiency. Recent advances in large language models demonstrate strong generalization and feature-learning in tasks like natural language processing, DNA/RNA sequence analysis, and protein/chemical parsing. Stellar spectra are continuous sequential signals, enabling the transfer of language models to stellar spectroscopy. Here, we propose a two-stage large language model framework for stellar parameter inference, achieving accurate estimation of effective temperature, surface gravity, metallicity, and abundances of ~20 chemical elements. Scaling-law analyses show systematic performance improvements with increasing data, providing a scalable framework for forthcoming large-scale surveys.
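
The abstract does not describe how a spectrum is presented to the model. As a minimal sketch of one common way to treat a continuous spectrum as a sequence, the example below cuts a flux array into fixed-length, non-overlapping patches that could each play the role of a token. The function name `spectrum_to_patches`, the patch length of 64 pixels, and the synthetic flux array are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def spectrum_to_patches(flux, patch_len):
    """Split a 1-D flux array into non-overlapping fixed-length patches.

    Hypothetical helper for illustration only; the paper's actual
    tokenization scheme is not given in the abstract.
    """
    flux = np.asarray(flux, dtype=float)
    n_patches = len(flux) // patch_len
    # Drop trailing pixels that do not fill a whole patch.
    trimmed = flux[: n_patches * patch_len]
    return trimmed.reshape(n_patches, patch_len)


# Synthetic stand-in for a continuum-normalized spectrum (not real data).
rng = np.random.default_rng(0)
flux = 1.0 + 0.01 * rng.standard_normal(1000)

patches = spectrum_to_patches(flux, patch_len=64)
print(patches.shape)  # (15, 64): 1000 pixels give 15 full patches of 64
```

Each row of `patches` could then be mapped to an embedding vector before entering a sequence model. Overlapping windows or wavelength-aware binning are equally plausible alternatives.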
