IM CO GA HE CL LG
Sep 12, 2023

AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciucă, Charlie O'Neill, Ze-Chang Sun, Maja Jabłońska, Sandor Kruk, Ernest Perkowski, Jack Miller, Jason Li, Josh Peek, Kartheik Iyer

arXiv:2309.06126v133.0140 citations, h-index: 74

Originality: Incremental advance

AI Analysis

This work gives astronomy researchers a domain-specific foundation model, though the advance is incremental because it builds on existing fine-tuning methods.

The authors address the underperformance of large language models in specialized domains such as astronomy by introducing AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 on astronomy abstracts. It achieves 30% lower perplexity than LLaMA-2 and generates more scientifically relevant text.

Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model generates more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
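To make the perplexity claim concrete, here is a minimal sketch, assuming the standard definition of perplexity as the exponential of the mean per-token negative log-likelihood under a causal language model. The function names and the per-token log-probabilities are hypothetical and chosen only for illustration; they are not values or code from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token), natural log."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(baseline_ppl, adapted_ppl):
    """Fractional drop in perplexity relative to the baseline model."""
    return (baseline_ppl - adapted_ppl) / baseline_ppl


def nll_gap_for_reduction(reduction):
    """Mean per-token NLL improvement (in nats) implied by a perplexity reduction."""
    return -math.log(1.0 - reduction)


# Hypothetical log-probabilities that two models assign to the same four tokens.
baseline_log_probs = [-2.0, -1.5, -2.5, -2.0]  # stand-in for a general-purpose model
adapted_log_probs = [-1.6, -1.2, -2.0, -1.8]   # stand-in for a domain-adapted model

baseline_ppl = perplexity(baseline_log_probs)
adapted_ppl = perplexity(adapted_log_probs)
reduction = relative_reduction(baseline_ppl, adapted_ppl)

print(f"baseline perplexity: {baseline_ppl:.3f}")
print(f"adapted perplexity:  {adapted_ppl:.3f}")
print(f"relative reduction:  {reduction:.1%}")
print(f"NLL gap for a 30% reduction: {nll_gap_for_reduction(0.30):.3f} nats/token")
```

Under this definition, a 30% reduction in perplexity corresponds to a fixed improvement in mean negative log-likelihood of -ln(0.7) nats per token, regardless of the baseline value.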
