CL | HIST-PH | Nov 22, 2024

Astro-HEP-BERT: A bidirectional language model for studying the meanings of concepts in astrophysics and high energy physics

arXiv:2411.14877v1 | 1 citation | h-index: 1
Originality: Synthesis-oriented
AI Analysis

Astro-HEP-BERT gives researchers in the history, philosophy, and sociology of science a cost-effective tool for analyzing scientific concepts without the resources needed to train a language model from scratch.

The author tackled the problem of studying concept meanings in astrophysics and high-energy physics by developing Astro-HEP-BERT, a transformer-based language model built on a general pretrained BERT model and further trained on 21.84 million paragraphs from arXiv articles. In preliminary evaluations, it performs comparably to domain-adapted BERT models trained from scratch on larger datasets on tasks such as word sense disambiguation.

I present Astro-HEP-BERT, a transformer-based language model specifically designed for generating contextualized word embeddings (CWEs) to study the meanings of concepts in astrophysics and high-energy physics. Built on a general pretrained BERT model, Astro-HEP-BERT underwent further training over three epochs using the Astro-HEP Corpus, a dataset I curated from 21.84 million paragraphs extracted from more than 600,000 scholarly articles on arXiv, all belonging to at least one of these two scientific domains. The project demonstrates both the effectiveness and feasibility of adapting a bidirectional transformer for applications in the history, philosophy, and sociology of science (HPSS). The entire training process was conducted using freely available code, pretrained weights, and text inputs, and was completed on a single MacBook Pro laptop (M2, 96 GB). Preliminary evaluations indicate that Astro-HEP-BERT's CWEs perform comparably to those of domain-adapted BERT models trained from scratch on larger datasets, both for domain-specific word sense disambiguation and induction and for related semantic change analyses. This suggests that retraining general language models for specific scientific domains can be a cost-effective and efficient strategy for HPSS researchers, enabling high performance without the need for extensive training from scratch.
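To make the idea of using CWEs for word sense disambiguation more concrete, here is a minimal sketch of one common approach: average the embeddings of labelled occurrences into one centroid per sense, then assign a new occurrence to the sense with the most similar centroid. This is an illustration under stated assumptions, not the evaluation procedure used for Astro-HEP-BERT. The three-dimensional vectors are made-up stand-ins for real model output, and the names `sense_a`, `sense_b`, `centroid`, and `disambiguate` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def disambiguate(embedding, sense_centroids):
    """Return the sense label whose centroid is most similar to the embedding."""
    return max(
        sense_centroids,
        key=lambda sense: cosine(embedding, sense_centroids[sense]),
    )


# Toy vectors (assumption): stand-ins for CWEs of one target word,
# grouped by a hand-assigned sense label. Real CWEs have hundreds of dimensions.
labelled = {
    "sense_a": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "sense_b": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
sense_centroids = {sense: centroid(vecs) for sense, vecs in labelled.items()}

# A new, unlabelled occurrence of the same word (also a toy vector).
new_occurrence = [0.7, 0.3, 0.1]
predicted = disambiguate(new_occurrence, sense_centroids)
print(predicted)
```

In practice the vectors would come from the model's hidden states for the target token in each context. Word sense induction follows the same geometry but replaces the hand-labelled groups with clusters found automatically.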

