Quantifying the Reconstructability of Astrophysical Methods with Large Language Models and Information Theory: A Case Study in Spectral Reconstruction

arXiv:2605.1115411.3

Predicted impact top 78% in IM · last 90 daysOriginality Incremental advance

AI Analysis

For researchers and publishers, this provides a diagnostic tool to audit methodological transparency and identify missing details in scientific papers.

This paper introduces an information-theoretic framework using LLMs to quantify how well a scientific method can be reconstructed from its written description. Applied to TNO spectral reconstruction, they find that while text clarifies algorithmic structure, an 'entropy floor' of implementation variance persists, and LLMs fail to infer tacit expert knowledge needed for strict calibration.

Modern astrophysical studies rely heavily on complex data analysis pipelines; however, published descriptions often lack the detail required for computational reproducibility. In this work, we present an information-theoretic framework to quantify how effectively a method can be reconstructed from its written description. By treating algorithmic reconstruction as a probability distribution generated by Large Language Models (LLMs), we utilize Shannon entropy and Jensen-Shannon divergence to measure how strongly text constrains the hypothesis space of valid implementations. We demonstrate this approach through a case study of Trans-Neptunian Object (TNO) spectral reconstruction from sparse photometry. By prompting frontier LLMs with varying levels of manuscript text (Title, Abstract, and Methods), we find that while increasing text successfully clarifies the overall algorithmic structure, it fails to eliminate variance at the implementation level. This persistent variance establishes an "entropy floor," demonstrating that multiple divergent implementations remain consistent with explicit instructions. To evaluate practical reproducibility, we convert these reconstructed algorithms into executable pipelines. Our results reveal that, while LLMs easily recover core functional methodologies, they systematically fail to infer the tacit expert knowledge required for strict scientific calibration. This pilot study demonstrates that LLMs can be repurposed as a zero-shot diagnostic tool to audit methodological transparency, helping authors identify missing structural constraints and preserve scientific integrity in an era of automated research.

View on arXiv PDF

Similar