COMP-PH IM AI LG HEP-PH · Oct 9, 2025

Iterated Agent for Symbolic Regression

Zhuo-Yang Song, Zeyu Cai, Shutao Zhang, Jiashen Wei, Jichen Pan, Shi Qiu, Qing-Hong Cao, Tie-Jiun Hou, Xiaohui Liu, Ming-xing Luo, Hua Xing Zhu

arXiv:2510.08317v1 · 3.33 citations · h-index: 3

Originality: Highly original

AI Analysis

The paper addresses combinatorial explosion and overfitting in symbolic regression for scientific discovery, offering a method that improves both interpretability and accuracy.

The paper tackles symbolic regression by introducing IdeaSearchFitter, a framework that uses LLMs as semantic operators in evolutionary search to generate interpretable models, achieving competitive performance on benchmarks like the Feynman Symbolic Regression Database and discovering compact models in high-energy physics applications.

Symbolic regression (SR), the automated discovery of mathematical expressions from data, is a cornerstone of scientific inquiry. However, it is often hindered by the combinatorial explosion of the search space and a tendency to overfit. Popular methods, rooted in genetic programming, explore this space syntactically, often yielding overly complex, uninterpretable models. This paper introduces IdeaSearchFitter, a framework that employs Large Language Models (LLMs) as semantic operators within an evolutionary search. By generating candidate expressions guided by natural-language rationales, our method biases discovery towards models that are not only accurate but also conceptually coherent and interpretable. We demonstrate IdeaSearchFitter's efficacy across diverse challenges: it achieves competitive, noise-robust performance on the Feynman Symbolic Regression Database (FSReD), outperforming several strong baselines; discovers mechanistically aligned models with good accuracy-complexity trade-offs on real-world data; and derives compact, physically motivated parametrizations for Parton Distribution Functions in a frontier high-energy physics application. IdeaSearchFitter is a specialized module within our broader iterated agent framework, IdeaSearch, which is publicly available at https://www.ideasearch.cn/.
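
To make the idea of an evolutionary search with a proposal operator and an accuracy-complexity trade-off concrete, here is a minimal Python sketch. It is not the IdeaSearchFitter implementation: the LLM is replaced by a hypothetical fixed pool of (expression, rationale) pairs, and `propose`, `mse`, `complexity`, `search`, and the `penalty` weight are all illustrative assumptions.

```python
import math

# Hypothetical stand-in for the LLM: a fixed pool of candidate expressions,
# each paired with the natural-language rationale that might accompany it.
# Nothing here is taken from the IdeaSearch codebase.
IDEA_POOL = [
    ("x", "linear response"),
    ("x**2", "quadratic growth"),
    ("math.sin(x)", "periodic behaviour"),
    ("math.exp(-x)", "exponential decay"),
    ("x * math.exp(-x)", "growth damped by decay"),
    ("x * math.exp(-x) + math.sin(x) - math.sin(x)", "over-elaborate variant"),
]

def propose(parents, step, n=3):
    """Stub proposer. A real system would prompt an LLM with `parents`;
    this one ignores them and walks through IDEA_POOL deterministically."""
    start = (step * n) % len(IDEA_POOL)
    return [IDEA_POOL[(start + i) % len(IDEA_POOL)] for i in range(n)]

def mse(expr, xs, ys):
    """Mean squared error of the expression `expr` (in the variable x)."""
    f = eval("lambda x: " + expr, {"math": math})
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def complexity(expr):
    """Crude complexity proxy: expression length without spaces."""
    return len(expr.replace(" ", ""))

def search(xs, ys, generations=4, keep=2, penalty=1e-3):
    """Evolutionary loop: propose, score (error + complexity), select."""
    scores = {}
    for step in range(generations):
        parents = sorted(scores, key=scores.get)[:keep]
        for expr, _rationale in propose(parents, step):
            scores[expr] = mse(expr, xs, ys) + penalty * complexity(expr)
        # Selection: retain only the `keep` best-scoring candidates.
        scores = {e: scores[e] for e in sorted(scores, key=scores.get)[:keep]}
    best = min(scores, key=scores.get)
    return best, scores[best]

# Synthetic data generated from y = x * exp(-x).
xs = [0.1 * i for i in range(1, 31)]
ys = [x * math.exp(-x) for x in xs]
best, score = search(xs, ys)
print(best, score)
```

In this toy setup the complexity penalty is what separates the compact expression from its over-elaborate but numerically equivalent variant, which loosely mirrors the bias toward compact, interpretable models described in the abstract.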
