LG · Mar 23

Neural Structure Embedding for Symbolic Regression via Continuous Structure Search and Coefficient Optimization

arXiv:2603.22429 · 20.9 · h-index: 3
Predicted impact top 88% in LG · last 90 days · Originality: Highly original
AI Analysis

This work addresses the computational inefficiency and instability of discrete search methods in symbolic regression, offering a new paradigm for researchers and practitioners in scientific discovery and data analysis.

The paper tackles the problem of symbolic regression by proposing SRCO, a framework that embeds symbolic structures into a continuous space for efficient search and optimization, resulting in improved equation accuracy, robustness, and search efficiency compared to state-of-the-art methods.

Symbolic regression aims to discover human-interpretable equations that explain observational data. However, existing approaches rely heavily on discrete structure search (e.g., genetic programming), which often leads to high computational cost, unstable performance, and limited scalability to large equation spaces. To address these challenges, we propose SRCO, a unified embedding-driven framework for symbolic regression that transforms symbolic structures into a continuous, optimizable representation space. The framework consists of three key components: (1) structure embedding: we first generate a large pool of exploratory equations using traditional symbolic regression algorithms and train a Transformer model to compress symbolic structures into a continuous embedding space; (2) continuous structure search: the embedding space enables efficient exploration using gradient-based or sampling-based optimization, significantly reducing the cost of navigating the combinatorial structure space; and (3) coefficient optimization: for each discovered structure, we treat symbolic coefficients as learnable parameters and apply gradient optimization to obtain accurate numerical values. Experiments on synthetic and real-world datasets show that our approach consistently outperforms state-of-the-art methods in equation accuracy, robustness, and search efficiency. This work introduces a new paradigm for symbolic regression by bridging symbolic equation discovery with continuous embedding learning and optimization.
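
Component (3) of the abstract, coefficient optimization, can be illustrated with a minimal sketch. It assumes the structure has already been chosen and only fits its numeric coefficients by gradient descent on mean squared error. Everything named here is hypothetical and not taken from the paper: the helper `fit_coefficients`, the candidate structure `c0*sin(x) + c1*x`, the learning rate, and the use of central finite differences in place of automatic differentiation. The abstract does not state which optimizer or loss SRCO uses.

```python
import math

def fit_coefficients(structure, xs, ys, init, lr=0.05, steps=2000, eps=1e-6):
    """Fit the coefficients of a fixed symbolic structure by gradient descent.

    Hypothetical helper: gradients of the mean squared error are estimated
    with central finite differences, standing in for autodiff.
    """
    coeffs = list(init)

    def mse(c):
        return sum((structure(x, c) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    for _ in range(steps):
        grad = []
        for i in range(len(coeffs)):
            up = coeffs.copy()
            down = coeffs.copy()
            up[i] += eps
            down[i] -= eps
            grad.append((mse(up) - mse(down)) / (2 * eps))
        # Plain gradient step on every coefficient.
        coeffs = [c - lr * g for c, g in zip(coeffs, grad)]
    return coeffs, mse(coeffs)

def candidate(x, c):
    # Assumed example structure: c0 * sin(x) + c1 * x
    return c[0] * math.sin(x) + c[1] * x

# Synthetic data generated from the same structure with c0 = 2.0, c1 = 0.5.
xs = [i * 0.1 for i in range(-30, 31)]
ys = [2.0 * math.sin(x) + 0.5 * x for x in xs]

coeffs, loss = fit_coefficients(candidate, xs, ys, init=[1.0, 0.0])
print(coeffs, loss)
```

This example is linear in its coefficients, so the loss is convex and plain gradient descent is enough. Structures with coefficients inside nonlinear operators would likely need multiple restarts or a more robust optimizer.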