LG AI SC · May 24, 2023

RSRM: Reinforcement Symbolic Regression Machine

arXiv:2305.14656v1 · 9.87 citations

Originality: Incremental advance

AI Analysis

The paper addresses a grand challenge in symbolic regression that matters for fields such as science and engineering, though it appears incremental because it builds on existing methods to overcome specific bottlenecks.

The paper tackles the automatic discovery of parsimonious mathematical equations from limited data, a symbolic regression problem that is especially hard when the search space is effectively infinite and the target formulas are intricate. It proposes RSRM, which is reported to achieve state-of-the-art performance on benchmark examples.

In nature, the behaviors of many complex systems can be described by parsimonious math equations. Automatically distilling these equations from limited data is cast as a symbolic regression process which hitherto remains a grand challenge. Keen efforts in recent years have been placed on tackling this issue and demonstrated success in symbolic regression. However, there still exist bottlenecks that current methods struggle to break when the discrete search space tends toward infinity and especially when the underlying math formula is intricate. To this end, we propose a novel Reinforcement Symbolic Regression Machine (RSRM) that masters the capability of uncovering complex math equations from only scarce data. The RSRM model is composed of three key modules: (1) a Monte Carlo tree search (MCTS) agent that explores optimal math expression trees consisting of pre-defined math operators and variables, (2) a Double Q-learning block that helps reduce the feasible search space of MCTS via properly understanding the distribution of reward, and (3) a modulated sub-tree discovery block that heuristically learns and defines new math operators to improve representation ability of math expression trees. Binding of these modules yields the state-of-the-art performance of RSRM in symbolic regression as demonstrated by multiple sets of benchmark examples. The RSRM model shows clear superiority over several representative baseline models.
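
To make the idea of searching over math expression trees more concrete, here is a minimal Python sketch of how a candidate expression could be represented, evaluated, and scored against data. It is not the RSRM implementation: the prefix-token encoding, the `OPERATORS` table, the helper names `evaluate` and `reward`, and the reward form 1 / (1 + RMSE) are all assumptions chosen for illustration. A search agent such as MCTS would propose token sequences and use a score of this kind as feedback.

```python
import math

# Hypothetical operator table (an assumption, not the paper's operator set):
# name -> (arity, function).
OPERATORS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
}


def evaluate(tokens, x):
    """Evaluate a prefix-notation expression tree at a scalar input x."""

    def walk(pos):
        token = tokens[pos]
        if token == "x":  # the single input variable
            return x, pos + 1
        if token in OPERATORS:
            arity, fn = OPERATORS[token]
            args = []
            pos += 1
            for _ in range(arity):  # recursively evaluate each child subtree
                value, pos = walk(pos)
                args.append(value)
            return fn(*args), pos
        return float(token), pos + 1  # anything else is a numeric constant

    value, end = walk(0)
    if end != len(tokens):
        raise ValueError("expression has trailing tokens")
    return value


def reward(tokens, xs, ys):
    """Assumed reward: 1 / (1 + RMSE), so a perfect fit scores 1.0."""
    mse = sum((evaluate(tokens, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return 1.0 / (1.0 + math.sqrt(mse))


# Scarce data sampled from the target y = x*x + sin(x).
xs = [0.0, 0.5, 1.0, 1.5]
ys = [x * x + math.sin(x) for x in xs]

exact = ["add", "mul", "x", "x", "sin", "x"]  # x*x + sin(x)
rough = ["mul", "x", "x"]                     # x*x only
print(reward(exact, xs, ys), reward(rough, xs, ys))
```

The exact expression reproduces the data and receives the maximum reward, while the incomplete one scores lower. That difference is the signal a tree search would use to rank candidates.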
