Active Learning in Symbolic Regression with Physical Constraints
This work addresses data efficiency for researchers using symbolic regression in scientific domains. The contribution is incremental, since it builds on existing methods such as query by committee.
The paper reduces the data requirements of symbolic regression by integrating active learning with physical constraints, achieving state-of-the-art results in rediscovering known equations from less data.
Evolutionary symbolic regression (SR) fits a symbolic equation to data, yielding a concise, interpretable model. We explore using SR to propose which data to gather in an active learning setting with physical constraints, that is, which experiments to run next. Active learning is done with query by committee, where the equations on the Pareto frontier form the committee. The physical constraints improve the proposed equations in very low data settings. Together, these approaches reduce the data required for SR and achieve state-of-the-art results in the amount of data required to rediscover known equations.
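The query-by-committee step can be illustrated with a minimal sketch. It is not the paper's implementation: the committee is three hand-written functions standing in for equations on an SR Pareto frontier, the name `select_next_query` is hypothetical, and disagreement is measured as the variance of the committee's predictions, which is one common choice assumed here.

```python
import numpy as np


def select_next_query(committee, candidates):
    """Pick the candidate input where the committee disagrees most.

    committee  : list of callables, each mapping an array of inputs to predictions
                 (assumed stand-ins for Pareto-frontier equations).
    candidates : 1-D array of inputs that could be measured next.
    Returns the index of the chosen candidate and the per-candidate disagreement.
    """
    # Stack predictions into shape (n_models, n_candidates).
    predictions = np.array([model(candidates) for model in committee])
    # Assumed disagreement measure: variance across committee members.
    disagreement = predictions.var(axis=0)
    return int(np.argmax(disagreement)), disagreement


# Hypothetical committee: three candidate equations of differing form.
committee = [
    lambda x: 2.0 * x,
    lambda x: x ** 2,
    lambda x: 2.0 * np.sin(x),
]

# Candidate experiments: seven evenly spaced inputs on [0, 3].
candidates = np.linspace(0.0, 3.0, 7)

best_index, disagreement = select_next_query(committee, candidates)
next_x = candidates[best_index]
print(f"Next experiment at x = {next_x}")
```

In this toy setup all three equations agree at x = 0 and diverge most at the upper end of the range, so the next measurement is requested there. In an active learning loop, the new data point would be added to the training set, SR rerun, and the committee rebuilt from the updated Pareto frontier.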