Are you sure? A Comprehensive and Comprehensible Survey of Uncertainty Quantification in Symbolic Regression
For researchers and practitioners using symbolic regression, this survey identifies a critical gap in reliability and decision-making support, but it is an incremental review rather than a novel method.
This survey addresses the lack of uncertainty quantification (UQ) in symbolic regression (SR), reviewing current UQ methods (frequentist, Bayesian, model selection) and highlighting that UQ in SR is underexplored, motivating further research.
Symbolic regression (SR) is a class of methods that systematically explore the space of mathematical functions to discover models that accurately capture the underlying relationships in a dataset. Despite recent advances in the field, a lack of support for uncertainty quantification (UQ) limits its adoption in real-world decision processes. In regression analysis, UQ provides important information about model reliability, which can help to avoid overfitting by accounting for uncertainty in the data and can inform decision-making. This survey is the first to clearly address this issue, with the objective of introducing essential UQ concepts and reviewing the current literature on UQ in SR, which can be broadly organized into three research directions: frequentist, Bayesian, and model selection. Despite its importance, UQ in SR remains underexplored, which motivates further research into reliable UQ methods for SR.
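To make the frequentist direction concrete, the following is a minimal sketch of one generic technique, a case-resampling bootstrap, applied to a symbolic expression whose structure is held fixed. It is not taken from the survey. The expression y = a*x**2 + b*sin(x), the synthetic data, and the function names `features`, `fit`, and `bootstrap_interval` are all assumptions made for illustration. Because only the coefficients are refitted, the resulting band reflects parameter uncertainty and ignores uncertainty about the expression structure itself.

```python
import numpy as np

def features(x):
    # Hypothetical expression structure, assumed to come from an SR run:
    # y = a * x**2 + b * sin(x). Only a and b are refitted below.
    return np.column_stack([x**2, np.sin(x)])

def fit(x, y):
    # Least-squares estimate of the coefficients (a, b).
    coef, *_ = np.linalg.lstsq(features(x), y, rcond=None)
    return coef

def bootstrap_interval(x, y, x_new, n_boot=500, alpha=0.05, seed=0):
    # Percentile bootstrap: resample (x, y) pairs with replacement,
    # refit the coefficients, and collect predictions at x_new.
    rng = np.random.default_rng(seed)
    preds = np.empty((n_boot, len(x_new)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))
        preds[b] = features(x_new) @ fit(x[idx], y[idx])
    lower = np.quantile(preds, alpha / 2, axis=0)
    upper = np.quantile(preds, 1 - alpha / 2, axis=0)
    return lower, upper

# Synthetic data generated from the assumed expression plus Gaussian noise.
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=200)
y = 0.5 * x**2 + 2.0 * np.sin(x) + rng.normal(0, 0.3, size=200)

x_new = np.array([-2.0, 1.0, 2.0])
point = features(x_new) @ fit(x, y)
lower, upper = bootstrap_interval(x, y, x_new)

for xi, p, lo, hi in zip(x_new, point, lower, upper):
    print(f"x={xi:+.1f}  fit={p:.3f}  95% band=[{lo:.3f}, {hi:.3f}]")
```

The band here is a confidence interval for the fitted curve, not a prediction interval for new observations. Covering structural uncertainty would additionally require rerunning the expression search on each resample, which this sketch deliberately leaves out.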