LG · Sep 28, 2021

Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication

arXiv:2109.13895v1 · 33 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable and trustworthy system identification in industrial applications. It is an incremental advance: it builds on existing symbolic regression techniques and adds deterministic improvements.

The authors address symbolic regression in industrial scenarios that require interpretable and robust models. They introduce a deterministic algorithm that exhaustively searches the model space, which is kept finite through syntactical constraints and semantic deduplication. The resulting method is competitive with genetic programming on many noiseless benchmark problems and produces simple, reliable, and reproducible models.

Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge about model structure is available. Such scenarios often require specific model properties, such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic programming for symbolic regression. In this chapter we introduce a deterministic symbolic regression algorithm specifically designed to address these issues. The algorithm uses a context-free grammar to produce models that are parameterized by a non-linear least squares local optimization procedure. A finite enumeration of all possible models is guaranteed by structural restrictions as well as a caching mechanism for detecting semantically equivalent solutions. Enumeration order is established via heuristics designed to improve search efficiency. Empirical tests on a comprehensive benchmark suite show that our approach is competitive with genetic programming in many noiseless problems while maintaining desirable properties such as simple, reliable models and reproducibility.
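
To make the enumeration and caching idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy grammar (`Expr -> Var | Expr + Expr | Expr * Expr`), the depth limit standing in for the structural restrictions, and the helper names `enumerate_expressions`, `canonical` and `unique_structures` are all assumptions chosen for illustration. It shows one plausible way to detect equivalent structures, namely by flattening and sorting the operands of commutative operators and caching the resulting canonical form. Parameter fitting by non-linear least squares and the search-order heuristics are left out.

```python
from itertools import product

# Hypothetical toy setup: two input variables and two commutative operators.
VARIABLES = ("x", "y")
OPERATORS = ("+", "*")


def enumerate_expressions(max_depth):
    """Yield every expression tree of at most max_depth levels.

    A tree is either a variable name (str) or a tuple (op, left, right).
    The depth limit keeps the enumeration finite.
    """
    if max_depth == 0:
        return
    yield from VARIABLES
    smaller = list(enumerate_expressions(max_depth - 1))
    for op in OPERATORS:
        for left, right in product(smaller, repeat=2):
            yield (op, left, right)


def canonical(expr):
    """Return a canonical form that is identical for trees differing only
    in operand order or nesting of the same commutative operator."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    operands = []
    for child in (canonical(left), canonical(right)):
        if isinstance(child, tuple) and child[0] == op:
            operands.extend(child[1:])  # flatten (a + b) + c into +(a, b, c)
        else:
            operands.append(child)
    return (op, *sorted(operands, key=repr))


def unique_structures(max_depth):
    """Enumerate expressions, keeping only the first tree per canonical form."""
    seen = set()  # cache of canonical forms already visited
    kept = []
    for expr in enumerate_expressions(max_depth):
        key = canonical(expr)
        if key not in seen:
            seen.add(key)
            kept.append(expr)
    return kept


if __name__ == "__main__":
    raw = list(enumerate_expressions(2))
    print(len(raw), "enumerated,", len(unique_structures(2)), "kept")
```

With a depth limit of 2 this toy grammar yields 10 trees, of which 8 survive deduplication, because `y + x` and `y * x` collapse onto `x + y` and `x * y`. The sketch only catches commutativity and associativity. It does not notice, for example, that `x + x` becomes equivalent to `x` once a fitted coefficient is attached, so a realistic cache would need a stronger notion of equivalence.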

