LG · Jan 13, 2017

Symbolic Regression Algorithms with Built-in Linear Regression

arXiv:1701.03641v3 · 11 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need, among machine learning researchers and practitioners, for a systematic evaluation of faster symbolic regression methods. Its contribution is incremental, since it compares existing algorithms.

The paper systematically compares several symbolic regression algorithms that incorporate linear regression (GPTIPS, FFX, EFS) on synthetic and real-world benchmarks, finding they can be faster than traditional genetic programming methods, and relates their performance to conventional machine learning algorithms like multiple regression, random forests, and support vector regression.

Recently, several algorithms for symbolic regression (SR) have emerged that employ a form of multiple linear regression (LR) to produce generalized linear models. The use of LR allows the algorithms to create models with relatively small error right from the beginning of the search; such algorithms are thus claimed to be faster, sometimes by orders of magnitude, than SR algorithms based on vanilla genetic programming. However, a systematic comparison of these algorithms on a common set of problems is still missing. In this paper we conceptually and experimentally compare several representatives of such algorithms (GPTIPS, FFX, and EFS). They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic and real-world SR benchmark problems. Their performance is also related to the performance of three conventional machine learning algorithms: multiple regression, random forests, and support vector regression.
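
The shared idea behind these methods, a linear model fitted over nonlinear basis functions, can be illustrated with a minimal sketch. It is not the implementation of GPTIPS, FFX, or EFS: the candidate basis functions are fixed by hand here (the actual algorithms search for or generate them), and the helper name `fit_generalized_linear_model`, the synthetic target, and the basis set are all assumptions made for illustration.

```python
import numpy as np


def fit_generalized_linear_model(X, y, basis_functions):
    """Fit y ~ b0 + sum_j b_j * f_j(X) by ordinary least squares.

    basis_functions is a list of (name, callable) pairs; each callable maps
    the input matrix X to one feature column. Hypothetical helper, not an
    API of any of the compared tools.
    """
    # Design matrix: an intercept column plus one column per basis function.
    columns = [np.ones(len(y))] + [f(X) for _, f in basis_functions]
    design = np.column_stack(columns)
    # The linear regression step: coefficients come from least squares,
    # so they do not have to be evolved or tuned by the search itself.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    return coeffs, rmse


# Synthetic, noise-free target (an assumption for this example).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
y = 1.5 + 2.0 * X[:, 0] * X[:, 1] - 0.5 * np.sin(X[:, 0])

# Hand-picked candidate features; the last one is irrelevant to the target.
basis_functions = [
    ("x0*x1", lambda X: X[:, 0] * X[:, 1]),
    ("sin(x0)", lambda X: np.sin(X[:, 0])),
    ("x1^2", lambda X: X[:, 1] ** 2),
]

coeffs, rmse = fit_generalized_linear_model(X, y, basis_functions)
for name, c in zip(["1"] + [n for n, _ in basis_functions], coeffs):
    print(f"{name}: {c:.4f}")
print(f"RMSE: {rmse:.2e}")
```

Because the coefficients are solved in closed form, any set of candidate features immediately yields the best linear combination of those features, which is one plausible reading of why such models have small error early in the search.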
