LG, AI, IT, COMP-PH, ML | Jun 18, 2020

AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

arXiv:2006.10782v2 | 251 citations
Originality: Incremental advance
AI Analysis

This work improves symbolic regression for scientific discovery and data analysis, though it reads as an incremental enhancement of prior work.

The paper tackles symbolic regression with a method that searches for formulas that are Pareto-optimal in accuracy and complexity. According to the abstract, the method is typically orders of magnitude more robust to noise and bad data than the previous state of the art, and it discovers many formulas that stumped previous methods.
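
To make "Pareto-optimal in accuracy and complexity" concrete, here is a minimal sketch in Python. It is not the paper's code: the `Candidate` type, the complexity scores, and the loss values are all made-up illustrations. A formula stays on the frontier only if no other candidate is at least as simple and at least as accurate.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    formula: str
    complexity: float  # assumed measure, e.g. description length
    loss: float        # assumed measure, e.g. mean squared error


def pareto_frontier(candidates):
    """Return the non-dominated candidates, ordered by increasing complexity."""
    frontier = []
    best_loss = float("inf")
    # Walk from simplest to most complex; keep a formula only if it is
    # strictly more accurate than every formula that is no more complex.
    for c in sorted(candidates, key=lambda c: (c.complexity, c.loss)):
        if c.loss < best_loss:
            frontier.append(c)
            best_loss = c.loss
    return frontier


# Hypothetical candidates with invented scores, for illustration only.
candidates = [
    Candidate("x", 1, 4.0),
    Candidate("x**2", 3, 1.5),
    Candidate("x + 1", 3, 2.0),         # same complexity as x**2, less accurate
    Candidate("sin(x)", 4, 3.0),        # more complex and less accurate than x**2
    Candidate("x**2 + 0.5*x", 7, 0.1),
    Candidate("x**3 - x", 8, 0.4),      # more complex and less accurate than the previous one
]

for c in pareto_frontier(candidates):
    print(c.formula, c.complexity, c.loss)
```

Rather than a single "best" formula, the result is the whole frontier, so a reader can choose the trade-off between simplicity and fit.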

We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data, and also by discovering many formulas that stumped previous methods. We develop a method for discovering generalized symmetries (arbitrary modularity in the computational graph of a formula) from gradient properties of a neural network fit. We use normalizing flows to generalize our symbolic regression method to probability distributions from which we only have samples, and employ statistical hypothesis testing to accelerate robust brute-force search.
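
As a rough illustration of what a "gradient property" that reveals modularity can look like, consider a function of the form f(x1, x2, x3) = g(h(x1, x2), x3). By the chain rule, the direction of its gradient with respect to (x1, x2) does not depend on x3, as long as the derivative of g with respect to h keeps its sign. The sketch below checks this numerically. It is an assumption-laden toy, not the paper's procedure: it differentiates a known function by finite differences instead of a neural network fit, and the helper names `grad_subset` and `direction_spread`, the sampling range, and the test functions are all hypothetical.

```python
import numpy as np


def grad_subset(f, x, idx, eps=1e-5):
    """Central-difference gradient of f at x with respect to the coordinates in idx."""
    g = np.zeros(len(idx))
    for k, i in enumerate(idx):
        step = np.zeros_like(x)
        step[i] = eps
        g[k] = (f(x + step) - f(x - step)) / (2 * eps)
    return g


def direction_spread(f, idx, other, n_points=50, n_other=5, seed=0):
    """Largest change in the normalized gradient over idx when only `other` varies."""
    rng = np.random.default_rng(seed)
    n_vars = len(idx) + len(other)
    worst = 0.0
    for _ in range(n_points):
        x = rng.uniform(1.0, 2.0, size=n_vars)
        dirs = []
        for _ in range(n_other):
            # Resample only the "other" variables, keeping the idx variables fixed.
            x[other] = rng.uniform(1.0, 2.0, size=len(other))
            g = grad_subset(f, x, idx)
            dirs.append(g / np.linalg.norm(g))
        dirs = np.array(dirs)
        worst = max(worst, float(np.abs(dirs - dirs[0]).max()))
    return worst


# f = g(h(x0, x1), x2) with h = x0 * x1**2: x0 and x1 form a module.
def modular(x):
    return (x[0] * x[1] ** 2 + 1.0) * np.exp(x[2])


# No such module exists here: the gradient direction over (x0, x1) depends on x2.
def non_modular(x):
    return x[0] * x[2] + x[1]


print("modular spread:    ", direction_spread(modular, [0, 1], [2]))
print("non-modular spread:", direction_spread(non_modular, [0, 1], [2]))
```

A spread near zero suggests that the chosen variables enter the formula only through a single intermediate quantity, which could then be searched for as a separate, simpler subproblem.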

Code Implementations: 2 repos
