OccamNet: A Fast Neural Model for Symbolic Regression at Scale
This work addresses the need for interpretable, efficient symbolic regression in scientific data analysis, proposing a new method for a known bottleneck.
The authors tackle the problem that neural networks are black-box models that extrapolate poorly. They introduce OccamNet, a neural model for symbolic regression that finds interpretable, compact symbolic fits to data. According to the abstract, it can outperform state-of-the-art symbolic regression methods on real-world regression datasets, fits complicated functions in minutes on a single CPU, and scales on a GPU.
Neural networks' expressiveness comes at the cost of complex, black-box models that often extrapolate poorly beyond the domain of the training dataset, conflicting with the goal of finding compact analytic expressions to describe scientific data. We introduce OccamNet, a neural network model that finds interpretable, compact, and sparse symbolic fits to data, à la Occam's razor. Our model defines a probability distribution over functions with efficient sampling and function evaluation. We train by sampling functions and biasing the probability mass toward better-fitting solutions, backpropagating using cross-entropy matching in a reinforcement-learning loss. OccamNet can identify symbolic fits for a variety of problems, including analytic and non-analytic functions, implicit functions, and simple image classification, and can outperform state-of-the-art symbolic regression methods on real-world regression datasets. Our method requires a minimal memory footprint, fits complicated functions in minutes on a single CPU, and scales on a GPU.
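
The training idea in the abstract (sample functions from a distribution, then shift probability mass toward the better-fitting samples with a cross-entropy update) can be illustrated with a toy sketch. This is not the authors' implementation: the flat candidate list `CANDIDATES`, the top-k selection rule, and the names `fit`, `softmax`, and `mse` are assumptions made for illustration, whereas OccamNet itself defines its distribution over compositions of primitive functions inside a network.

```python
import math
import random

# Hypothetical candidate library (an assumption for this sketch only).
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mse(f, xs, ys):
    """Mean squared error of candidate f on the data."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, steps=200, n_samples=16, top_k=4, lr=0.5, seed=0):
    rng = random.Random(seed)
    names = list(CANDIDATES)
    errors = [mse(CANDIDATES[name], xs, ys) for name in names]
    logits = [0.0] * len(names)  # start from a uniform distribution
    for _ in range(steps):
        probs = softmax(logits)
        # Sample candidate functions from the current distribution.
        drawn = rng.choices(range(len(names)), weights=probs, k=n_samples)
        # Keep the top_k best-fitting samples (assumed selection rule).
        best = sorted(drawn, key=lambda i: errors[i])[:top_k]
        # Their empirical distribution is the cross-entropy target.
        target = [best.count(i) / top_k for i in range(len(names))]
        # d/dlogits of cross-entropy(target, softmax(logits)) = probs - target.
        logits = [z - lr * (p - t) for z, p, t in zip(logits, probs, target)]
    probs = softmax(logits)
    best_index = max(range(len(names)), key=lambda i: probs[i])
    return names[best_index], probs
```

Calling `fit` on data generated from one of the candidates, for example `xs = [i / 10 for i in range(-20, 21)]` with `ys = [x * x for x in xs]`, concentrates the distribution on the matching expression. Only the sampled functions influence each update, which loosely mirrors the sampling-based training the abstract describes.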