LG · Dec 7, 2023

A Transformer Model for Symbolic Regression towards Scientific Discovery

arXiv:2312.04070v2 · 8 citations · h-index: 12
AI Analysis

This work addresses the efficiency problem in symbolic regression for researchers in scientific fields, though it appears incremental as it builds on existing Transformer architectures.

The authors address the computational expense of symbolic regression algorithms by proposing a new Transformer model aimed at scientific discovery applications. The model achieves state-of-the-art results on the SRSD datasets, as measured by the normalized tree-based edit distance.

Symbolic Regression (SR) searches for mathematical expressions that best describe numerical datasets. This makes it possible to circumvent the interpretation issues inherent to artificial neural networks, but SR algorithms are often computationally expensive. This work proposes a new Transformer model for Symbolic Regression, with a particular focus on its application to Scientific Discovery. We propose three encoder architectures of increasing flexibility, at the cost of violating column-permutation equivariance. Training results indicate that the most flexible architecture is required to prevent overfitting. Once trained, we apply our best model to the SRSD datasets (Symbolic Regression for Scientific Discovery datasets), which yields state-of-the-art results in terms of the normalized tree-based edit distance, at no extra computational cost.
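To make the evaluation metric concrete, here is a minimal sketch of a normalized tree-based edit distance between a predicted and a reference expression tree. It assumes a common definition that the abstract does not spell out: the ordered tree edit distance with unit costs, divided by the number of nodes in the reference tree and capped at 1. The helper names (`node`, `size`, `forest_distance`, `normalized_edit_distance`) are hypothetical and are not taken from the paper or its code.

```python
from functools import lru_cache

# Assumption: an expression tree is (label, children), where children is a
# tuple of trees. A forest is a tuple of trees. All names are illustrative.
def node(label, *children):
    return (label, tuple(children))

def size(forest):
    """Number of nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Ordered edit distance between two forests, unit cost per operation."""
    if not f1:
        return size(f2)  # insert every node of f2
    if not f2:
        return size(f1)  # delete every node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        # delete the rightmost root of f1; its children take its place
        forest_distance(f1[:-1] + kids1, f2) + 1,
        # insert the rightmost root of f2
        forest_distance(f1, f2[:-1] + kids2) + 1,
        # match the two rightmost roots, relabelling if they differ
        forest_distance(f1[:-1], f2[:-1])
        + forest_distance(kids1, kids2)
        + (label1 != label2),
    )

def normalized_edit_distance(predicted, reference):
    """Assumed normalization: min(1, distance / node count of the reference)."""
    d = forest_distance((predicted,), (reference,))
    return min(1.0, d / size((reference,)))

# Reference F = m * a versus a prediction F = m + a (one relabelled node).
reference = node("mul", node("m"), node("a"))
predicted = node("add", node("m"), node("a"))
score = normalized_edit_distance(predicted, reference)
```

In this example a single relabelling separates the two trees and the reference has three nodes, so `score` is 1/3. A score of 0 means the predicted tree matches the reference exactly, and the cap at 1 keeps very large wrong predictions from dominating an average.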
