LG | AI | SC | Jan 15, 2023

Symbolic expression generation via Variational Auto-Encoder

arXiv:2301.06064v1 | 9 citations | h-index: 105
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable solutions in fields such as physics and biology. It offers a new symbolic regression method whose main contribution is an incremental improvement in performance under noise.

The authors tackle the problem of generating interpretable symbolic expressions from data. They propose a variational autoencoder framework that outperforms existing methods under noisy conditions: it achieves a 65% recovery rate on the Nguyen dataset with 10% noise, which is 20% better than the previously reported state of the art.

There are many problems in physics, biology, and other natural sciences in which symbolic regression can provide valuable insights and discover new laws of nature. Widely used deep neural networks do not provide interpretable solutions, whereas symbolic expressions give a clear relation between observations and the target variable. However, at the moment, there is no dominant solution for the symbolic regression task, and we aim to reduce this gap with our algorithm. In this work, we propose a novel deep learning framework for symbolic expression generation via a variational autoencoder (VAE). In a nutshell, we suggest using a VAE to generate mathematical expressions, and our training strategy forces the generated formulas to fit a given dataset. Our framework allows encoding a priori knowledge of the formulas into fast-check predicates that speed up the optimization process. We compare our method to modern symbolic regression benchmarks and show that it outperforms the competitors under noisy conditions. The recovery rate of SEGVAE is 65% on the Nguyen dataset with a noise level of 10%, which is better than the previously reported SOTA by 20%. We demonstrate that this value depends on the dataset and can be even higher.
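The idea of a fast-check predicate can be illustrated with a small sketch. This is not the SEGVAE implementation: the VAE sampler is replaced by a hard-coded candidate list, and the token set, the prefix-notation encoding, and the names `fast_check`, `is_valid_prefix`, `evaluate`, and `mse` are all assumptions made for illustration. It only shows how a cheap structural test might discard candidates before the more expensive step of scoring them against noisy data.

```python
import math
import random

# Hypothetical token set; expressions are written in prefix notation.
ARITY = {"add": 2, "mul": 2, "sin": 1, "cos": 1, "x": 0, "1": 0}

def is_valid_prefix(tokens):
    """Return True if tokens form exactly one complete prefix expression."""
    need = 1  # number of sub-expressions still required
    for tok in tokens:
        if need == 0 or tok not in ARITY:
            return False
        need += ARITY[tok] - 1
    return need == 0

def fast_check(tokens, max_len=15):
    """Cheap predicate applied before any numeric fitting.

    The prior knowledge encoded here (bounded length, must depend on x)
    is an illustrative assumption, not taken from the paper.
    """
    return len(tokens) <= max_len and "x" in tokens and is_valid_prefix(tokens)

def evaluate(tokens, x):
    """Evaluate a valid prefix expression at the point x."""
    def walk(i):
        tok = tokens[i]
        if tok == "x":
            return x, i + 1
        if tok == "1":
            return 1.0, i + 1
        a, i = walk(i + 1)
        if ARITY[tok] == 1:
            return (math.sin(a) if tok == "sin" else math.cos(a)), i
        b, i = walk(i)
        return (a + b if tok == "add" else a * b), i
    value, _ = walk(0)
    return value

def mse(tokens, xs, ys):
    """Mean squared error of the expression on the dataset."""
    return sum((evaluate(tokens, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Noisy samples of the target x^2 + x (Gaussian noise, fixed seed).
rng = random.Random(0)
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x + x + rng.gauss(0, 0.1) for x in xs]

# Stand-in for expressions sampled from a generative model.
candidates = [
    ["add", "mul", "x", "x", "x"],  # x*x + x
    ["sin", "x"],                   # sin(x)
    ["add", "1", "1"],              # constant: rejected, no x
    ["mul", "x"],                   # incomplete: rejected
]

survivors = [c for c in candidates if fast_check(c)]
best = min(survivors, key=lambda c: mse(c, xs, ys))
print(len(survivors), best)
```

Only the candidates that pass `fast_check` are ever evaluated numerically, which is where the speed-up would come from when the candidate pool is large.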

