NE · LG · Mar 15, 2022

Neural-Network-Directed Genetic Programmer for Discovery of Governing Equations

arXiv:2203.08808v1 · 6 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the discovery of governing equations from experimental data, with examples drawn from ligand-receptor kinetics and omics data. The advance is incremental: it builds on existing evolutionary and symbolic regression methods.

The authors extract governing mathematical expressions from observed data with a symbolic regression framework called faiGP, which combines a genetic programmer with neural networks to generate symbolically equivalent expressions or approximations. They also quantify how regularizers, including a diversity metric and a complexity measure, affect the framework's performance.

We develop a symbolic regression framework for extracting the governing mathematical expressions from observed data. The evolutionary approach, faiGP, is designed to leverage the properties of a function algebra that have been encoded into a grammar, providing a theoretical guarantee of universal approximation and a way to minimize bloat. In this framework, the choice of operators of the grammar may be informed by a physical theory or symmetry considerations. Since there is currently no theory that can derive the 'constants of nature', an empirical investigation into extracting these coefficients from an evolutionary process is of methodological interest. We quantify the impact of different types of regularizers, including a diversity metric adapted from studies of the transcriptome and a complexity measure, on the performance of the framework. Our implementation, which leverages neural networks and a genetic programmer, generates non-trivial symbolically equivalent expressions ("Ramanujan expressions") or approximations with potentially interesting numerical applications. To illustrate the framework, we present a model of ligand-receptor binding kinetics, including an account of gene regulation by transcription factors, and a model of the regulatory range of the cistrome from omics data. This study has important implications for the development of data-driven methodologies for the discovery of governing equations in experimental data derived from new sensing systems and high-throughput screening technologies.
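To make the general idea concrete, the following is a minimal sketch of symbolic regression by genetic programming with a complexity regularizer. It is not the faiGP implementation, and it includes no neural-network guidance. The operator set, the terminals, the mutation scheme, the penalty weight `lam`, and the use of Shannon entropy as a stand-in diversity measure are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math
import random

# Hypothetical grammar: operators and terminals picked for illustration only.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth=3):
    """Grow a random expression tree as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def size(tree):
    """Node count, used here as a simple complexity measure."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def fitness(tree, data, lam=0.01):
    """Mean squared error plus a complexity penalty (lower is better)."""
    mse = sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)
    if not math.isfinite(mse):  # overflowed candidates are never selected
        return float("inf")
    return mse + lam * size(tree)

def mutate(tree, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth=2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def diversity(population):
    """Shannon entropy over distinct expressions (an assumed stand-in metric)."""
    counts = {}
    for tree in population:
        counts[repr(tree)] = counts.get(repr(tree), 0) + 1
    n = len(population)
    return -sum(c / n * math.log(c / n) for c in counts.values())

def evolve(data, generations=40, pop_size=60, seed=0):
    """Truncation selection with mutation only; the best half always survives."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, data))
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda t: fitness(t, data)), diversity(population)

# Synthetic target y = x^2 + x, sampled on a small grid.
data = [(float(x), float(x * x + x)) for x in range(-5, 6)]
best, div = evolve(data)
print("best expression:", best)
print("penalized fitness:", fitness(best, data), "population entropy:", div)
```

The penalty term `lam * size(tree)` is what discourages bloat in this sketch: two trees with the same error are ranked by node count. Restricting `OPS` to operators suggested by a physical theory would be the analogue of the grammar choice described above.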

