LG, ML · Nov 23, 2020

Conjecturing-Based Discovery of Patterns in Data

arXiv:2011.11576v4
Originality: Incremental advance
AI Analysis

This work provides a new method for discovering underlying feature relationships in data, which could be beneficial for researchers in various domains seeking to understand complex datasets.

This paper introduces a conjecturing machine that identifies feature relationships as nonlinear bounds for numerical features and boolean expressions for categorical features. The framework successfully recovers known relationships and suggests risk factors for COVID-19 outcomes that are confirmed in medical literature.

We propose the use of a conjecturing machine that suggests feature relationships in the form of bounds involving nonlinear terms for numerical features and boolean expressions for categorical features. The proposed Conjecturing framework recovers known nonlinear and boolean relationships among features from data. In both settings, true underlying relationships are revealed. We then compare the method to a previously proposed framework for symbolic regression on the ability to recover equations that are satisfied among features in a dataset. Finally, the framework is applied to patient-level data on COVID-19 outcomes to suggest possible risk factors that are confirmed in the medical literature.

Code Implementations: 1 repo
