LG · Nov 6, 2022

Synthetic Data for Feature Selection

arXiv:2211.03035v1 · 4 citations · h-index: 11
Originality: Synthesis-oriented
AI Analysis

The datasets give researchers a common reference point for evaluating feature selection algorithms, but the contribution is incremental, since it applies existing methods to new data.

The paper introduces a collection of synthetic datasets for feature selection, based on electronics applications to mimic real scenarios, and demonstrates their utility by testing popular algorithms, with the datasets made publicly available on GitHub.

Feature selection is an important and active field of research in machine learning and data science. Our goal in this paper is to propose a collection of synthetic datasets that can be used as a common reference point for feature selection algorithms. Synthetic datasets allow for precise evaluation of selected features and control of the data parameters for comprehensive assessment. The proposed datasets are based on applications from electronics in order to mimic real-life scenarios. To illustrate the utility of the proposed data, we employ one of the datasets to test several popular feature selection algorithms. The datasets are made publicly available on GitHub and can be used by researchers to evaluate feature selection algorithms.
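To make the "precise evaluation" point concrete, here is a minimal sketch of how a synthetic dataset with known relevant features lets a selector be scored exactly. Everything in it is an assumption for illustration: the toy logic function `(x0 AND x1) OR x2`, the helper names, and the simple correlation filter are not taken from the paper or its GitHub repository.

```python
import random
from statistics import mean, pstdev


def make_dataset(n_samples, n_features, seed=0):
    """Hypothetical toy data: binary features, only columns 0, 1, 2 drive the label."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    # Label is a small logic expression; all other columns are pure noise.
    y = [(row[0] & row[1]) | row[2] for row in X]
    relevant = {0, 1, 2}  # ground truth, known because the data is synthetic
    return X, y, relevant


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; returns 0.0 for a constant column."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(a - mx) * (b - my) for a, b in zip(xs, ys)])
    return abs(cov / (sx * sy))


def select_top_k(X, y, k):
    """Simple filter method: keep the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def precision_recall(selected, relevant):
    """Score a selection against the known relevant set."""
    true_positives = len(selected & relevant)
    return true_positives / len(selected), true_positives / len(relevant)


X, y, relevant = make_dataset(n_samples=2000, n_features=10)
selected = select_top_k(X, y, k=3)
precision, recall = precision_recall(selected, relevant)
print(f"selected={sorted(selected)} precision={precision:.2f} recall={recall:.2f}")
```

Because the relevant set is fixed by construction, precision and recall of the selection are exact rather than estimated, and parameters such as the number of noise features or samples can be varied to stress a selector. With real data this ground truth is generally unavailable.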

Code Implementations: 1 repo
