NE, LG, ML | Mar 20, 2017

Ensemble representation learning: an analysis of fitness and survival for wrapper-based genetic programming methods

arXiv:1703.06934v3 | 13 citations
Originality: Incremental advance
AI Analysis

This work addresses feature engineering for classification problems, offering incremental improvements in wrapper-based genetic programming methods.

The authors adapted an ensemble-based feature engineering wrapper (FEW) for supervised classification, analyzing fitness and survival methods, and found that specific fitness metrics and ε-lexicase survival outperformed common alternatives, with FEW improving classifier performance in several cases.

Recently we proposed a general, ensemble-based feature engineering wrapper (FEW) that was paired with a number of machine learning methods to solve regression problems. Here, we adapt FEW for supervised classification and perform a thorough analysis of fitness and survival methods within this framework. Our tests demonstrate that two fitness metrics, one introduced as an adaptation of the silhouette score, outperform the more commonly used Fisher criterion. We analyze survival methods and demonstrate that $ε$-lexicase survival works best across our test problems, followed by random survival, which outperforms both tournament and deterministic crowding. We conduct a benchmark comparison to several classification methods using a large set of problems and show that FEW can improve the best classifier performance in several cases. We show that FEW generates consistent, meaningful features for a biomedical problem with different ML pairings.
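
To make the idea of $ε$-lexicase survival more concrete, here is a minimal Python sketch. It is not the FEW implementation. It assumes each candidate feature already has a vector of per-case error values (lower is better), it sets ε to the median absolute deviation of the errors on each case among the candidates still in contention, and it turns selection into survival by drawing winners without replacement. The function names (`mad`, `epsilon_lexicase_pick`, `epsilon_lexicase_survival`) and the `seed` parameter are hypothetical, and the way FEW defines per-case errors for classification is not specified here.

```python
import random
import statistics


def mad(values):
    """Median absolute deviation of a list of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def epsilon_lexicase_pick(errors, candidates, rng):
    """Pick one candidate index by filtering on cases in random order.

    errors[i][c] is the (assumed) error of candidate i on case c.
    """
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)
    pool = list(candidates)
    for c in cases:
        if len(pool) == 1:
            break
        case_errors = [errors[i][c] for i in pool]
        # Assumption: epsilon is recomputed on the current pool for each case.
        eps = mad(case_errors)
        best = min(case_errors)
        # Keep every candidate within epsilon of the best error on this case.
        pool = [i for i in pool if errors[i][c] <= best + eps]
    # Any candidates that survive all cases are treated as tied.
    return rng.choice(pool)


def epsilon_lexicase_survival(errors, n_survivors, seed=0):
    """Choose survivors one at a time, without replacement (an assumption)."""
    rng = random.Random(seed)
    remaining = list(range(len(errors)))
    survivors = []
    while remaining and len(survivors) < n_survivors:
        winner = epsilon_lexicase_pick(errors, remaining, rng)
        survivors.append(winner)
        remaining.remove(winner)
    return survivors


if __name__ == "__main__":
    # Four hypothetical candidates scored on three cases.
    toy_errors = [[0, 0, 0], [6, 6, 6], [1, 1, 1], [9, 9, 9]]
    print(epsilon_lexicase_survival(toy_errors, n_survivors=2))
```

Because the filter only ever looks at one case at a time, a candidate that is very good on a few cases can survive even if its average error is mediocre, which is the property that distinguishes this scheme from tournament-style survival based on an aggregate score.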

Code Implementations: 1 repo