NE LG ML Mar 20, 2017

Ensemble representation learning: an analysis of fitness and survival for wrapper-based genetic programming methods

arXiv:1703.06934v34.113 citations

Originality: Incremental advance

AI Analysis

This work addresses feature engineering for classification problems, offering incremental improvements in wrapper-based genetic programming methods.

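The authors adapt an ensemble-based feature engineering wrapper (FEW) for supervised classification and analyze fitness and survival methods within it. Two fitness metrics, one adapted from the silhouette score, outperform the more commonly used Fisher criterion. Among survival methods, ε-lexicase survival performs best, followed by random survival, with tournament and deterministic crowding behind both. In a benchmark comparison, FEW improves the best classifier performance in several cases.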

Recently we proposed a general, ensemble-based feature engineering wrapper (FEW) that was paired with a number of machine learning methods to solve regression problems. Here, we adapt FEW for supervised classification and perform a thorough analysis of fitness and survival methods within this framework. Our tests demonstrate that two fitness metrics, one introduced as an adaptation of the silhouette score, outperform the more commonly used Fisher criterion. We analyze survival methods and demonstrate that $ε$-lexicase survival works best across our test problems, followed by random survival, which outperforms both tournament and deterministic crowding. We conduct a benchmark comparison to several classification methods using a large set of problems and show that FEW can improve the best classifier performance in several cases. We show that FEW generates consistent, meaningful features for a biomedical problem with different ML pairings.
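
For readers unfamiliar with ε-lexicase survival, the following is a minimal sketch of the general idea, not the FEW implementation. It assumes that each candidate feature has an error value on every training case, that ε for a case is the median absolute deviation of the population's errors on that case, and that survivors are chosen without replacement. The names `mad` and `epsilon_lexicase_survival` and the layout of the `errors` matrix are illustrative assumptions.

```python
import random
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)


def epsilon_lexicase_survival(errors, n_survivors, rng=None):
    """Choose n_survivors row indices from `errors` without replacement.

    `errors[i][j]` is the error of individual i on case j (assumed layout).
    """
    rng = rng or random.Random()
    n_cases = len(errors[0])
    # Assumption: epsilon is fixed per case, computed over the whole population.
    eps = [mad([row[j] for row in errors]) for j in range(n_cases)]

    remaining = list(range(len(errors)))
    survivors = []
    while remaining and len(survivors) < n_survivors:
        pool = remaining[:]
        cases = list(range(n_cases))
        rng.shuffle(cases)  # a fresh random case ordering for each selection
        for j in cases:
            if len(pool) == 1:
                break
            best = min(errors[i][j] for i in pool)
            # Keep individuals within epsilon of the best error on this case.
            pool = [i for i in pool if errors[i][j] <= best + eps[j]]
        winner = rng.choice(pool)  # break any remaining tie at random
        survivors.append(winner)
        remaining.remove(winner)
    return survivors
```

Calling `epsilon_lexicase_survival(errors, n_survivors=k)` returns the indices of the `k` individuals to keep. Because each selection filters on a different random ordering of cases, individuals that do well on different subsets of cases can survive together, which tends to preserve diversity.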
