ML LG Dec 30, 2019

Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles

Martin Binder, Julia Moosbauer, Janek Thomas, Bernd Bischl

arXiv:1912.12912v211.311 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for efficient and interpretable machine learning models by optimizing hyperparameters and the feature set simultaneously. Its contribution is incremental: it combines existing multi-objective methods with filter ensembles.

The paper treats joint hyperparameter tuning and feature selection as a multi-objective problem that balances predictive performance against model sparsity. It benchmarks two approaches, model-based optimization and an NSGA-II-based wrapper, and finds that model-based optimization requires fewer objective evaluations but incurs higher computational overhead.

Both feature selection and hyperparameter tuning are key tasks in machine learning. Hyperparameter tuning is often useful to increase model performance, while feature selection is undertaken to attain sparse models. Sparsity may yield better model interpretability and lower cost of data acquisition, data handling and model inference. While sparsity may have a beneficial or detrimental effect on predictive performance, a small drop in performance may be acceptable in return for a substantial gain in sparseness. We therefore treat feature selection as a multi-objective optimization task. We perform hyperparameter tuning and feature selection simultaneously because the choice of features of a model may influence what hyperparameters perform well. We present, benchmark, and compare two different approaches for multi-objective joint hyperparameter optimization and feature selection: The first uses multi-objective model-based optimization. The second is an evolutionary NSGA-II-based wrapper approach to feature selection which incorporates specialized sampling, mutation and recombination operators. Both methods make use of parameterized filter ensembles. While model-based optimization needs fewer objective evaluations to achieve good performance, it incurs computational overhead compared to the NSGA-II, so the preferred choice depends on the cost of evaluating a model on given data.
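The two ideas the abstract relies on, a performance/sparsity trade-off and a parameterized filter ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dominates`, `pareto_front`, `ensemble_select`), the weighted-sum combination of filter scores, and all numbers are assumptions chosen for illustration only. Each candidate model is represented as a pair (error, fraction of features used), with both values minimized.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: (error, fraction of features used); both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by error."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


def ensemble_select(filter_scores: Sequence[Sequence[float]],
                    weights: Sequence[float], n_keep: int) -> List[int]:
    """Assumed ensemble rule: weighted sum of per-filter scores, keep the top n_keep."""
    n_features = len(filter_scores[0])
    combined = [sum(w * scores[j] for w, scores in zip(weights, filter_scores))
                for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: combined[j], reverse=True)
    return sorted(ranked[:n_keep])


# Illustrative candidates only; these are not results from the paper.
candidates = [(0.10, 0.80), (0.12, 0.30), (0.11, 0.90), (0.20, 0.10), (0.12, 0.50)]
front = pareto_front(candidates)

# Two hypothetical filters scoring four features; the weights are tunable parameters.
scores = [[0.9, 0.1, 0.5, 0.3], [0.2, 0.8, 0.6, 0.1]]
selected = ensemble_select(scores, weights=[0.7, 0.3], n_keep=2)
```

In this sketch the ensemble weights and `n_keep` play the role of extra tunable parameters, so an optimizer could search over them together with the learner's hyperparameters. The Pareto front then lists the trade-offs from which a user picks, for example accepting a slightly higher error in return for far fewer features.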
