Pruning Techniques for Mixed Ensembles of Genetic Programming Models
This work addresses a gap in the evolutionary computation literature on building ensembles of GP models, offering incremental improvements in generalization and efficiency on complex problems.
The paper tackles the problem of building effective ensembles of Genetic Programming (GP) models by proposing a strategy that blends syntax-based and semantics-based GP individuals and introduces pruning criteria based on correlation and entropy. Experimental results suggest that these pruning criteria can improve the generalization ability of the ensemble and reduce the computational burden of building it.
The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning because they average out biases, reduce variance, and usually generalize better than single models. Despite these advantages, building ensembles of GP models is not a well-developed topic in the evolutionary computation community. To fill this gap, we propose a strategy that blends individuals produced by standard syntax-based GP with individuals produced by geometric semantic genetic programming, one of the newest semantics-based methods developed in GP. Recent literature has shown that combining syntax and semantics can improve the generalization ability of a GP model. Additionally, to improve the diversity of the GP models used to build the ensemble, we propose different pruning criteria based on correlation and on entropy, a commonly used measure in information theory. Experimental results, obtained over different complex problems, suggest that the pruning criteria based on correlation and entropy could be effective in improving the generalization ability of the ensemble model and in reducing the computational burden required to build it.
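To make the idea of correlation-based pruning concrete, the following is a minimal sketch and not the paper's actual procedure, which the abstract does not specify. It assumes each candidate model is represented only by its predictions on a common validation set, that candidates are visited in a fixed order, and that a model is discarded when its absolute Pearson correlation with any already-kept model reaches a threshold. The function names, the greedy scheme, the 0.95 threshold, and the averaging combiner are all illustrative assumptions. The entropy-based criterion is not sketched, since the abstract does not define how entropy is measured.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def prune_by_correlation(predictions, max_abs_corr=0.95):
    """Greedy pruning (assumed scheme): keep a model only if its absolute
    correlation with every already-kept model is below max_abs_corr."""
    kept = []
    for i, preds in enumerate(predictions):
        if all(abs(pearson(preds, predictions[j])) < max_abs_corr for j in kept):
            kept.append(i)
    return kept


def ensemble_predict(predictions, kept):
    """Average the outputs of the kept models, sample by sample."""
    n_samples = len(predictions[0])
    return [sum(predictions[i][k] for i in kept) / len(kept) for k in range(n_samples)]


# Hypothetical validation-set outputs of three candidate models.
predictions = [
    [1.0, 2.0, 3.0, 4.0],  # model 0
    [1.1, 2.1, 3.1, 4.1],  # model 1: shifted copy of model 0, adds no diversity
    [4.0, 1.0, 3.0, 2.0],  # model 2: weakly (negatively) correlated with model 0
]

kept = prune_by_correlation(predictions)
print(kept)                                 # indices of the retained models
print(ensemble_predict(predictions, kept))  # averaged ensemble output
```

In this toy example, model 1 is dropped because it is almost perfectly correlated with model 0, so the pruned ensemble averages only models 0 and 2. Fewer retained models also means fewer evaluations at prediction time, which is one way pruning can lower computational cost.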