LG IR Sep 15, 2021

On-the-Fly Ensemble Pruning in Evolving Data Streams

arXiv:2109.07611v1
4 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient ensemble management for data stream classification, particularly in class-imbalanced scenarios. It represents an incremental advance in a largely unexplored area.

The paper tackles ensemble pruning in evolving data streams by proposing CCRP, an on-the-fly method that uses an imbalance-aware fusion of class-wise rankings to keep the best classifier for each class. The pruned ensembles perform on par with or better than the original ensembles while using 20% to 90% less memory on average.

Ensemble pruning is the process of selecting a subset of component classifiers from an ensemble which performs at least as well as the original ensemble while reducing storage and computational costs. Ensemble pruning in data streams is a largely unexplored area of research. It requires analysis of ensemble components as they are running on the stream, and differentiation of useful classifiers from redundant ones. We present CCRP, an on-the-fly ensemble pruning method for multi-class data stream classification empowered by an imbalance-aware fusion of class-wise component rankings. CCRP aims that the resulting pruned ensemble contains the best performing classifier for each target class and hence, reduces the effects of class imbalance. The conducted experiments on real-world and synthetic data streams demonstrate that different types of ensembles that integrate CCRP as their pruning scheme consistently yield on par or superior performance with 20% to 90% less average memory consumption. Lastly, we validate the proposed pruning scheme by comparing our approach against pruning schemes based on ensemble weights and basic rank fusion methods.
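The abstract's core idea, ranking components per class and fusing those rankings with class imbalance in mind, can be illustrated with a small sketch. This is not the CCRP procedure from the paper, whose exact scoring and fusion rules are not given here. The function names, the per-class score matrix, the inverse-frequency class weights, and the `budget` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence


def class_wise_ranks(scores: Sequence[Sequence[float]]) -> List[List[int]]:
    """Return ranks[c][i]: rank of classifier i for class c (0 = best).

    scores[i][c] is a hypothetical per-class quality score (e.g. recent
    per-class accuracy) of classifier i on class c. Ties go to the lower index.
    """
    n_classifiers = len(scores)
    n_classes = len(scores[0])
    ranks = []
    for c in range(n_classes):
        order = sorted(range(n_classifiers), key=lambda i: (-scores[i][c], i))
        rank_of = [0] * n_classifiers
        for position, i in enumerate(order):
            rank_of[i] = position
        ranks.append(rank_of)
    return ranks


def prune_ensemble(
    scores: Sequence[Sequence[float]],
    class_counts: Sequence[int],
    budget: int,
) -> List[int]:
    """Pick classifier indices to keep (illustrative, not the paper's rule)."""
    ranks = class_wise_ranks(scores)
    n_classifiers = len(scores)
    n_classes = len(class_counts)

    # Step 1: always keep the top-ranked classifier of every class,
    # even if that alone exceeds the budget.
    kept: List[int] = []
    for c in range(n_classes):
        best = ranks[c].index(0)
        if best not in kept:
            kept.append(best)

    # Step 2 (assumed fusion rule): weight each class ranking by inverse
    # class frequency, so rankings on rare classes count more.
    total = sum(class_counts)
    weights = [total / max(count, 1) for count in class_counts]
    fused = [
        sum(weights[c] * ranks[c][i] for c in range(n_classes))
        for i in range(n_classifiers)
    ]

    # Fill the remaining budget with the best fused ranks (lower is better).
    rest = sorted(
        (i for i in range(n_classifiers) if i not in kept),
        key=lambda i: (fused[i], i),
    )
    for i in rest:
        if len(kept) >= budget:
            break
        kept.append(i)
    return sorted(kept)


# Four classifiers, two classes; class 1 is rare in the first call.
demo_scores = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.3, 0.8]]
print(prune_ensemble(demo_scores, class_counts=[990, 10], budget=3))   # -> [0, 1, 3]
print(prune_ensemble(demo_scores, class_counts=[500, 500], budget=3))  # -> [0, 1, 2]
```

In the imbalanced call, the third slot goes to classifier 3, which ranks better on the rare class. With balanced counts the fused ranks tie and the lower index wins. Keeping every per-class winner regardless of `budget` mirrors the abstract's stated aim that the pruned ensemble contains the best performing classifier for each target class.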

