DynED: Dynamic Ensemble Diversification in Data Stream Classification
This work addresses concept drift in data stream classification, which matters for applications that require real-time adaptation; its contribution is incremental, since it builds on existing ensemble and diversity methods.
The paper tackles the problem of maintaining high accuracy in data stream classification under concept drift by proposing DynED, a dynamic ensemble construction method based on Maximal Marginal Relevance (MMR) that selects components for high performance and diversity. Experimental results on 15 datasets show DynED achieves higher average mean accuracy than five state-of-the-art baselines.
Ensemble methods are commonly used in classification due to their remarkable performance. Achieving high accuracy in a data stream environment is challenging because of disruptive changes in the data distribution, known as concept drift. A greater diversity of ensemble components is known to enhance prediction accuracy in such settings. Even within a diverse ensemble, however, not all components contribute as expected to its overall performance. This necessitates a method for selecting components that exhibit both high performance and diversity. We present a novel ensemble construction and maintenance approach based on MMR (Maximal Marginal Relevance) that dynamically combines the diversity and prediction accuracy of components while structuring an ensemble. Experimental results on four real and 11 synthetic datasets demonstrate that the proposed approach (DynED) provides a higher average mean accuracy than five state-of-the-art baselines.
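To make the MMR idea concrete, the following is a minimal sketch of greedy MMR-style component selection, not the DynED implementation itself. It assumes that each component's relevance is its accuracy on a recent labelled window, and that redundancy between two components is the fraction of window instances on which their predictions agree. Both measures, the trade-off weight `lam`, and the function names (`accuracy`, `agreement`, `mmr_select`) are hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of window instances a component predicted correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of instances on which two components predict the same
    label (assumed redundancy measure; 1.0 means identical behaviour)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mmr_select(
    predictions: List[Sequence[int]],
    labels: Sequence[int],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedily pick up to k component indices, trading accuracy
    against redundancy with the components already selected."""
    acc = [accuracy(p, labels) for p in predictions]
    candidates = list(range(len(predictions)))
    selected: List[int] = []

    def score(i: int) -> float:
        # Redundancy is the highest agreement with any selected
        # component; it is 0.0 while nothing has been selected yet.
        redundancy = max(
            (agreement(predictions[i], predictions[j]) for j in selected),
            default=0.0,
        )
        return lam * acc[i] - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy window: components 0 and 1 behave identically, component 2 is
# slightly less accurate but makes different mistakes.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
predictions = [
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 1, 1],
]
chosen = mmr_select(predictions, labels, k=2, lam=0.5)
```

With these toy values, `chosen` is `[0, 2]`: the duplicate component 1 is passed over in favour of the less accurate but more diverse component 2. Setting `lam=1.0` ignores redundancy and selects purely by accuracy, which yields `[0, 1]`.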