LG | ML | Feb 22, 2019

Diversity of Ensembles for Data Stream Classification

arXiv:1902.08466v1 | 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of improving ensemble performance in data stream classification for researchers and practitioners, but it is incremental as it builds on existing static-data studies without introducing new methods.

The paper tackles the lack of research on diversity measures in evolving data streams by providing a theoretical analysis of different diversity measures and their applicability to ensemble learning for streaming data with concept drift.

When constructing a classifier ensemble, diversity among the base classifiers is one of the important characteristics. Several studies have examined it in the context of standard static data, in particular the relationship between high ensemble predictive performance and the diversity of the ensemble's components. Ensembles of learning machines have also been used to learn in the presence of concept drift and to adapt to it. However, diversity measures have received little research attention in evolving data streams. Only a few researchers directly consider promoting diversity while constructing an ensemble or rebuilding it when drift is detected. In this paper, we present a theoretical analysis of different diversity measures and relate them to the success of ensemble learning algorithms for streaming data. The analysis provides a deeper understanding of the concept of diversity and its impact on online ensemble learning in the presence of concept drift. More precisely, we aim to answer the following research question: which diversity measures are commonly used for static-data ensembles, and how far are they applicable to streaming-data ensembles?
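To make the notion of a diversity measure concrete, here is a minimal sketch of one pairwise measure that is widely used for static-data ensembles, the disagreement measure: the fraction of examples on which exactly one of two classifiers is correct. The abstract does not name specific measures, so this choice is an assumption, and the sliding-window variant (`WindowedDisagreement`) is a hypothetical illustration of how such a measure might be tracked on a stream, not the paper's method.

```python
from collections import deque


def disagreement(correct_a, correct_b):
    """Fraction of examples on which exactly one classifier is correct.

    Each argument is a sequence of truthy/falsy flags, one per example,
    marking whether that classifier predicted the example correctly.
    """
    pairs = [(bool(a), bool(b)) for a, b in zip(correct_a, correct_b)]
    if not pairs:
        raise ValueError("need at least one example")
    return sum(a != b for a, b in pairs) / len(pairs)


class WindowedDisagreement:
    """Hypothetical helper: disagreement over the most recent examples only.

    A bounded window is one simple way to let the measure follow a
    stream whose distribution may change (an assumption, not from the paper).
    """

    def __init__(self, window):
        self.buffer = deque(maxlen=window)  # old pairs fall out automatically

    def update(self, a_correct, b_correct):
        """Record one example and return the current windowed disagreement."""
        self.buffer.append((bool(a_correct), bool(b_correct)))
        return sum(a != b for a, b in self.buffer) / len(self.buffer)
```

A value of 0 means the two classifiers are always right or wrong together; a value of 1 means exactly one of them is right on every example in the window.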

