Streaming Active Deep Forest for Evolving Data Stream Classification
This work addresses efficient classification of evolving data streams; its contribution is an incremental adaptation of Deep Forest, a batch-learning deep ensemble, to the streaming context.
The paper adapts Deep Forest to evolving data streams by introducing the Streaming Deep Forest (SDF) algorithm and the Augmented Variable Uncertainty (AVU) active learning strategy. With AVU, SDF trained on only a 70% labeling budget significantly outperforms other methods trained on all instances.
In recent years, Deep Neural Networks (DNNs) have gained momentum in many areas of machine learning. The layer-by-layer processing of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which can achieve highly competitive performance while having far fewer hyper-parameters than DNNs. Despite its success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specifically adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that, by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained with all instances.
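To make the notion of a labeling budget concrete, the following is a minimal sketch of a budgeted, uncertainty-based query rule in the style of the Variable Uncertainty strategy on which the name AVU builds. It is not the AVU strategy itself: the abstract does not describe how AVU augments the base strategy, so nothing here should be read as the authors' method. The class name, the parameter names (`budget`, `step`), the default values, and the simple `queried / seen` budget estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VariableUncertainty:
    """Hypothetical sketch of a budgeted uncertainty-based query rule.

    Illustrates a Variable Uncertainty style rule only; the augmentation
    used by AVU is not specified in the abstract and is not modeled here.
    """

    budget: float = 0.7     # assumed: maximum fraction of instances to label
    step: float = 0.01      # assumed: multiplicative threshold adjustment
    threshold: float = 1.0  # current confidence threshold
    seen: int = 0           # instances observed so far
    queried: int = 0        # labels requested so far

    def should_query(self, max_posterior: float) -> bool:
        """Decide whether to request the true label of the current instance.

        max_posterior is the classifier's highest class probability for the
        instance, i.e. its confidence in its own prediction.
        """
        self.seen += 1

        # Budget check (assumed estimate: labels requested / instances seen).
        if self.queried / self.seen >= self.budget:
            return False

        if max_posterior < self.threshold:
            # Uncertain prediction: ask for the label and tighten the
            # threshold so that fewer future instances qualify.
            self.queried += 1
            self.threshold *= 1.0 - self.step
            return True

        # Confident prediction: skip the label and relax the threshold so
        # that the learner keeps querying if the stream later changes.
        self.threshold *= 1.0 + self.step
        return False
```

In a streaming loop, a classifier would call `should_query` once per arriving instance with its top predicted class probability, and train on the instance only when the call returns `True`. The adaptive threshold is the design choice of interest: a fixed threshold can stop requesting labels once the model becomes confident, which makes it slow to react when the stream evolves.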