Learning from Imbalanced Multiclass Sequential Data Streams Using Dynamically Weighted Conditional Random Fields
This work addresses class imbalance in healthcare activity recognition from body-worn sensors. The contribution is incremental: it extends existing CRF methods with dynamic class weighting.
The study tackles class imbalance in sequential activity recognition from body-worn sensors by proposing a dynamically weighted conditional random field (dWCRF). In general, the method improved overall and minority-class F-scores compared to other CRF-based classifiers, and it matched or surpassed SVM-based classifiers when training data were limited.
The present study introduces a method for improving the classification performance of imbalanced multiclass data streams from wireless body-worn sensors. Data imbalance is an inherent problem in activity recognition, caused by the irregular time distribution of activities, which are sequential and dependent on previous movements. We use conditional random fields (CRFs), a graphical model for structured classification, to take advantage of dependencies between activities in a sequence. However, CRFs do not account for the negative effects of class imbalance during training. We propose a class-wise dynamically weighted CRF (dWCRF) in which weights are automatically determined during training by maximizing the expected overall F-score. Our results, based on three case studies from a healthcare application using a batteryless body-worn sensor, demonstrate that our method, in general, improves overall and minority-class F-score when compared to other CRF-based classifiers, and achieves similar or better overall and class-wise performance when compared to SVM-based classifiers under conditions of limited training data. We also confirm the performance of our approach using an additional battery-powered body-worn sensor dataset, achieving similar results in cases of high class imbalance.
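To illustrate why class-wise F-score is a more informative target than accuracy on imbalanced activity streams, the following is a minimal sketch, not the dWCRF training procedure itself. It assumes that the overall F-score is the unweighted mean of per-class F1 scores, which the abstract does not specify. The activity labels and the function names `classwise_f_scores`, `overall_f_score`, and `inverse_frequency_weights` are hypothetical. The inverse-frequency weights are a common static heuristic shown only for contrast; dWCRF instead determines its class weights during training.

```python
from collections import Counter


def classwise_f_scores(y_true, y_pred, labels):
    """Return {label: F1}, computed from per-class TP, FP and FN counts."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def overall_f_score(y_true, y_pred, labels):
    """Assumption: overall F-score is the unweighted mean of class-wise F1."""
    scores = classwise_f_scores(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


def inverse_frequency_weights(y_true, labels):
    """Static heuristic for contrast only: weight_c = N / (K * count_c)."""
    counts = Counter(y_true)
    n, k = len(y_true), len(labels)
    return {c: n / (k * counts[c]) if counts[c] else 0.0 for c in labels}


# Hypothetical imbalanced stream: one dominant activity, two rare ones.
labels = ["lying", "sitting", "walking"]
y_true = ["lying"] * 8 + ["sitting"] + ["walking"]
y_pred = ["lying"] * 10  # a classifier that ignores the minority classes

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_class = classwise_f_scores(y_true, y_pred, labels)
overall = overall_f_score(y_true, y_pred, labels)
weights = inverse_frequency_weights(y_true, labels)
```

In this toy stream the majority-only classifier reaches an accuracy of 0.8, yet its F1 is zero for both minority activities, so the mean F-score falls below 0.3. This is the gap that an F-score-driven weighting scheme is meant to close.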