LG, May 17, 2023

Incremental Outlier Detection Modelling Using Streaming Analytics in Finance & Health Care

arXiv:2305.09907v2
Originality: Synthesis-oriented
AI Analysis

This work addresses real-time outlier detection for finance and healthcare applications, but its contribution is incremental, since it combines existing methods in a hybrid framework.

The paper tackles outlier detection in streaming data by proposing a hybrid framework with incremental learning. The authors report that this significantly improves performance on imbalanced datasets, with the IForest ASD model consistently ranking among the top three across the financial and healthcare tasks.

In the era of real-time data, traditional methods often struggle to keep pace with the dynamic nature of streaming environments. In this paper, we propose a hybrid framework in which (i) stage I follows a traditional approach, where the model is built once and evaluated in a real-time environment, and (ii) stage II employs an incremental learning approach, where the model is continuously retrained as new data arrives, enabling it to adapt and stay up to date. To implement these frameworks, we employed 8 distinct state-of-the-art outlier detection models, including one-class support vector machine (OCSVM), isolation forest adaptive sliding window approach (IForest ASD), exact storm (ES), angle-based outlier detection (ABOD), local outlier factor (LOF), Kitsune's online algorithm (KitNet), and K-nearest neighbour conformal density and distance based (KNN CAD). We evaluated the performance of these models across seven financial and healthcare prediction tasks, including credit card fraud detection, churn prediction, Ethereum fraud detection, heart stroke prediction, and diabetes prediction. The results indicate that our proposed incremental learning framework significantly improves performance, particularly on highly imbalanced datasets. Among all models, IForest ASD consistently ranked among the top three, demonstrating its effectiveness across the datasets.
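The difference between the two stages can be illustrated with a minimal sketch. It is not the paper's implementation: `ZScoreDetector` is a hypothetical toy stand-in for the models listed in the abstract, and the window size, threshold, and synthetic data are assumptions chosen only to make the contrast visible. Stage I fits once and keeps the model frozen; stage II scores each arriving point and then refits on a sliding window.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreDetector:
    """Toy stand-in detector (an assumption, not one of the paper's models).

    Flags a point when it lies more than `threshold` standard deviations
    from the mean of the data the detector was last fitted on.
    """

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, values):
        values = list(values)
        self.mu = mean(values)
        self.sigma = pstdev(values) or 1.0  # guard against zero spread
        return self

    def is_outlier(self, x):
        return abs(x - self.mu) / self.sigma > self.threshold


def run_static(train, stream, threshold=3.0):
    """Stage I: build the model once, then score the stream unchanged."""
    model = ZScoreDetector(threshold).fit(train)
    return [model.is_outlier(x) for x in stream]


def run_incremental(train, stream, window_size=50, threshold=3.0):
    """Stage II: score each point, then refit on a sliding window."""
    window = deque(train, maxlen=window_size)
    model = ZScoreDetector(threshold).fit(window)
    flags = []
    for x in stream:
        flags.append(model.is_outlier(x))  # test first ...
        window.append(x)                   # ... then update the window
        model.fit(window)                  # and retrain on recent data
    return flags


# Synthetic data (assumed): values around 0, then a level shift to around 10.
pattern = [0.0, 1.0, -1.0, 0.5, -0.5]
train = pattern * 10
stream = [10.0 + v for v in pattern * 12]

static_flags = run_static(train, stream)
incremental_flags = run_incremental(train, stream)

print("static flags:     ", sum(static_flags))
print("incremental flags:", sum(incremental_flags))
```

On this synthetic stream the frozen stage-I model keeps flagging every point after the level shift, while the stage-II model flags the first points of the shift and then stops once the window has absorbed the new level. Real detectors and real drift behave less cleanly; the sketch only shows why continuous retraining can help when the data distribution moves.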

