DenDrift: A Drift-Aware Algorithm for Host Profiling
This work addresses the challenge of detecting unauthorized actions when monitoring large-scale host systems. The contribution is incremental: it builds on the existing DenStream algorithm and adds drift detection.
The paper tackles the problem of unreliable host profiling due to concept drift in stream clustering by proposing DenDrift, a drift-aware algorithm based on DenStream, which shows robustness against abrupt, gradual, and incremental drifts in experiments on synthetic and industrial datasets.
Detecting and reacting to unauthorized actions is an essential task in security monitoring. The task is challenging because of the large number and variety of hosts and processes to monitor, and because there is no exact definition of normal behavior for each category. Host profiling using stream clustering algorithms is an effective means of analyzing hosts' behaviors, categorizing them, and identifying atypical ones. However, unforeseen changes in behavioral data (i.e., concept drift) make the obtained profiles unreliable. DenStream is a well-known stream clustering algorithm that can be effectively used for host profiling. It is an incremental extension of DBSCAN, a non-parametric algorithm widely used in real-world clustering applications. Recent experimental studies indicate that DenStream is not robust against concept drift. In this paper, we present DenDrift, a drift-aware host profiling algorithm based on DenStream. DenDrift relies on non-negative matrix factorization for dimensionality reduction and the Page-Hinkley test for drift detection. We conducted experiments on both synthetic and industrial datasets, and the results confirm the robustness of DenDrift against abrupt, gradual, and incremental drifts.
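To make the drift-detection step concrete, the following is a minimal sketch of the textbook Page-Hinkley test for an upward shift in the mean of a one-dimensional stream. It is not DenDrift's implementation: the abstract does not say which signal is monitored or how the test is parameterized, so the class name, the parameters `delta`, `threshold`, and `min_samples`, and the helper `first_drift_index` are all assumptions made for illustration.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an increase in the stream mean.

    All parameter names and defaults are illustrative assumptions,
    not values taken from the DenDrift paper.
    """

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm threshold (lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.reset()

    def reset(self):
        self.n = 0          # number of samples seen since last reset
        self.mean = 0.0     # running mean of the samples
        self.cum = 0.0      # cumulative deviation m_t
        self.cum_min = 0.0  # running minimum of m_t

    def update(self, x):
        """Consume one sample; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        drift = (self.n >= self.min_samples
                 and self.cum - self.cum_min > self.threshold)
        if drift:
            self.reset()  # start a fresh test after an alarm
        return drift


def first_drift_index(stream, detector):
    """Return the index of the first alarm, or None if there is none."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Hypothetical signal: a flat stream followed by an abrupt jump.
stable = [0.0] * 400
abrupt = [0.0] * 200 + [5.0] * 200
no_alarm = first_drift_index(stable, PageHinkley())
alarm_at = first_drift_index(abrupt, PageHinkley())
```

In this sketch the detector stays silent on the flat stream and raises an alarm shortly after the jump at index 200. In a host-profiling setting the monitored value might be a per-sample summary such as the distance to the nearest micro-cluster, but that choice is an assumption here, not something the abstract specifies.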