An Efficient Variant of One-Class SVM with Lifelong Online Learning Guarantees
This work addresses outlier detection for streaming data with distribution shifts, offering incremental improvements in efficiency and error rates for applications like anomaly monitoring.
The paper tackles outlier detection for non-stationary streaming data by introducing SONAR, an efficient SGD-based One-Class SVM solver. SONAR reduces computational cost and achieves lower Type I/II errors than traditional methods such as kernel OCSVM, with theoretical guarantees validated on synthetic and real-world datasets.
We study outlier (a.k.a. anomaly) detection for single-pass non-stationary streaming data. Traditional methods from the well-studied offline (batch) outlier detection problem, such as kernel One-Class SVM (OCSVM), are both computationally heavy and prone to large false-negative (Type II) errors under non-stationarity. To remedy this, we introduce SONAR, an efficient SGD-based OCSVM solver with strongly convex regularization. We show novel theoretical guarantees on the Type I/II errors of SONAR, superior to those known for OCSVM, and further prove that SONAR ensures favorable lifelong learning guarantees under benign distribution shifts. In the more challenging problem of adversarial non-stationary data, we show that SONAR can be used within an ensemble method and equipped with changepoint detection to achieve adaptive guarantees, ensuring small Type I/II errors on each phase of data. We validate our theoretical findings on synthetic and real-world datasets.
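To make the idea of an SGD-based one-class solver with strongly convex regularization concrete, here is a minimal sketch of single-pass SGD on a linear one-class objective. It is not the SONAR algorithm itself: the per-sample loss, the regularizer on both `w` and `rho`, the step size `1 / (lam * t)`, and the names `sgd_one_class_svm`, `decision_score`, `nu`, and `lam` are all assumptions chosen for illustration, and the sketch works on raw features rather than a kernel or feature map.

```python
import numpy as np


def sgd_one_class_svm(stream, nu=0.1, lam=1.0):
    """Single-pass SGD on an illustrative strongly convex one-class objective.

    Assumed per-sample loss (not necessarily the paper's formulation):
        (lam / 2) * (||w||^2 + rho^2) - rho + (1 / nu) * max(0, rho - <w, x>)
    """
    w, rho = None, 0.0
    for t, x in enumerate(stream, start=1):
        x = np.asarray(x, dtype=float)
        if w is None:
            w = np.zeros_like(x)
        # Step size 1 / (lam * t), a common choice for lam-strongly convex losses.
        eta = 1.0 / (lam * t)
        # The hinge term is active when the point falls below the offset rho.
        violated = (rho - w @ x) > 0.0
        grad_w = lam * w - (x / nu if violated else 0.0)
        grad_rho = lam * rho - 1.0 + (1.0 / nu if violated else 0.0)
        w = w - eta * grad_w
        rho = rho - eta * grad_rho
    return w, rho


def decision_score(w, rho, x):
    """Higher scores look more like the data seen so far; negative suggests an outlier."""
    return float(np.asarray(x, dtype=float) @ w - rho)


# Synthetic stream of inliers clustered around (3, 3); each point is seen once.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=3.0, scale=0.5, size=(2000, 2))
w, rho = sgd_one_class_svm(inliers)

print("typical point:", decision_score(w, rho, [3.0, 3.0]))
print("distant point:", decision_score(w, rho, [-3.0, -3.0]))
```

Each update touches one sample and costs time linear in the feature dimension, which is the kind of saving over a batch kernel solver that the abstract refers to. Handling distribution shift (for example, restarting or ensembling such learners after a detected changepoint) is outside this sketch.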