Robust Isolation Forest using Soft Sparse Random Projection and Valley Emphasis Method
This work addresses stability and robustness issues in anomaly detection for applications that require reliable performance across diverse datasets, though the contribution is incremental because it builds on existing Isolation Forest methods.
The paper tackles Isolation Forest's inconsistent performance and its difficulty in isolating rare anomalies by introducing Robust iForest (RiForest), which uses soft sparse random projection and the valley emphasis method to improve splits and consistently outperforms existing algorithms across 24 benchmark datasets.
Isolation Forest (iForest) is an unsupervised anomaly detection algorithm built on the assumption that anomalies are "few and different." Various studies have aimed to enhance iForest, but the resulting algorithms often show significant performance disparities across datasets. In addition, work focused on improving splits has still struggled to isolate rare and widely distributed anomalies. To address these challenges, we introduce Robust iForest (RiForest). RiForest considers both the existing features and random hyperplanes obtained through soft sparse random projection as candidate split features, so that good split features can be identified regardless of the dataset. It determines the split point with the underutilized valley emphasis method and randomizes the sparsity of the soft sparse random projection to make anomaly detection more robust. In experiments on 24 benchmark datasets, RiForest consistently outperforms existing algorithms in anomaly detection, and the results highlight its stability and robustness to noise variables.
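To make the two ingredients named above more concrete, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that the abstract does not confirm: "soft" is read as drawing the nonzero projection weights from a Gaussian rather than from {-1, +1}, "sparsity randomization" is read as drawing the number of nonzero weights uniformly at random for each projection, and the split criterion is a valley-emphasis-style score, namely Otsu's between-class variance weighted by (1 - p_t), where p_t is the histogram mass of the bin at the candidate threshold. The function names and the `n_bins` parameter are hypothetical.

```python
import random


def sparse_projection_vector(n_features, rng):
    """Draw one random sparse direction (hypothetical helper).

    Assumption: the number of nonzero weights is itself random
    (our reading of "sparsity randomization"), and nonzero weights
    are Gaussian (our reading of "soft").
    """
    k = rng.randint(1, n_features)            # random sparsity level
    active = rng.sample(range(n_features), k)  # which coordinates are nonzero
    w = [0.0] * n_features
    for j in active:
        w[j] = rng.gauss(0.0, 1.0)
    return w


def project(X, w):
    """Project each row of X (a list of lists) onto direction w."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]


def valley_emphasis_split(values, n_bins=32):
    """Pick a split point on 1-D values with a valley-emphasis-style score.

    Score for a threshold after bin t:
        (1 - p_t) * w0 * w1 * (mu0 - mu1) ** 2
    The (1 - p_t) factor favors thresholds in sparsely populated bins.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # nothing to split
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    n = len(values)
    probs = [c / n for c in counts]
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    total_mean = sum(p * c for p, c in zip(probs, centers))

    best_score, best_t = -1.0, 0
    w0 = 0.0  # cumulative mass of the left class
    m0 = 0.0  # cumulative first moment of the left class
    for t in range(n_bins - 1):
        w0 += probs[t]
        m0 += probs[t] * centers[t]
        w1 = 1.0 - w0
        if w0 <= 0.0 or w1 <= 0.0:
            continue
        mu0 = m0 / w0
        mu1 = (total_mean - m0) / w1
        score = (1.0 - probs[t]) * w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    # The split sits on the right edge of the chosen bin.
    return lo + (best_t + 1) * width


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: a dense cluster plus two far-away points.
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
    X += [[8.0] * 5, [9.0] * 5]
    w = sparse_projection_vector(5, rng)
    z = project(X, w)
    print("split on projected values:", valley_emphasis_split(z))
```

In a tree-building loop, one would presumably score both the original features and a handful of such projections at each node and keep the candidate with the best split, which mirrors the abstract's description of combining existing features with random hyperplanes; how RiForest actually ranks candidates is not stated in this text.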