LG · Feb 1, 2022

Weighted Isolation and Random Cut Forest Algorithms for Anomaly Detection

arXiv:2202.01891v5
Originality: Incremental advance
AI Analysis

This work addresses anomaly detection for time series applications, offering an incremental improvement by enhancing split value selection in forest-based algorithms.

The authors address anomaly detection in time series data by proposing weighted isolation forest (WIF) and weighted random cut forest (WRCF) algorithms, which use the density of the data to determine split values. The paper reports improved performance over existing methods such as IF and RRCF, supported by numerical examples.

Random cut forest (RCF) algorithms have been developed for anomaly detection, particularly in time series data. The RCF algorithm is an improved version of the isolation forest (IF) algorithm. Unlike the IF algorithm, the RCF algorithm can determine whether a real-time input contains an anomaly by inserting the input into the constructed tree network. Various RCF algorithms, including Robust RCF (RRCF), have been developed, in which the cutting procedure is chosen adaptively and probabilistically. The RRCF algorithm performs better than the IF algorithm because its dimension cuts are decided based on the geometric range of the data, whereas the IF algorithm chooses dimension cuts uniformly at random. However, neither IF nor RRCF considers the overall data structure, because both choose split values randomly. In this paper, we propose new IF and RCF algorithms, referred to as the weighted IF (WIF) and weighted RCF (WRCF) algorithms, respectively. Their split values are determined by considering the density of the given data. To introduce the WIF and WRCF, we first present a new geometric measure, a density measure, which is crucial for constructing the WIF and WRCF. We provide various mathematical properties of the density measure, state theorems that support our claims, and validate them through numerical examples.
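The difference between the IF and RRCF cutting rules described in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names `if_cut`, `rrcf_cut`, and `count_dim0` are hypothetical, and the sketch assumes the standard formulations, in which IF picks the cut dimension uniformly while RRCF picks it with probability proportional to that dimension's range. In both rules the split value is drawn uniformly within the chosen dimension's range, which is the step the WIF and WRCF algorithms replace with a density-based choice (not shown here, since the density measure is not defined in this summary).

```python
import random


def if_cut(points, rng):
    """IF-style cut: dimension chosen uniformly, split value uniform in its range."""
    dims = len(points[0])
    d = rng.randrange(dims)
    lo = min(p[d] for p in points)
    hi = max(p[d] for p in points)
    return d, rng.uniform(lo, hi)


def rrcf_cut(points, rng):
    """RRCF-style cut: dimension chosen with probability proportional to its range."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    ranges = [max(p[d] for p in points) - lows[d] for d in range(dims)]
    if sum(ranges) == 0:
        raise ValueError("all points are identical; no cut is possible")
    d = rng.choices(range(dims), weights=ranges, k=1)[0]
    # The split value itself is still uniform within the chosen range.
    return d, lows[d] + rng.uniform(0.0, ranges[d])


def count_dim0(cut_fn, points, trials, seed):
    """Count how often a cutting rule selects dimension 0 over repeated trials."""
    rng = random.Random(seed)
    return sum(1 for _ in range(trials) if cut_fn(points, rng)[0] == 0)


if __name__ == "__main__":
    # Dimension 0 spans 100 units, dimension 1 spans 1 unit.
    pts = [(0.0, 0.0), (100.0, 1.0), (50.0, 0.5)]
    print("IF   picks dim 0:", count_dim0(if_cut, pts, 1000, 0), "of 1000")
    print("RRCF picks dim 0:", count_dim0(rrcf_cut, pts, 1000, 0), "of 1000")
```

On data whose first dimension is much wider than its second, the IF-style rule should select each dimension about half the time, while the RRCF-style rule should select the wide dimension in nearly every trial.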

