Local Subspace-Based Outlier Detection using Global Neighbourhoods
This work addresses outlier detection for applications such as fraud detection and quality control, but it appears incremental: it builds on existing density-based methods by adding a global perspective.
The paper tackles outlier detection in high-dimensional mixed data by introducing GLOSS, an algorithm that performs local subspace outlier detection using global neighbourhoods. Experiments on synthetic data show that it detects local outliers more accurately than competing methods, and experiments on real-world data show that it identifies relevant outliers overlooked by existing methods.
Outlier detection in high-dimensional data is a challenging yet important task, as it has applications in, e.g., fraud detection and quality control. State-of-the-art density-based algorithms perform well because they 1) take the local neighbourhoods of data points into account and 2) consider feature subspaces. In highly complex and high-dimensional data, however, existing methods are likely to overlook important outliers because they do not explicitly take into account that the data is often a mixture distribution of multiple components. We therefore introduce GLOSS, an algorithm that performs local subspace outlier detection using global neighbourhoods. Experiments on synthetic data demonstrate that GLOSS more accurately detects local outliers in mixed data than its competitors. Moreover, experiments on real-world data show that our approach identifies relevant outliers overlooked by existing methods, confirming that one should keep an eye on the global perspective even when doing local outlier detection.
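To make the two ingredients named in the abstract concrete (local neighbourhoods and feature subspaces), the following is a minimal sketch of a generic density-ratio outlier score computed inside a chosen feature subspace. It is not GLOSS and does not reproduce the paper's method. The function names, the toy data, the choice of k, and the simplified score (a point's mean k-nearest-neighbour distance divided by the average of the same quantity over its neighbours) are all assumptions made for illustration.

```python
import math

def knn(points, i, k, dims):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs.
    Euclidean distance is measured only over the feature indices in dims."""
    dists = []
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.sqrt(sum((points[i][a] - q[a]) ** 2 for a in dims))
        dists.append((d, j))
    dists.sort()
    return dists[:k]

def mean_knn_distance(points, i, k, dims):
    """Mean distance from points[i] to its k nearest neighbours in the subspace."""
    neigh = knn(points, i, k, dims)
    return sum(d for d, _ in neigh) / len(neigh)

def local_score(points, i, k, dims):
    """Hypothetical simplified local score (an assumption, not the GLOSS score):
    the point's mean k-NN distance relative to that of its neighbours.
    Values well above 1 suggest the point is sparser than its neighbourhood."""
    own = mean_knn_distance(points, i, k, dims)
    neigh = knn(points, i, k, dims)
    ref = sum(mean_knn_distance(points, j, k, dims) for _, j in neigh) / len(neigh)
    return own / ref if ref > 0 else float("inf")

# Toy mixture (illustrative data): a tight component, a loose component,
# and one point that deviates from the tight component in feature 0 only.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),       # tight
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0), (10.5, 10.5),  # loose
    (0.6, 0.05),                                                          # candidate
]

for dims in [(0, 1), (0,), (1,)]:
    print(dims, round(local_score(points, 10, 3, dims), 2))
```

In this toy setting the candidate point scores well above 1 in the full space and in the subspace containing only feature 0, and close to 1 in the subspace containing only feature 1. Points of the loose component score close to 1 despite their much larger absolute distances. This illustrates why such scores are computed relative to a local neighbourhood and per subspace.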