Dimensionality-Aware Outlier Detection: Theoretical and Experimental Analysis
This paper addresses the problem of improving outlier detection accuracy for data analysts, though the contribution appears incremental because it builds on existing Local Intrinsic Dimensionality (LID) theory.
The paper tackles outlier detection by accounting for local variations in intrinsic dimensionality, resulting in a method (DAO) that significantly outperforms the benchmark methods LOF, Simplified LOF, and kNN on more than 800 synthetic and real datasets.
We present a nonparametric method for outlier detection that takes full account of local variations in intrinsic dimensionality within the dataset. Using the theory of Local Intrinsic Dimensionality (LID), our 'dimensionality-aware' outlier detection method, DAO, is derived as an estimator of an asymptotic local expected density ratio involving the query point and a close neighbor drawn at random. The dimensionality-aware behavior of DAO is due to its use of local estimation of LID values in a theoretically justified way. Through comprehensive experimentation on more than 800 synthetic and real datasets, we show that DAO significantly outperforms three popular and important benchmark outlier detection methods: Local Outlier Factor (LOF), Simplified LOF, and kNN.
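To make the idea of a dimensionality-aware density ratio concrete, the following is a minimal sketch and not the paper's reference implementation. The abstract does not state the DAO formula, so the score below is an assumed form: a Simplified-LOF-style ratio of k-nearest-neighbor distances, with each ratio raised to the power of the neighbor's estimated LID. The LID estimate uses the standard maximum-likelihood (Hill-type) estimator over k-NN distances. The function names (knn_distances, lid_mle, dimensionality_aware_scores) are hypothetical, and the brute-force neighbor search assumes a small dataset with no duplicate points.

```python
import math


def knn_distances(data, i, k):
    """Return the k nearest neighbors of data[i] as sorted (distance, index) pairs.

    Brute-force search; assumes no duplicate points (all distances > 0).
    """
    pairs = sorted(
        (math.dist(data[i], data[j]), j) for j in range(len(data)) if j != i
    )
    return pairs[:k]


def lid_mle(neighbor_dists):
    """Maximum-likelihood (Hill-type) LID estimate from sorted, positive k-NN distances."""
    r_k = neighbor_dists[-1]  # distance to the k-th (farthest) neighbor
    log_sum = sum(math.log(r / r_k) for r in neighbor_dists)
    if log_sum == 0.0:
        return float("inf")  # degenerate case: all neighbors equidistant
    return -len(neighbor_dists) / log_sum


def dimensionality_aware_scores(data, k):
    """Assumed score form (illustrative, not confirmed by the text above):

        score(q) = (1/k) * sum over neighbors v of (d_k(q) / d_k(v)) ** LID(v)

    where d_k(.) is the k-NN distance. With LID(v) fixed to 1 this reduces to
    a Simplified-LOF-style ratio. Higher scores indicate more outlying points.
    """
    n = len(data)
    neighbors = [knn_distances(data, i, k) for i in range(n)]
    kdist = [nb[-1][0] for nb in neighbors]
    lid = [lid_mle([d for d, _ in nb]) for nb in neighbors]

    scores = []
    for i in range(n):
        ratios = [(kdist[i] / kdist[j]) ** lid[j] for _, j in neighbors[i]]
        scores.append(sum(ratios) / k)
    return scores
```

Under these assumptions, a point whose k-NN distance is much larger than that of its neighbors receives a high score, and the LID exponent amplifies that ratio more strongly where the neighbors lie in locally higher-dimensional regions. For large datasets or very high LID estimates, a practical version would replace the brute-force search with an index structure and compute the powers in log space to avoid overflow.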