LG AI Jan 10, 2024

Dimensionality-Aware Outlier Detection: Theoretical and Experimental Analysis

Alastair Anderberg, James Bailey, Ricardo J. G. B. Campello, Michael E. Houle, Henrique O. Marques, Miloš Radovanović, Arthur Zimek

arXiv:2401.05453v22.63 citations | h-index: 46 | Has Code | SDM

Originality: Incremental advance

AI Analysis

This paper addresses the problem of improving outlier detection accuracy for data analysts. The advance appears incremental, since the method builds on existing LID theory.

The paper tackles outlier detection by accounting for local variations in intrinsic dimensionality. The resulting method, DAO, significantly outperformed three benchmark methods (LOF, Simplified LOF, and kNN) in experiments on more than 800 synthetic and real datasets.

We present a nonparametric method for outlier detection that takes full account of local variations in intrinsic dimensionality within the dataset. Using the theory of Local Intrinsic Dimensionality (LID), our 'dimensionality-aware' outlier detection method, DAO, is derived as an estimator of an asymptotic local expected density ratio involving the query point and a close neighbor drawn at random. The dimensionality-aware behavior of DAO is due to its use of local estimation of LID values in a theoretically-justified way. Through comprehensive experimentation on more than 800 synthetic and real datasets, we show that DAO significantly outperforms three popular and important benchmark outlier detection methods: Local Outlier Factor (LOF), Simplified LOF, and kNN.
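
The abstract does not state the DAO formula, so the following is only a rough sketch of what a dimensionality-aware score could look like. It assumes the widely used maximum-likelihood (Hill) estimator for LID and one plausible scoring form: a Simplified-LOF-style ratio of k-NN distances, with each ratio raised to the estimated LID of the neighbor. The function names `knn_distances`, `lid_mle`, and `dimensionality_aware_score` are hypothetical, and the scoring form is an assumption that the text above does not confirm.

```python
import numpy as np

def knn_distances(X, k):
    """Brute-force k-NN: sorted distances and indices, excluding the point itself."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)               # a point is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k]
    dist = np.take_along_axis(D, idx, axis=1)  # ascending along each row
    return dist, idx

def lid_mle(dist, eps=1e-12):
    """MLE (Hill) estimate of LID from each row of sorted neighbor distances."""
    r_k = np.maximum(dist[:, -1:], eps)
    log_ratios = np.log(np.maximum(dist, eps) / r_k)   # all entries <= 0
    mean_log = np.minimum(log_ratios.mean(axis=1), -eps)  # avoid division by zero
    return -1.0 / mean_log

def dimensionality_aware_score(X, k):
    """ASSUMED form: mean over neighbors v of (d_k(q) / d_k(v)) ** LID(v)."""
    dist, idx = knn_distances(X, k)
    lid = lid_mle(dist)
    d_k = dist[:, -1]                          # k-NN distance of every point
    ratios = d_k[:, None] / d_k[idx]           # shape (n, k)
    return (ratios ** lid[idx]).mean(axis=1)   # larger means more outlying

# Usage: a 2-D Gaussian cluster plus one distant point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), [[50.0, 50.0]]])
scores = dimensionality_aware_score(X, k=10)
print("most outlying index:", int(np.argmax(scores)))
```

Setting every exponent to 1 in this sketch gives a plain ratio of k-NN distances, which is why the exponent is where the dimensionality awareness would enter. The brute-force distance matrix is only suitable for small datasets.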
