Detecting Point Outliers Using Prune-based Outlier Factor (PLOF)
PLOF addresses a known bottleneck in outlier detection for applications such as fraud detection by offering a more efficient method, though the contribution is incremental because it builds directly on LOF.
The paper tackles the computational expense of the Local Outlier Factor (LOF) method for outlier detection by proposing a pruning-based procedure called PLOF, which reduces execution time while maintaining or improving accuracy and precision compared to LOF and its variants.
Outlier detection (also known as anomaly detection or deviation detection) is the process of detecting data points whose patterns deviate significantly from the others. Outliers are common in industry applications and can arise from different causes such as human error, fraudulent activity, or system failure. Recently, density-based methods have shown promising results, among which Local Outlier Factor (LOF) is arguably the dominant one. However, a major drawback of LOF is that it is computationally expensive. Motivated by this problem, this research presents a novel pruning-based procedure that reduces the execution time of LOF while maintaining its performance. In the proposed Prune-based Local Outlier Factor (PLOF) approach, the outlierness of each data instance is measured before LOF is employed. Next, based on a threshold, the data instances that require further investigation are separated, and the LOF score is computed only for these points. Extensive experiments have been conducted, and the results are promising. Comparison experiments with the original LOF and two state-of-the-art variants of LOF show that PLOF produces higher accuracy and precision while reducing execution time.
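The prune-then-score idea described above can be illustrated with a minimal sketch. The abstract does not say how the initial outlierness is measured or how the threshold is chosen, so everything specific here is an assumption made for illustration: the pre-score is the Euclidean distance to the data centroid, the threshold is a quantile of that pre-score, and the names `plof_sketch`, `k`, and `prune_quantile` are hypothetical. The LOF part follows the standard definition (reachability distance, local reachability density, ratio of densities) and is computed by brute force only for the points that survive pruning.

```python
import math
from functools import lru_cache


def plof_sketch(points, k=3, prune_quantile=0.8):
    """Return {index: LOF score} for points that survive a cheap pre-filter.

    Illustrative only: the pre-score and threshold rule are assumptions,
    not the procedure from the paper.
    """
    n, dim = len(points), len(points[0])

    # Assumed cheap pre-score: distance of each point to the data centroid.
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    pre = [math.dist(p, centroid) for p in points]

    # Assumed threshold: keep only points above a quantile of the pre-score.
    threshold = sorted(pre)[int(prune_quantile * (n - 1))]
    candidates = [i for i in range(n) if pre[i] > threshold]

    @lru_cache(maxsize=None)
    def knn(i):
        # k nearest neighbours of point i as (distance, index), brute force.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        return tuple(dists[:k])

    def k_distance(i):
        return knn(i)[-1][0]

    @lru_cache(maxsize=None)
    def lrd(i):
        # Local reachability density; assumes no k+1 exact duplicate points,
        # otherwise the sum below could be zero.
        reach = [max(k_distance(j), d) for d, j in knn(i)]
        return len(reach) / sum(reach)

    # LOF is computed only for candidates; neighbour densities are
    # evaluated lazily and cached, so pruned points cost nothing unless
    # they are neighbours of a candidate.
    return {
        i: sum(lrd(j) for _, j in knn(i)) / (k * lrd(i)) for i in candidates
    }


# A 4x4 grid of inliers plus one distant point at index 16.
points = [(x, y) for x in range(4) for y in range(4)] + [(10, 10)]
scores = plof_sketch(points)
most_outlying = max(scores, key=scores.get)
```

In this toy data the distant point passes the pre-filter and receives the largest LOF score, while most grid points are never scored at all. A centroid-based pre-score only suits roughly single-cluster data; it stands in here for whatever outlierness measure the paper actually uses.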