A simple efficient density estimator that enables fast systematic search
This work addresses a bottleneck in outlying aspects mining for data analysis, allowing it to scale to larger datasets, though the contribution is incremental because it improves an existing method.
The paper tackles the computational inefficiency of kernel density estimators in outlying aspects mining by introducing a simple and efficient density estimator, enabling a recent miner to run orders of magnitude faster and handle large datasets with thousands of dimensions.
This paper introduces a simple and efficient density estimator that enables fast systematic search. To show its advantage over the commonly used kernel density estimator, we apply it to outlying aspects mining. Outlying aspects mining discovers feature subsets (or subspaces) that describe how a query stands out from a given dataset. The task demands a systematic search of subspaces. We identify that existing outlying aspects miners are restricted to datasets with small data size and low dimensionality because they employ a kernel density estimator, which is computationally expensive, for subspace assessments. We show that a recent outlying aspects miner can run orders of magnitude faster by simply replacing its density estimator with the proposed one, enabling it to handle large datasets with thousands of dimensions, which would otherwise be infeasible.
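To make the cost argument concrete, the sketch below contrasts a Gaussian kernel density estimate, which scans every data point for each subspace it scores, with a generic grid-based estimator that precomputes per-dimension bin memberships once and scores a subspace by intersecting sets. The grid estimator is only an illustrative stand-in and should not be read as the estimator proposed in the paper. The names `kde_density` and `GridDensity` and the parameters `bandwidth` and `bins` are hypothetical choices made for this example.

```python
import math
from itertools import combinations


def kde_density(query, data, subspace, bandwidth=0.1):
    """Gaussian product-kernel density of `query` restricted to `subspace`.

    Every call scans all points, so the cost is O(n * |subspace|) per
    subspace assessed.
    """
    n = len(data)
    norm = n * (bandwidth * math.sqrt(2 * math.pi)) ** len(subspace)
    total = 0.0
    for point in data:
        sq_dist = sum((query[j] - point[j]) ** 2 for j in subspace)
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / norm


class GridDensity:
    """Illustrative equal-width grid estimator (an assumption, not the paper's method)."""

    def __init__(self, data, bins=4):
        self.n = len(data)
        self.bins = bins
        dims = len(data[0])
        self.lo = [min(p[j] for p in data) for j in range(dims)]
        self.width = []
        for j in range(dims):
            w = (max(p[j] for p in data) - self.lo[j]) / bins
            self.width.append(w if w > 0 else 1.0)  # guard constant columns
        # members[j][b] holds the indices of points in bin b of dimension j.
        self.members = [[set() for _ in range(bins)] for _ in range(dims)]
        for i, p in enumerate(data):
            for j in range(dims):
                self.members[j][self._bin(p[j], j)].add(i)

    def _bin(self, value, j):
        b = int((value - self.lo[j]) / self.width[j])
        return min(max(b, 0), self.bins - 1)  # clip to the valid bin range

    def density(self, query, subspace):
        """Fraction of points sharing the query's cell, per unit cell volume.

        Assumes `subspace` is a non-empty sequence of dimension indices.
        """
        cell = None
        volume = 1.0
        for j in subspace:
            members = self.members[j][self._bin(query[j], j)]
            cell = members if cell is None else cell & members
            volume *= self.width[j]
        return len(cell) / (self.n * volume)


# Score every non-empty subspace of a tiny two-dimensional dataset.
data = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (1.0, 1.0)]
query = (1.0, 1.0)
grid = GridDensity(data, bins=2)
for size in (1, 2):
    for subspace in combinations(range(2), size):
        print(subspace, kde_density(query, data, subspace), grid.density(query, subspace))
```

The number of subspaces grows combinatorially with the number of dimensions, so any per-subspace cost that depends on a full pass over the data is multiplied many times over during a systematic search. Moving the data pass into a one-time preprocessing step, as the grid sketch does, is one common way to reduce that cost.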