Intrinsic dimension and its application to association rules
This work addresses a fundamental challenge in data mining for researchers and practitioners dealing with sparse high-dimensional data, though it appears incremental because it builds on existing concepts.
The paper tackles the curse of dimensionality in association rules by introducing a computationally feasible method to measure its extent in datasets, enabling the application of geometric analysis methods in high-dimensional machine learning.
The curse of dimensionality in the realm of association rules is twofold. Firstly, there is the well-known exponential increase in computational complexity with increasing item set size. Secondly, there is a \emph{related curse} concerning the distribution of (sparse) data itself in high dimension. The former problem is often handled by projection, i.e., feature selection, whereas the best-known strategy for the latter is avoidance. This work summarizes the first attempt to provide a computationally feasible method for measuring the extent of the dimension curse present in a data set with respect to a particular class of machine learning procedures. This recent development enables various other methods from geometric analysis to be investigated and applied in machine learning procedures in the presence of high dimension.