AI
May 5, 2014

Finding Inner Outliers in High Dimensional Space

arXiv:1405.0868v1
1 citation
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in outlier detection for high-dimensional databases, with potential applications in multimedia processing, but it appears incremental because it builds on existing subspace detection methods.

The paper tackles the detection of inner outliers, that is, outliers hidden among the normal points in every dimension of high-dimensional data, a case where existing methods fail. It proposes a method built on two rounds of dimension projection that finds all inner outliers on synthetic datasets with 100 to 10000 dimensions.

Outlier detection in large-scale databases is a significant and complex problem in the field of knowledge discovery. Because data distributions are obscure and uncertain in high-dimensional space, most existing solutions rely on two intuitive assumptions: first, that outliers lie extremely far from other points in high-dimensional space; second, that outliers look clearly different in projected subspaces. However, in the more complicated case where outliers are hidden inside the normal points in all dimensions, existing detection methods fail to find such inner outliers. In this paper, we propose a method with two rounds of dimension projection, which integrates primary subspace outlier detection with secondary point projection between subspaces and sums the multiple weight values for each point. A local density ratio is computed for the points separately in each twice-projected dimension. After this process, the outliers are the points with the largest weight values. The proposed method finds all inner outliers on the synthetic test datasets, with the dimension varying from 100 to 10000. The experimental results also show that the proposed algorithm works in low-dimensional space and achieves perfect performance in high-dimensional space. For this reason, the proposed approach has considerable potential for multimedia applications, where it could help process images or video with large numbers of attributes.
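The scoring idea in the abstract (a local density ratio computed per projected dimension, with the resulting weights summed per point) can be illustrated with a small sketch. This is not the paper's algorithm: it performs only a single projection onto each coordinate axis and omits the secondary point projection between subspaces, which the abstract does not specify. The ratio of k-nearest-neighbour radii is an assumed stand-in for the "local density ratio", and all names (`density_ratio_scores`, `summed_weights`, `make_demo_data`) and the demo data layout are hypothetical.

```python
import random


def density_ratio_scores(column, k):
    """Score each value in one projected dimension.

    Assumed stand-in for a local density ratio: a point's k-NN radius
    divided by the mean k-NN radius of its k nearest neighbours.
    Values well above 1 mean the point is sparser than its neighbours.
    """
    n = len(column)
    neighbours, radius = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: abs(column[i] - column[j]))
        nearest = order[:k]
        neighbours.append(nearest)
        radius.append(abs(column[i] - column[nearest[-1]]))
    scores = []
    for i in range(n):
        mean_nb = sum(radius[j] for j in neighbours[i]) / k
        scores.append(radius[i] / (mean_nb + 1e-12))  # guard against zero radius
    return scores


def summed_weights(points, k=5):
    """Sum the per-dimension scores into one weight per point."""
    totals = [0.0] * len(points)
    for d in range(len(points[0])):
        column = [p[d] for p in points]
        for i, score in enumerate(density_ratio_scores(column, k)):
            totals[i] += score
    return totals


def make_demo_data(n_normal=60, n_dims=20, seed=0):
    """Hypothetical test data: normal points fall into one of two clusters
    (around 0 or around 10) in every dimension; the single inner outlier
    sits in the gap at 5 in every dimension, inside the overall data range."""
    rng = random.Random(seed)
    points = [[rng.choice((0.0, 10.0)) + rng.uniform(-1.0, 1.0)
               for _ in range(n_dims)]
              for _ in range(n_normal)]
    points.append([5.0] * n_dims)  # the inner outlier, stored last
    return points, n_normal


if __name__ == "__main__":
    points, outlier_index = make_demo_data()
    weights = summed_weights(points, k=5)
    top = max(range(len(weights)), key=lambda i: weights[i])
    print("highest-weight point:", top, "planted outlier:", outlier_index)
```

In this toy setup the planted point never takes an extreme value in any single dimension, so a per-dimension range check would not flag it; it stands out only because its neighbourhood in each projection is much sparser than that of the surrounding cluster points. The brute-force neighbour search is quadratic in the number of points and is meant for illustration only.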

