Outlier Detection Using Vector Cosine Similarity by Adding a Dimension
This is an incremental improvement for outlier detection in multi-dimensional datasets.
The paper tackles outlier detection in multi-dimensional data by proposing a method that applies vector cosine similarity after appending a zero-valued dimension to the data. An optimized implementation, MDOD, is available on PyPI.
We propose a new outlier detection method for multi-dimensional data. The method detects outliers based on vector cosine similarity, using a new dataset constructed by appending a dimension with zero values to the original data. When a point in the new dataset is selected as the measured point, an observation point is created to serve as the origin of the vectors; it is identical to the measured point except that it has a non-zero value in the new dimension. Vectors are then formed from the observation point to the measured point and to the other points in the dataset. By comparing the cosine similarities of these vectors, abnormal data can be identified. An optimized implementation (MDOD) is available on PyPI: https://pypi.org/project/mdod/.
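The construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the MDOD implementation and does not use the mdod package API. The function name `cosine_scores`, the `height` parameter (the non-zero value given to the observation point in the new dimension), and the choice to score each point by the mean cosine similarity are assumptions made for illustration only.

```python
import numpy as np

def cosine_scores(data, height=1.0):
    """Mean cosine similarity per point; a lower score suggests a more isolated point.

    Hypothetical helper: averaging the similarities is an assumed scoring rule,
    not necessarily the one used by MDOD. Requires at least two points.
    """
    X = np.asarray(data, dtype=float)
    n = X.shape[0]
    # Build the new dataset by appending one dimension filled with zeros.
    Z = np.hstack([X, np.zeros((n, 1))])
    scores = np.empty(n)
    for i in range(n):
        # Observation point: equal to the measured point except in the new dimension.
        obs = Z[i].copy()
        obs[-1] = height
        # Vector from the observation point to the measured point.
        ref = Z[i] - obs
        # Vectors from the observation point to every other point.
        others = np.delete(Z, i, axis=0) - obs
        cos = (others @ ref) / (np.linalg.norm(others, axis=1) * np.linalg.norm(ref))
        scores[i] = cos.mean()
    return scores

# Four clustered points and one distant point.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
scores = cosine_scores(points)
outlier_index = int(np.argmin(scores))  # index of the lowest-scoring point
```

Under this construction, the cosine between the vector to the measured point and the vector to another point at distance d in the original space reduces to h / sqrt(d^2 + h^2), where h is the observation point's value in the new dimension. Nearby points therefore give similarities close to 1 and distant points give similarities close to 0, so a point far from all others receives a low average.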