QMLG Oct 27, 2016

A random version of principal component analysis in data clustering

arXiv:1610.08664v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses a limitation of PCA-based data clustering for researchers working with high-dimensional data, but it appears incremental because it modifies an existing method.

The paper tackles the mathematical constraints that PCA places on sample size for high-dimensional data by introducing a modified algorithm that works on both well-dimensioned and degenerate datasets. It states that the method works but gives no specific numerical results.

Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA imposes severe mathematical constraints on the minimum number of distinct replicates or samples that must be included in the analysis. Here we show that a modified algorithm works not only on well-dimensioned datasets, but also on degenerate ones.
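The sample-size constraint mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's modified algorithm, which is not described here. It only shows the standard fact that a sample covariance matrix built from n samples has rank at most n - 1, so with fewer samples than variables it is singular. The function name `covariance_rank`, the random Gaussian data, and the chosen sizes are assumptions made purely for illustration.

```python
import numpy as np


def covariance_rank(n_samples: int, n_features: int, seed: int = 0) -> int:
    """Return the numerical rank of the sample covariance matrix of random data.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Rows are samples (replicates), columns are variables (dimensions).
    X = rng.standard_normal((n_samples, n_features))
    # rowvar=False tells NumPy that each column is a variable.
    C = np.cov(X, rowvar=False)
    return int(np.linalg.matrix_rank(C))


# Well-dimensioned case: many more samples than variables.
rank_well = covariance_rank(n_samples=50, n_features=5)

# Degenerate case: fewer samples than variables. Centering removes one
# degree of freedom, so the rank cannot exceed n_samples - 1.
rank_degenerate = covariance_rank(n_samples=5, n_features=50)

print("well-dimensioned rank:", rank_well, "of 5 variables")
print("degenerate rank:", rank_degenerate, "of 50 variables")
```

In the degenerate case most eigenvalues of the covariance matrix are zero, so standard PCA can recover at most n - 1 meaningful components regardless of how many variables were measured.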
