A random version of principal component analysis in data clustering
This work addresses a limitation of PCA-based data clustering for researchers dealing with high-dimensional data, but it appears incremental because it modifies an existing method.
The paper tackles the mathematical constraints that PCA places on sample size in high-dimensional data by introducing a modified algorithm that works on both well-dimensioned and degenerate datasets. The summary reports that the algorithm works but gives no specific numerical results.
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA imposes severe mathematical constraints on the minimum number of distinct replicates or samples that must be included in the analysis. Here we show that a modified algorithm works not only on well-dimensioned datasets but also on degenerate ones.
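
The sample-size constraint mentioned above can be illustrated with a minimal sketch. It does not reproduce the modified algorithm, whose details are not given here; it only shows standard PCA's limitation. With n samples and p variables, the sample covariance matrix has rank at most min(n - 1, p), so when n is small relative to p most principal directions carry no variance. The function name, the Gaussian test data, and the dataset sizes are assumptions chosen for illustration.

```python
import numpy as np

def covariance_rank(n_samples, n_features, seed=0):
    """Rank of the sample covariance matrix of random Gaussian data.

    Hypothetical helper for illustration only; not part of the paper's method.
    """
    rng = np.random.default_rng(seed)
    # Rows are samples (replicates), columns are variables.
    X = rng.standard_normal((n_samples, n_features))
    # rowvar=False tells NumPy that each column is a variable.
    cov = np.cov(X, rowvar=False)
    return int(np.linalg.matrix_rank(cov))

# Well-dimensioned case: many more samples than variables, full rank expected.
well = covariance_rank(n_samples=200, n_features=10)

# Degenerate case: fewer samples than variables, rank limited to n_samples - 1.
degenerate = covariance_rank(n_samples=5, n_features=10)

print("well-dimensioned rank:", well)
print("degenerate rank:", degenerate)
```

In the degenerate case the covariance matrix is singular, which is the situation the modified algorithm is stated to handle.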