LGST Oct 9, 2023

On the Correlation between Random Variables and their Principal Components

arXiv:2310.06139v1
h-index: 6
Originality: Synthesis-oriented
AI Analysis

The result links PCA and Factor Analysis theoretically, which could help optimize dimensionality reduction in statistical modeling.

The paper derives an algebraic formula for the correlation coefficients between random variables and their principal components, and shows that this formula is identical to the one used to calculate factor loadings in Factor Analysis. The result can be used to optimize the number of components in Principal Component Analysis and the number of factors in Factor Analysis.

The article seeks an algebraic formula for the correlation coefficients between random variables and the principal components that represent them. Starting from selected statistics of individual random variables, the analysis expresses their counterparts for a set of random variables in the language of linear algebra, using vectors and matrices. This makes it possible, in subsequent steps, to derive the desired formula. The formula found is identical to the one used in Factor Analysis to calculate factor loadings. The discussion shows that the formula can be applied to optimize the number of principal components in Principal Component Analysis, as well as the number of factors in Factor Analysis.
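
The relationship described in the abstract can be checked numerically. The sketch below is not taken from the paper. It assumes the standard setting in which PCA is run on the correlation matrix, so that the correlation between variable i and component j equals the eigenvector entry times the square root of the eigenvalue, which is the usual factor-loading expression. The function names, the synthetic data, and the 0.8 threshold used to pick the number of components are all hypothetical choices made for illustration; the paper's own selection criterion may differ.

```python
import numpy as np


def pc_loadings(X):
    """Loadings v_ij * sqrt(lambda_j) from the correlation matrix of X (columns = variables)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder so the largest component comes first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)     # scale column j by sqrt(lambda_j)
    return loadings, eigvecs


def empirical_correlations(X, eigvecs):
    """Sample correlations between each variable and each principal component score."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize the variables
    scores = Z @ eigvecs                      # principal component scores
    p = X.shape[1]
    full = np.corrcoef(Z, scores, rowvar=False)
    return full[:p, p:]                       # block: variables (rows) vs components (columns)


def smallest_k(loadings, threshold=0.8):
    """Smallest k such that the first k components explain at least `threshold`
    of the variance of every variable (illustrative criterion, an assumption)."""
    explained = np.cumsum(loadings ** 2, axis=1)   # per-variable cumulative squared loadings
    ok = (explained >= threshold).all(axis=0)
    return int(np.argmax(ok)) + 1


# Synthetic data: 6 observed variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.3 * rng.normal(size=(500, 6))

loadings, eigvecs = pc_loadings(X)
empirical = empirical_correlations(X, eigvecs)
k = smallest_k(loadings, threshold=0.8)

print("formula matches sample correlations:", np.allclose(loadings, empirical))
print("components retained:", k)
```

Because the squared loadings of each variable sum to 1 over all components, the cumulative sum along a row gives the share of that variable's variance represented by the first k components. That makes the loadings a natural tool for deciding how many components or factors to keep.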

