Computation of the Maximum Likelihood estimator in low-rank Factor Analysis
This work addresses computational bottlenecks in factor analysis, a dimensionality-reduction tool used in fields such as statistics and data science, and offers incremental algorithmic advances.
The paper tackles the challenging nonconvex optimization problem of maximum likelihood factor analysis by reformulating it as a nonlinear nonsmooth semidefinite optimization problem. It proposes fast, scalable algorithms based on difference of convex (DC) optimization, and its numerical experiments show significant improvements over existing state-of-the-art approaches.
Factor analysis, a classical multivariate statistical technique, is widely used as a fundamental tool for dimensionality reduction in statistics, econometrics and data science. Estimation is often carried out via the Maximum Likelihood (ML) principle, which seeks to maximize the likelihood under the assumption that the positive definite covariance matrix can be decomposed as the sum of a low-rank positive semidefinite matrix and a diagonal matrix with nonnegative entries. This leads to a challenging rank-constrained nonconvex optimization problem. We reformulate the low-rank ML Factor Analysis problem as a nonlinear nonsmooth semidefinite optimization problem, study various structural properties of this reformulation, and propose fast and scalable algorithms based on difference of convex (DC) optimization. Our approach has computational guarantees, scales gracefully to large problems, is applicable to situations where the sample covariance matrix is rank deficient, and adapts to variants of the ML problem with additional constraints on the problem parameters. Our numerical experiments demonstrate significant advantages of our approach over existing state-of-the-art approaches.
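
To make the objective concrete, here is a minimal sketch (not the paper's DC algorithm) that evaluates the Gaussian negative log-likelihood, up to additive and multiplicative constants, for a covariance of the form Sigma = L L^T + diag(psi). The function name `fa_neg_log_likelihood` and the symbols `S` (sample covariance), `L` (p x r loading matrix) and `psi` (diagonal entries) are illustrative assumptions, not notation taken from the abstract. The sketch also assumes strictly positive `psi` so that Sigma is invertible.

```python
import numpy as np


def fa_neg_log_likelihood(S, L, psi):
    """Evaluate log det(Sigma) + tr(Sigma^{-1} S), with Sigma = L L^T + diag(psi).

    This is the Gaussian negative log-likelihood up to constants.
    S   : (p, p) sample covariance matrix (may be rank deficient)
    L   : (p, r) loading matrix, so L L^T is PSD with rank at most r
    psi : (p,) diagonal entries (assumed strictly positive here)
    """
    Sigma = L @ L.T + np.diag(psi)
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must be positive definite")
    # Solve Sigma X = S instead of forming the explicit inverse.
    return logdet + np.trace(np.linalg.solve(Sigma, S))


# Hypothetical toy example: p = 3 variables, r = 1 factor.
rng = np.random.default_rng(0)
L_true = rng.normal(size=(3, 1))
psi_true = np.array([0.5, 1.0, 2.0])
S = L_true @ L_true.T + np.diag(psi_true)  # population covariance used as S

value_at_truth = fa_neg_log_likelihood(S, L_true, psi_true)
value_elsewhere = fa_neg_log_likelihood(S, np.zeros((3, 1)), np.ones(3))
print(value_at_truth <= value_elsewhere)
```

The difficulty described in the abstract comes from minimizing this quantity jointly over `L` and `psi` under the rank constraint; the sketch only evaluates the objective at a given point.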