Sequential Logistic Principal Component Analysis (SLPCA): Dimensional Reduction in Streaming Multivariate Binary-State System
This work addresses the need for adaptive statistical modeling in streaming data applications like energy profiling, but it is incremental as it builds upon existing Logistic PCA methods.
The authors tackled the problem of dimensionality reduction for streaming multivariate binary-state data by extending batch Logistic PCA to a sequential version (SLPCA) using sequential convex optimization, and demonstrated its convergence and performance in building energy end-use profile modeling.
Sequential or online dimensional reduction is of interests due to the explosion of streaming data based applications and the requirement of adaptive statistical modeling, in many emerging fields, such as the modeling of energy end-use profile. Principal Component Analysis (PCA), is the classical way of dimensional reduction. However, traditional Singular Value Decomposition (SVD) based PCA fails to model data which largely deviates from Gaussian distribution. The Bregman Divergence was recently introduced to achieve a generalized PCA framework. If the random variable under dimensional reduction follows Bernoulli distribution, which occurs in many emerging fields, the generalized PCA is called Logistic PCA (LPCA). In this paper, we extend the batch LPCA to a sequential version (i.e. SLPCA), based on the sequential convex optimization theory. The convergence property of this algorithm is discussed compared to the batch version of LPCA (i.e. BLPCA), as well as its performance in reducing the dimension for multivariate binary-state systems. Its application in building energy end-use profile modeling is also investigated.