LG · May 10, 2023

Blockwise Principal Component Analysis for monotone missing data imputation and dimensionality reduction

arXiv:2305.06042v2 · 4 citations
Originality: Incremental advance
AI Analysis

This is an incremental improvement for data analysts handling large datasets with monotone missing data.

The paper tackles the computational expense of combining imputation with dimensionality reduction for monotone missing data. It proposes a Blockwise Principal Component Analysis Imputation (BPI) framework, which significantly reduces imputation time and may allow MICE imputation to converge in cases where applying MICE directly to the data does not.

Monotone missing data is a common problem in data analysis. However, imputation combined with dimensionality reduction can be computationally expensive, especially as datasets grow. To address this issue, we propose a Blockwise principal component analysis Imputation (BPI) framework for dimensionality reduction and imputation of monotone missing data. The framework conducts Principal Component Analysis (PCA) on the observed part of each monotone block of the data, merges the resulting principal components, and then imputes the merged components using a chosen imputation technique. BPI can work with various imputation techniques and can significantly reduce imputation time compared to conducting dimensionality reduction after imputation. This makes it a practical and efficient approach for large datasets with monotone missing data. Our experiments validate the improvement in speed. They also show that while applying MICE imputation directly to the missing data may not yield convergence, applying BPI with MICE to the same data may lead to convergence.
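The blockwise idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `blockwise_pca_impute`, the block layout (a list of arrays sharing the same rows, where each row of a block is either fully observed or fully missing), the per-block component counts, and the use of scikit-learn's `SimpleImputer` as a stand-in for the "chosen imputation technique" are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer


def blockwise_pca_impute(blocks, n_components, imputer=None):
    """Sketch of blockwise PCA followed by imputation (assumed layout).

    blocks: list of 2-D arrays with the same number of rows. Within a
        block, a row is assumed to be either fully observed or fully NaN.
    n_components: number of principal components to keep per block.
    imputer: any object with fit_transform; mean imputation is used here
        only as a simple placeholder for the chosen imputation technique.
    """
    if imputer is None:
        imputer = SimpleImputer(strategy="mean")

    reduced = []
    for block, k in zip(blocks, n_components):
        # Rows of this block that contain no missing values.
        observed = ~np.isnan(block).any(axis=1)
        # Scores stay NaN for rows where the block is missing.
        scores = np.full((block.shape[0], k), np.nan)
        # Fit PCA on the observed part only and project those rows.
        scores[observed] = PCA(n_components=k).fit_transform(block[observed])
        reduced.append(scores)

    # Merge the per-block principal components, then impute the gaps.
    merged = np.hstack(reduced)
    return imputer.fit_transform(merged)


# Toy monotone pattern: later blocks are observed for fewer rows.
rng = np.random.default_rng(0)
b1 = rng.normal(size=(100, 5))   # fully observed
b2 = rng.normal(size=(100, 4))
b2[80:] = np.nan                 # missing for the last 20 rows
b3 = rng.normal(size=(100, 6))
b3[60:] = np.nan                 # missing for the last 40 rows

result = blockwise_pca_impute([b1, b2, b3], n_components=[2, 2, 3])
print(result.shape)
```

Because imputation runs on the merged component matrix (7 columns in this toy example) rather than on the original 15 features, the imputer handles a smaller problem, which is the source of the speed-up the abstract describes. Any imputer exposing `fit_transform` could be passed in place of the mean-imputation placeholder.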

