MELG May 2, 2023

MISNN: Multiple Imputation via Semi-parametric Neural Networks

arXiv:2305.01794v1
Originality: Highly original
AI Analysis

MISNN addresses a computational bottleneck for researchers in biomedical, social, and econometric fields who work with missing data, offering a more efficient and accurate approach to multiple imputation.

The paper tackles multiple imputation with feature selection in high-dimensional data, where existing methods are computationally inefficient and perform poorly. In the authors' experiments, the proposed MISNN algorithm outperforms state-of-the-art methods such as Bayesian Lasso and matrix completion in imputation accuracy, statistical consistency, and computation speed.

Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis. In the presence of high-dimensional data, imputation models that include feature selection, especially $\ell_1$ regularized regression (such as Lasso, adaptive Lasso, and Elastic Net), are common choices to prevent the model from underdetermination. However, conducting MI with feature selection is difficult: existing methods are often computationally inefficient and poor in performance. We propose MISNN, a novel and efficient algorithm that incorporates feature selection for MI. Leveraging the approximation power of neural networks, MISNN is a general and flexible framework, compatible with any feature selection method, any neural network architecture, high/low-dimensional data and general missing patterns. Through empirical experiments, MISNN has demonstrated great advantages over state-of-the-art imputation methods (e.g. Bayesian Lasso and matrix completion), in terms of imputation accuracy, statistical consistency and computation speed.
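To make the setting concrete, here is a minimal sketch of multiple imputation driven by an $\ell_1$-regularized regression, the kind of baseline the abstract calls a common choice. It is not the MISNN algorithm, whose details are not given on this page. The function names (`lasso_cd`, `multiply_impute`, `make_toy`), the toy data, the penalty value, and the use of a bootstrap to approximate between-imputation variability are all assumptions made for illustration. The sketch uses only the Python standard library and assumes roughly centred data with no intercept.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent (hypothetical helper).

    Minimises (1/2n) * ||y - X b||^2 + lam * ||b||_1, without an intercept.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def multiply_impute(X, y, lam=0.1, m=5, seed=0):
    """Return m completed copies of y, where None marks a missing entry."""
    rng = random.Random(seed)
    obs = [i for i, v in enumerate(y) if v is not None]
    mis = [i for i, v in enumerate(y) if v is None]
    completed = []
    for _ in range(m):
        # Assumption: bootstrap the observed rows so that each imputed
        # data set reflects uncertainty in the fitted model.
        boot = [rng.choice(obs) for _ in obs]
        Xb = [X[i] for i in boot]
        yb = [y[i] for i in boot]
        beta = lasso_cd(Xb, yb, lam)  # sparse fit doubles as feature selection
        res = [yb[k] - sum(b * x for b, x in zip(beta, Xb[k])) for k in range(len(boot))]
        sigma = (sum(r * r for r in res) / max(len(res) - 1, 1)) ** 0.5
        y_imp = list(y)
        for i in mis:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            # Add residual noise so imputations are draws, not point predictions
            y_imp[i] = pred + rng.gauss(0.0, sigma)
        completed.append(y_imp)
    return completed


def make_toy(n=60, p=10, seed=1):
    """Toy data: only the first two of p features drive the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.1) for row in X]
    y_missing = [None if i % 5 == 0 else v for i, v in enumerate(y)]
    return X, y, y_missing


if __name__ == "__main__":
    X, y_true, y_missing = make_toy()
    imputations = multiply_impute(X, y_missing, lam=0.1, m=5)
    print(len(imputations), "completed data sets")
```

Each of the `m` completed data sets would then be analysed separately and the results pooled, which is what distinguishes multiple imputation from filling in a single best guess. Refitting a sparse regression per imputation is also where the computational cost of such baselines tends to accumulate.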

