LG AI

Jun 26, 2021

FCMI: Feature Correlation based Missing Data Imputation

Prateek Mishra, Kumar Divya Mani, Prashant Johri, Dikhsa Arya

arXiv:2107.00100v1

4 citations

Originality: Incremental advance

AI Analysis

This work addresses data reliability in data analysis and prediction tasks, but it appears incremental because it builds on existing correlation-based imputation methods.

The paper tackles the problem of missing data in datasets by proposing FCMI, a feature correlation-based imputation technique that uses highly correlated attributes to build an optimized regression model, and reports that it outperforms existing imputation algorithms in experiments on classification and regression datasets.

Processed data are insightful, and crude data are obtuse. Missing values are a serious threat to data reliability; such data lead to inaccurate analysis and wrong predictions. We propose FCMI (Feature Correlation based Missing Data Imputation), an efficient technique for imputing missing values in a dataset based on correlation. Our central idea is to exploit the correlation among the attributes of the dataset. The proposed algorithm picks the highly correlated attributes and uses them to build a regression model whose parameters are optimized such that the correlation of the dataset is maintained. Experiments conducted on both classification and regression datasets show that the proposed imputation technique outperforms existing imputation algorithms.
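
The general idea of correlation-guided imputation described in the abstract can be illustrated with a minimal sketch. This is not the authors' FCMI algorithm: it swaps in plain ordinary least squares on the single most correlated attribute, and it does not optimize the regression parameters to preserve the dataset's correlation as the paper describes. The function names (`pearson`, `fit_line`, `impute_column`), the `threshold` parameter, and the mean fallback are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of floats."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def impute_column(rows, target, threshold=0.5):
    """Return a copy of `rows` with None values in column `target` filled.

    Assumption (not from the paper): use only the single attribute whose
    absolute correlation with the target is highest and at least
    `threshold`; fall back to the column mean when no such attribute
    exists or when its value is also missing in a given row.
    """
    best_col, best_r, best_pairs = None, 0.0, None
    for col in range(len(rows[0])):
        if col == target:
            continue
        # Use only rows where both attributes are observed.
        pairs = [(r[col], r[target]) for r in rows
                 if r[col] is not None and r[target] is not None]
        if len(pairs) < 2:
            continue
        xs, ys = zip(*pairs)
        r = pearson(xs, ys)
        if abs(r) >= threshold and abs(r) > abs(best_r):
            best_col, best_r, best_pairs = col, r, (xs, ys)

    observed = [r[target] for r in rows if r[target] is not None]
    mean = sum(observed) / len(observed)
    if best_col is not None:
        a, b = fit_line(*best_pairs)

    filled = [list(r) for r in rows]  # leave the input untouched
    for row in filled:
        if row[target] is None:
            if best_col is not None and row[best_col] is not None:
                row[target] = a + b * row[best_col]
            else:
                row[target] = mean
    return filled
```

Calling `impute_column(rows, target=1)` on a list of numeric rows that use `None` for missing entries returns a new list with column 1 filled in. A closer approximation of the paper would use several highly correlated attributes at once and tune the regression so that the correlation structure of the imputed dataset matches that of the observed data.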
