LG · Dec 6, 2022

Data Imputation with Iterative Graph Reconstruction

arXiv:2212.02810v2 · 13.631 citations · h-index: 7 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses missing-value imputation for tabular data, offering a domain-specific improvement over existing graph-based methods.

The paper tackles missing-data imputation with a framework that iteratively generates and reconstructs a graph of relations among samples, so that similar samples contribute more to each imputed value. On eight benchmark datasets it reports 39.13% lower mean absolute error compared with nine baselines and 9.04% lower than the second-best.

Effective data imputation demands rich latent "structure" discovery capabilities from "plain" tabular data. Recent advances in graph neural network-based data imputation show strong structure-learning potential by directly translating tabular data into bipartite graphs. However, because these solutions lack relations between samples, they treat all samples equally, which contradicts one important observation: "similar samples should give more information about missing values." This paper presents a novel Iterative graph Generation and Reconstruction framework for Missing data imputation (IGRM). Instead of treating all samples equally, we introduce the concept of "friend networks" to represent different relations among samples. To generate an accurate friend network from data with missing values, an end-to-end friend network reconstruction solution is designed to allow continuous friend network optimization during imputation learning. The representation of the optimized friend network is, in turn, used to further optimize the data imputation process with differentiated message passing. Experimental results on eight benchmark datasets show that IGRM yields 39.13% lower mean absolute error compared with nine baselines and 9.04% lower than the second-best. Our code is available at https://github.com/G-AILab/IGRM.
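
The ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' IGRM implementation (that is in the repository linked above): it assumes a fixed similarity heuristic in place of the learned, iteratively reconstructed friend network, and a similarity-weighted average in place of GNN message passing. All function names (`bipartite_edges`, `friend_network`, `impute`, `mae`) and the parameter `k` are hypothetical and chosen only for illustration.

```python
import numpy as np

def bipartite_edges(X):
    """View the table as a bipartite graph: one edge (sample, feature, value)
    for every observed (non-NaN) cell."""
    rows, cols = np.nonzero(~np.isnan(X))
    return rows, cols, X[rows, cols]

def friend_network(X, k=2):
    """Pick the k most similar samples ("friends") for each sample.
    Assumption: similarity = 1 / (1 + mean absolute difference) over the
    features both samples have observed; 0 if they share none."""
    n = X.shape[0]
    observed = ~np.isnan(X)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            shared = observed[i] & observed[j]
            if shared.any():
                diff = np.abs(X[i, shared] - X[j, shared]).mean()
                sim[i, j] = sim[j, i] = 1.0 / (1.0 + diff)
    # The diagonal stays 0, so a sample is never ranked ahead of a real friend.
    friends = np.argsort(-sim, axis=1)[:, :k]
    return friends, sim

def impute(X, k=2):
    """Fill each missing cell with a similarity-weighted average of the
    values its friends observed; fall back to the column mean otherwise."""
    friends, sim = friend_network(X, k)
    observed = ~np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    out = X.copy()
    for i, j in zip(*np.nonzero(~observed)):
        cand = [f for f in friends[i] if observed[f, j]]
        w = np.array([sim[i, f] for f in cand])
        if len(cand) > 0 and w.sum() > 0:
            out[i, j] = w @ X[cand, j] / w.sum()
        else:
            out[i, j] = col_means[j]
    return out

def mae(pred, truth, mask):
    """Mean absolute error over the cells selected by the boolean mask."""
    return np.abs(pred[mask] - truth[mask]).mean()

# Toy table: sample 0 is missing its third feature and closely matches sample 1.
X_demo = np.array([[1.0, 2.0, np.nan],
                   [1.0, 2.0, 3.0],
                   [10.0, 20.0, 30.0]])
X_hat = impute(X_demo, k=1)
```

In this toy table, sample 0 agrees with sample 1 on both shared features, so sample 1 is its only friend for `k=1` and the missing cell is filled with sample 1's value, 3.0, rather than the column mean of 16.5 that an "all samples equal" scheme would give.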
