LG | Apr 10, 2023

Data Imputation from the Perspective of Graph Dirichlet Energy

Weiqi Zhang, Guanlue Li, Jianheng Tang, Jia Li, Fugee Tsung

arXiv:2304.04474v22.02 citations | h-index: 48 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses data imputation for handling missing data in real-world applications, presenting an incremental improvement over existing methods.

The paper analyzes the "draft-then-refine" imputation strategy through graph Dirichlet energy. It finds that a basic draft imputation decreases the energy and that existing refinement methods such as GCN reduce it further, and it introduces GLPN to restore the energy balance. GLPN outperforms state-of-the-art methods on multiple real-world datasets under three missing data mechanisms.

Data imputation is a crucial task due to the widespread occurrence of missing data. Many methods adopt a two-step approach: initially crafting a preliminary imputation (the "draft") and then refining it to produce the final missing data imputation result, commonly referred to as "draft-then-refine". In our study, we examine this prevalent strategy through the lens of graph Dirichlet energy. We observe that a basic "draft" imputation tends to decrease the Dirichlet energy. Therefore, a subsequent "refine" step is necessary to restore the overall energy balance. Existing refinement techniques, such as the Graph Convolutional Network (GCN), often result in further energy reduction. To address this, we introduce a new framework, the Graph Laplacian Pyramid Network (GLPN). GLPN incorporates a U-shaped autoencoder and residual networks to capture both global and local details effectively. Through extensive experiments on multiple real-world datasets, GLPN consistently outperforms state-of-the-art methods across three different missing data mechanisms. The code is available at https://github.com/liguanlue/GLPN.
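The energy argument in the abstract can be illustrated on a toy graph. The sketch below is not taken from the GLPN repository. It assumes the common definition of Dirichlet energy, tr(XᵀLX), uses a simple column-mean fill as the "draft", and uses one step of GCN-style propagation with self-loops as the "refine" step. All function names (`laplacian`, `dirichlet_energy`, `mean_draft`, `gcn_propagation`) are hypothetical, and the paper may use a different normalization.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A

def dirichlet_energy(X, L):
    """tr(X^T L X); for L = D - A this is the sum over edges of ||x_i - x_j||^2."""
    return float(np.trace(X.T @ L @ X))

def mean_draft(X, mask):
    """Assumed 'draft' step: fill entries where mask is False with the
    column mean of the observed entries."""
    observed = np.where(mask, X, np.nan)
    col_mean = np.nanmean(observed, axis=0)
    return np.where(mask, X, col_mean)

def gcn_propagation(A):
    """Symmetric-normalized adjacency with self-loops:
    D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Toy path graph 0 - 1 - 2 with one feature per node; node 2 is missing.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X_true = np.array([[0.0], [1.0], [3.0]])
mask = np.array([[True], [True], [False]])

L = laplacian(A)
X_draft = mean_draft(X_true, mask)            # node 2 becomes 0.5
energy_true = dirichlet_energy(X_true, L)     # (0-1)^2 + (1-3)^2
energy_draft = dirichlet_energy(X_draft, L)   # (0-1)^2 + (1-0.5)^2

# One GCN-style smoothing step, measured against the matching
# normalized Laplacian I - P.
P = gcn_propagation(A)
L_norm = np.eye(A.shape[0]) - P
energy_before = dirichlet_energy(X_draft, L_norm)
energy_after = dirichlet_energy(P @ X_draft, L_norm)

print(energy_true, energy_draft)
print(energy_before, energy_after)
```

In this toy case the mean-filled draft has lower energy than the complete data, and the propagation step cannot raise the energy measured against `I - P`, because the eigenvalues of `P` lie in (-1, 1]. This matches the abstract's observation qualitatively; it says nothing about how GLPN itself restores the energy.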
