LG CV Apr 8, 2022

Controllable Missingness from Uncontrollable Missingness: Joint Learning Measurement Policy and Imputation

Seongwook Yoon, Jaehyun Kim, Heejeong Lim, Sanghoon Sull

arXiv:2204.03872v1 1.8 h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses the challenge of controlling measurement systems during data collection for imputation. The advance is incremental, as it builds on existing methods for handling missing data.

The paper tackles the problem of learning an optimal measurement policy and imputation method for incomplete data, where complete data is unavailable for training, by proposing a joint learning algorithm and data generation method. The results show that the algorithm is generally applicable and outperforms baseline methods across various missing rates on two datasets.

Because measurement incurs cost or interference, the measurement system must be controlled. Assuming that variables can be measured sequentially, there exists an optimal policy that chooses the next measurement given the previous observations. Although the optimal measurement policy depends on the goal of measurement, we focus mainly on recovering the complete data, that is, imputation. We also adapt the imputation method to the missingness, which varies with the measurement policy. However, learning the measurement policy and the imputation method requires complete data, which cannot be observed. To tackle this problem, we propose a data generation method and a joint learning algorithm. The main ideas are that 1) the data generation method is inherited by the imputation method, and 2) adapting the imputation encourages the measurement policy to learn more than it would through individual learning. We implemented several variations of the proposed algorithm on two different datasets with various missing rates. The experimental results demonstrate that our algorithm is generally applicable and outperforms baseline methods.
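To make the setting concrete, the loop of "choose the next measurement given the previous observations, then impute the rest" can be sketched as below. This is only an illustrative toy and not the algorithm proposed in the paper: it assumes a known Gaussian model (`mean`, `cov`) instead of learning from incomplete data, uses a hand-written greedy rule (`greedy_policy`) in place of a learned policy, and imputes with the conditional mean. All function names and numbers are hypothetical.

```python
def condition_on(mean, cov, j, value):
    """Condition a Gaussian (mean, cov) on observing variable j == value."""
    n = len(mean)
    # Regression coefficient of every variable on variable j.
    gain = [cov[i][j] / cov[j][j] for i in range(n)]
    new_mean = [mean[i] + gain[i] * (value - mean[j]) for i in range(n)]
    new_cov = [[cov[i][k] - gain[i] * cov[j][k] for k in range(n)]
               for i in range(n)]
    return new_mean, new_cov


def greedy_policy(cov, observed):
    """Assumed toy policy: measure the unobserved variable with the
    largest remaining (conditional) variance."""
    candidates = [i for i in range(len(cov)) if i not in observed]
    return max(candidates, key=lambda i: cov[i][i])


def measure_and_impute(x_true, mean, cov, budget):
    """Measure `budget` variables one at a time, then impute the rest
    with the conditional mean given everything measured so far."""
    observed = {}
    for _ in range(budget):
        j = greedy_policy(cov, observed)
        observed[j] = x_true[j]          # perform the measurement
        mean, cov = condition_on(mean, cov, j, x_true[j])
    imputed = [observed.get(i, mean[i]) for i in range(len(mean))]
    return imputed, sorted(observed)


# Hypothetical 3-variable example: variables 0 and 1 are correlated,
# variable 2 is independent of both.
prior_mean = [0.0, 0.0, 0.0]
prior_cov = [[4.0, 2.0, 0.0],
             [2.0, 3.0, 0.0],
             [0.0, 0.0, 1.0]]
x_true = [2.0, 1.5, -0.5]

imputed, measured = measure_and_impute(x_true, prior_mean, prior_cov, budget=1)
```

With a budget of one measurement, the rule measures variable 0 (the largest prior variance) and uses it to fill in variable 1 through their correlation, while variable 2 falls back to its prior mean. In a Gaussian model the conditional variance does not depend on the observed values, so this toy policy is not truly adaptive. A learned policy of the kind the abstract describes could instead depend on the values observed so far.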
