LG MLMay 24, 2024

Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random

Mingming Ha, Xuewen Tao, Wenfang Lin, Qionxu Ma, Wujiang Xu, Linxun Chen

arXiv:2405.15403v17.99 citationsh-index: 11NIPS

Originality Incremental advance

AI Analysis

This work addresses the issue of unstable and unbounded variance in estimators for missing-not-at-random data, which is critical for improving robustness in domains like recommendation systems and advertising, though it is incremental as it builds on existing bias-variance trade-off concepts.

The paper tackles the problem of missing-not-at-random data in applications like recommendation systems, which degrades model performance, by proposing a fine-grained dynamic framework that jointly optimizes bias and variance, reducing and bounding generalization bounds and variances with theoretical guarantees.

In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values and those missing values are generally missing-not-at-random, which deteriorates the prediction performance of models. Some existing estimators and regularizers attempt to achieve unbiased estimation to improve the predictive performance. However, variances and generalization bound of these methods are generally unbounded when the propensity scores tend to zero, compromising their stability and robustness. In this paper, we first theoretically reveal that limitations of regularization techniques. Besides, we further illustrate that, for more general estimators, unbiasedness will inevitably lead to unbounded variance. These general laws inspire us that the estimator designs is not merely about eliminating bias, reducing variance, or simply achieve a bias-variance trade-off. Instead, it involves a quantitative joint optimization of bias and variance. Then, we develop a systematic fine-grained dynamic learning framework to jointly optimize bias and variance, which adaptively selects an appropriate estimator for each user-item pair according to the predefined objective function. With this operation, the generalization bounds and variances of models are reduced and bounded with theoretical guarantees. Extensive experiments are conducted to verify the theoretical results and the effectiveness of the proposed dynamic learning framework.

View on arXiv PDF

Similar