LG, Nov 12, 2022

A Generalized Doubly Robust Learning Framework for Debiasing Post-Click Conversion Rate Prediction

Peking U
arXiv:2211.06684v1
67 citations
h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses selection bias in CVR prediction for industrial applications, offering incremental improvements over existing doubly robust methods.

The paper tackles the problem of selection bias in post-click conversion rate prediction by proposing a generalized doubly robust learning framework and two new methods, DR-BIAS and DR-MSE, which improve generalization performance, as validated through experiments on real-world and semi-synthetic datasets.

Post-click conversion rate (CVR) prediction is an essential task for discovering user interests and increasing platform revenues in a range of industrial applications. One of the most challenging problems of this task is the existence of severe selection bias caused by the inherent self-selection behavior of users and the item selection process of systems. Currently, doubly robust (DR) learning approaches achieve the state-of-the-art performance for debiasing CVR prediction. However, in this paper, by theoretically analyzing the bias, variance and generalization bounds of DR methods, we find that existing DR approaches may have poor generalization caused by inaccurate estimation of propensity scores and imputation errors, which often occur in practice. Motivated by such analysis, we propose a generalized learning framework that not only unifies existing DR methods, but also provides a valuable opportunity to develop a series of new debiasing techniques to accommodate different application scenarios. Based on the framework, we propose two new DR methods, namely DR-BIAS and DR-MSE. DR-BIAS directly controls the bias of DR loss, while DR-MSE balances the bias and variance flexibly, which achieves better generalization performance. In addition, we propose a novel tri-level joint learning optimization method for DR-MSE in CVR prediction, and an efficient training algorithm correspondingly. We conduct extensive experiments on both real-world and semi-synthetic datasets, which validate the effectiveness of our proposed methods.
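To make the quantities in the abstract concrete, here is a minimal sketch of the standard doubly robust loss estimator from the debiasing literature, not the paper's DR-BIAS or DR-MSE. It averages, over all user-item pairs, the imputed error plus a propensity-weighted correction on the clicked pairs. The function name `dr_loss` and the argument names are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np


def dr_loss(e, e_hat, o, p_hat):
    """Standard doubly robust estimate of the mean prediction error.

    e      -- true prediction errors (only used where o == 1)
    e_hat  -- imputed errors for every user-item pair
    o      -- click indicators (1 = clicked, conversion label observed)
    p_hat  -- estimated propensity (click probability) for every pair

    All names are hypothetical; this is an illustrative sketch only.
    """
    e = np.asarray(e, dtype=float)
    e_hat = np.asarray(e_hat, dtype=float)
    o = np.asarray(o, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)

    # Errors on unclicked pairs are unknown, so mask them out explicitly.
    observed_e = np.where(o == 1, e, 0.0)

    # Imputed error everywhere, corrected on clicked pairs by the
    # inverse-propensity-weighted imputation residual.
    correction = o * (observed_e - e_hat) / p_hat
    return float(np.mean(e_hat + correction))


if __name__ == "__main__":
    errors = [0.2, 0.4, 0.6, 0.8]
    clicks = [1, 0, 1, 0]
    bad_propensities = [0.9, 0.1, 0.3, 0.5]

    # With exact imputation the correction vanishes, so the estimate equals
    # the true mean error even though the propensities are arbitrary.
    print(dr_loss(errors, errors, clicks, bad_propensities))
```

The usage example shows one half of double robustness: when the imputed errors are exact, the estimate matches the true mean error regardless of the propensities. When both the propensities and the imputed errors are inaccurate, the correction term is scaled by 1 / p_hat, which is the source of the bias and variance issues the abstract describes.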

