Handling many conversions per click in modeling delayed feedback
This work is significant for advertisers and platforms that rely on performance-based digital advertising, as it aims to improve the accuracy of conversion optimization models by better handling delayed and multiple conversions.
The paper addresses the challenge of predicting post-click conversions in digital advertising, where a single click can lead to multiple conversions whose delays vary, are long-tailed, and shift over time. The authors propose an unbiased estimation model that splits labels into delay buckets, uses thermometer encoding, and incorporates auxiliary information.
Predicting the expected value or number of post-click conversions (purchases or other events) is a key task in performance-based digital advertising. In training a conversion optimizer model, one of the most crucial aspects is handling delayed feedback with respect to conversions, which can happen multiple times with varying delay. This task is difficult, as the delay distribution is different for each advertiser, is long-tailed, often does not follow any particular class of parametric distributions, and can change over time. We tackle these challenges using an unbiased estimation model based on three core ideas. The first idea is to split the label into a sum of labels for different delay buckets, each of which is trained only on mature labels; the second is to use thermometer encoding to increase accuracy and reduce inference cost; and the third is to use auxiliary information to increase the stability of the model and to handle drift in the distribution.
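The first two ideas can be illustrated with a minimal Python sketch. It is not the paper's implementation: the bucket edges, the function names bucket_labels and thermometer, and the reading of "mature" as "the click is at least as old as the bucket's upper delay bound" are all assumptions made for illustration. The abstract does not say where thermometer encoding is applied, so the sketch only shows the generic encoding of an ordinal bucket index.

```python
import numpy as np

# Hypothetical delay bucket edges in days: [0, 1), [1, 7), [7, 30].
BUCKET_EDGES = np.array([0.0, 1.0, 7.0, 30.0])


def bucket_labels(conversion_delays, click_age):
    """Split one click's conversions into per-bucket counts.

    conversion_delays: delays (days) of the conversions observed so far.
    click_age: days elapsed since the click.
    Returns (labels, mature). labels[i] counts conversions in bucket i;
    mature[i] is True only if bucket i's delay window has fully elapsed,
    so that its label is final and safe to train on (assumed definition).
    Conversions later than the last edge are ignored in this sketch.
    """
    delays = np.asarray(conversion_delays, dtype=float)
    labels, _ = np.histogram(delays, bins=BUCKET_EDGES)
    mature = click_age >= BUCKET_EDGES[1:]
    return labels, mature


def thermometer(index, n):
    """Generic thermometer code: positions 0..index are 1, the rest 0."""
    return (np.arange(n) <= index).astype(int)


# A click that is 8 days old with conversions after 0.5, 3.0 and 3.5 days.
labels, mature = bucket_labels([0.5, 3.0, 3.5], click_age=8.0)
print(labels.tolist())             # [1, 2, 0]
print(mature.tolist())             # [True, True, False]
print(thermometer(1, 3).tolist())  # [1, 1, 0]
```

In this example the first two buckets are mature and contribute training labels, while the third bucket is held back until the click is 30 days old. The full label is the sum of the per-bucket labels, so summing per-bucket predictions would give an estimate of the total conversion count under these assumptions.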