Automatic Historical Feature Generation through Tree-based Method in Ads Prediction
This work provides an incremental improvement in feature engineering for ad platforms by automating the generation of historical features, potentially leading to better ad targeting and revenue.
This paper addresses the challenge of automatically identifying counting keys for historical feature generation in ads click-through rate (CTR) prediction. The proposed tree-based method automatically identified counting features that outperformed manually curated features in both online learning and offline training settings on Twitter video advertising data.
Historical features are important in ads click-through rate (CTR) prediction because they account for past engagements between users and ads. In this paper, we study how to efficiently construct historical features through counting features. The key challenge of this problem lies in automatically identifying counting keys. We propose a tree-based method for counting key selection. The intuition is that a decision tree naturally provides various combinations of features, which can be used as counting key candidates. To select personalized counting features, we train one decision tree model per user, and the counting keys are selected across different users with a frequency-based importance measure. To validate the effectiveness of the proposed solution, we conduct large-scale experiments on Twitter video advertising data. In both online learning and offline training settings, the automatically identified counting features outperform the manually curated counting features.
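To make the idea concrete, the following is a minimal sketch of tree-based counting key selection, not the paper's implementation. It assumes categorical features, a small greedy Gini-based tree with multiway splits, and a simple importance measure (the number of users whose tree contains a given feature combination). The abstract does not specify the tree learner or the exact importance measure, so those choices, along with all function names (`tree_paths`, `select_counting_keys`, `count_by_key`) and the feature names in the usage note, are hypothetical.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_feature(rows, labels, features):
    """Feature whose multiway split most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[row[f]].append(y)
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if score < best_score - 1e-12:  # require a strict improvement
            best, best_score = f, score
    return best


def tree_paths(rows, labels, features, max_depth, prefix=()):
    """Yield the feature combination on each root-to-node path of a greedy tree."""
    if max_depth == 0 or not features or not rows:
        return
    f = best_feature(rows, labels, features)
    if f is None:
        return
    path = prefix + (f,)
    yield tuple(sorted(path))  # candidate counting key
    groups = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        groups[row[f]][0].append(row)
        groups[row[f]][1].append(y)
    remaining = [g for g in features if g != f]
    for sub_rows, sub_labels in groups.values():
        yield from tree_paths(sub_rows, sub_labels, remaining, max_depth - 1, path)


def select_counting_keys(data_by_user, features, max_depth=2, top_k=3):
    """Fit one tree per user; rank candidate keys by how many users produce them.

    Assumed importance measure: user frequency of each feature combination.
    """
    counts = Counter()
    for rows, labels in data_by_user.values():
        counts.update(set(tree_paths(rows, labels, features, max_depth)))
    return counts.most_common(top_k)


def count_by_key(rows, labels, key):
    """Counting feature for a key: {key value: [impressions, clicks]}."""
    stats = defaultdict(lambda: [0, 0])
    for row, y in zip(rows, labels):
        value = tuple(row[f] for f in key)
        stats[value][0] += 1
        stats[value][1] += y
    return dict(stats)
```

In use, `data_by_user` would map each user to a pair of (impression rows, click labels), for example rows such as `{"advertiser": "A", "hour": "am"}`. The keys returned by `select_counting_keys` can then be passed to `count_by_key` to compute historical impression and click counts per key value.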