LG · IR · ML · Apr 25, 2019

Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings

arXiv:1904.11547v1 · 219 citations
Originality: Incremental advance
AI Analysis

This paper addresses a critical issue in computational advertising for platforms that must serve new ads, though it is an incremental improvement over existing embedding techniques.

The paper tackles the cold-start problem in CTR prediction for new ads with little data by proposing Meta-Embedding, a meta-learning approach that learns to generate initial embeddings for ad IDs, resulting in significant improvements in both cold-start and warm-up phases across multiple models and datasets.

Click-through rate (CTR) prediction has been one of the most central problems in computational advertising. Lately, embedding techniques that produce low-dimensional representations of ad IDs have drastically improved CTR prediction accuracy. However, such learning techniques are data-demanding and work poorly on new ads with little logged data, which is known as the cold-start problem. In this paper, we aim to improve CTR predictions during both the cold-start phase and the warm-up phase when a new ad is added to the candidate pool. We propose Meta-Embedding, a meta-learning-based approach that learns to generate desirable initial embeddings for new ad IDs. The proposed method trains an embedding generator for new ad IDs by making use of previously learned ads through gradient-based meta-learning. In other words, our method learns how to learn better embeddings. When a new ad arrives, the trained generator initializes the embedding of its ID from the ad's contents and attributes. The generated embedding then speeds up model fitting during the warm-up phase, when a few labeled examples are available, compared to existing initialization methods. Experimental results on three real-world datasets show that Meta-Embedding can significantly improve both cold-start and warm-up performance for six existing CTR prediction models, ranging from lightweight models such as Factorization Machines to complicated deep models such as PNN and DeepFM. All of the above applies to conversion rate (CVR) prediction as well.
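The training idea in the abstract (generate an initial ID embedding from ad attributes, score it on a first batch, take one warm-up gradient step, score again on a second batch) can be illustrated with a small NumPy sketch. This is a toy under stated assumptions, not the authors' implementation: the CTR model is a frozen logistic model, the generator is a single linear map `W`, and all names (`meta_loss_and_grad`, `train_generator`, `make_toy_ad`, `inner_lr`, `alpha`) as well as the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss_grad_hess(phi, X, y):
    """Log loss of a frozen toy CTR model p = sigmoid(X @ phi),
    with its gradient and Hessian w.r.t. the ad ID embedding phi."""
    z = X @ phi
    p = sigmoid(z)
    loss = np.mean(np.logaddexp(0.0, z) - y * z)   # stable binary cross-entropy
    grad = X.T @ (p - y) / len(y)
    hess = (X.T * (p * (1.0 - p))) @ X / len(y)
    return loss, grad, hess

def meta_loss_and_grad(W, ad_feat, batch_a, batch_b, inner_lr=0.1, alpha=0.5):
    """Weighted cold-start + warm-up loss for one ad, and its gradient w.r.t. W.

    Assumption: the generator is linear, phi_init = W @ ad_feat.
    """
    phi_init = W @ ad_feat                          # generated initial embedding
    loss_a, g_a, H_a = log_loss_grad_hess(phi_init, *batch_a)   # cold-start phase
    phi_warm = phi_init - inner_lr * g_a            # one simulated warm-up step
    loss_b, g_b, _ = log_loss_grad_hess(phi_warm, *batch_b)     # warm-up phase
    meta = alpha * loss_a + (1.0 - alpha) * loss_b
    # Chain rule through the warm-up step: d(phi_warm)/d(phi_init) = I - inner_lr * H_a
    jac = np.eye(len(phi_init)) - inner_lr * H_a
    d_phi = alpha * g_a + (1.0 - alpha) * jac @ g_b
    return meta, np.outer(d_phi, ad_feat)           # d(meta)/dW for the linear generator

def train_generator(ads, dim, steps=200, outer_lr=0.05, seed=0):
    """SGD over previously seen ads; each ad is (features, batch_a, batch_b)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((dim, len(ads[0][0])))
    for _ in range(steps):
        feat, batch_a, batch_b = ads[rng.integers(len(ads))]
        _, grad_W = meta_loss_and_grad(W, feat, batch_a, batch_b)
        W -= outer_lr * grad_W
    return W

def make_toy_ad(rng, W_true, n=64):
    """Synthetic ad: attributes, plus two batches of (context vectors, clicks)."""
    feat = rng.normal(size=W_true.shape[1])
    phi_true = W_true @ feat
    def batch():
        X = rng.normal(size=(n, W_true.shape[0]))
        y = (rng.random(n) < sigmoid(X @ phi_true)).astype(float)
        return X, y
    return feat, batch(), batch()
```

In this sketch, a new ad would be cold-started with `W @ new_ad_feat` and then fine-tuned by ordinary gradient steps as clicks arrive. The weight `alpha` trades off cold-start loss against post-update loss; the value 0.5 here is illustrative only.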

Code Implementations: 1 repo
