IR · LG · May 15, 2023

FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning

arXiv:2305.08328v1 · 17 citations
Originality: Synthesis-oriented
AI Analysis

FedAds provides a standardized benchmark for researchers and practitioners in advertising and privacy-preserving machine learning, though the contribution is incremental, since it builds on existing vFL and CVR estimation methods.

The paper tackles the lack of standardized evaluations for vertical federated learning (vFL) in conversion rate (CVR) estimation by introducing FedAds, a benchmark that pairs a large-scale real-world dataset from Alibaba with systematic assessments of the effectiveness and privacy of various vFL algorithms.

Conversion rate (CVR) estimation aims to predict the probability of a conversion event after a user has clicked an ad. Typically, the online publisher holds users' browsing interests and click feedback, while the demand-side advertising platform collects users' post-click behaviors such as dwell time and conversion decisions. To estimate CVR accurately while better protecting data privacy, vertical federated learning (vFL) is a natural solution: it combines both sides' advantages for model training without exchanging raw data. Both CVR estimation and applied vFL algorithms have attracted increasing research attention. However, standardized and systematic evaluations are missing: lacking standardized datasets, existing studies adopt public datasets and simulate a vFL setting via hand-crafted feature partitions, which makes fair comparison difficult. We introduce FedAds, the first benchmark for CVR estimation with vFL, to facilitate standardized and systematic evaluation of vFL algorithms. It contains a large-scale real-world dataset collected from Alibaba's advertising platform, as well as systematic evaluations of both the effectiveness and the privacy aspects of various vFL algorithms. In addition, we explore incorporating unaligned data in vFL to improve effectiveness, and develop perturbation operations to protect privacy. We hope that future research in vFL and CVR estimation benefits from the FedAds benchmark.
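The two-party setting described in the abstract can be illustrated with a minimal sketch. It is a toy, not the FedAds implementation: the function `train_vfl`, the synthetic features, the embedding size, and the linear "bottom" and "top" models are all assumptions made for illustration. The publisher holds one feature block, the advertiser holds another block plus the conversion labels, and only intermediate embeddings and their gradients cross the party boundary. No perturbation or encryption is applied here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_vfl(x_pub, x_adv, y, dim=4, lr=0.1, steps=500, seed=0):
    """Toy two-party vertical FL loop (hypothetical); returns per-step log loss."""
    rng = np.random.default_rng(seed)
    w_pub = rng.normal(scale=0.1, size=(x_pub.shape[1], dim))  # publisher's bottom model
    w_adv = rng.normal(scale=0.1, size=(x_adv.shape[1], dim))  # advertiser's bottom model
    w_top = rng.normal(scale=0.1, size=dim)                    # advertiser's top model
    n = len(y)
    eps = 1e-12
    losses = []
    for _ in range(steps):
        # Publisher side: forward pass on its own raw features.
        # Only the embedding h_pub is sent to the advertiser.
        h_pub = x_pub @ w_pub

        # Advertiser side: fuse with its own embedding, predict CVR,
        # and compute the loss against its private conversion labels.
        h = h_pub + x_adv @ w_adv
        p = sigmoid(h @ w_top)
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))

        g_logit = (p - y) / n            # dLoss/dlogit, shape (n,)
        g_h = np.outer(g_logit, w_top)   # dLoss/dh, shape (n, dim)

        # Advertiser updates its own parameters locally.
        w_top -= lr * (h.T @ g_logit)
        w_adv -= lr * (x_adv.T @ g_h)

        # Only g_h is sent back; the publisher updates its bottom model
        # without ever seeing labels or the advertiser's features.
        w_pub -= lr * (x_pub.T @ g_h)
    return losses


# Synthetic, already-aligned samples (assumption: rows refer to the same users).
rng = np.random.default_rng(42)
x_pub = rng.normal(size=(512, 3))   # stand-in for publisher-side features
x_adv = rng.normal(size=(512, 2))   # stand-in for advertiser-side features
y = (x_pub[:, 0] + x_adv[:, 0] > 0).astype(float)  # stand-in conversion labels

losses = train_vfl(x_pub, x_adv, y)
print(f"log loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The sketch assumes every sample is aligned across both parties. The abstract's two extensions, using unaligned data and perturbing what is exchanged, would both change the messages passed in this loop.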

Code Implementations: 1 repo
