IR, LG | Jul 26, 2024

FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction

arXiv:2407.18472v1 | 4 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This work proposes a privacy-preserving method for cross-platform advertising. The advance is incremental: it extends traditional vertical federated learning (VFL) with a knowledge distillation module.

The paper tackles the problem of limited data alignment in vertical federated learning for click-through rate prediction by proposing FedUD, which exploits both aligned and unaligned data across platforms, resulting in improved prediction accuracy as demonstrated on real-world datasets.

Click-through rate (CTR) prediction plays an important role in online advertising platforms. Most existing methods use data from the advertising platform itself for CTR prediction. As user behaviors also exist on many other platforms, e.g., media platforms, it is beneficial to further exploit such complementary information for better modeling user interest and for improving CTR prediction performance. However, due to privacy concerns, data from different platforms cannot be uploaded to a server for centralized model training. Vertical federated learning (VFL) provides a possible solution which is able to keep the raw data on respective participating parties and learn a collaborative model in a privacy-preserving way. However, traditional VFL methods only utilize aligned data with common keys across parties, which strongly restricts their application scope. In this paper, we propose FedUD, which is able to exploit unaligned data, in addition to aligned data, for more accurate federated CTR prediction. FedUD contains two steps. In the first step, FedUD utilizes aligned data across parties like traditional VFL, but it additionally includes a knowledge distillation module. This module distills useful knowledge from the guest party's high-level representations and guides the learning of a representation transfer network. In the second step, FedUD applies the learned knowledge to enrich the representations of the host party's unaligned data such that both aligned and unaligned data can contribute to federated model training. Experiments on two real-world datasets demonstrate the superior performance of FedUD for federated CTR prediction.
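
The two steps described in the abstract can be illustrated with a toy, single-process sketch. It is not the authors' implementation. The abstract does not specify the architecture or the loss, so the linear transfer map, the mean-squared-error distillation objective, the synthetic data, and the names `fit_transfer` and `enrich` are all assumptions made for illustration. A real VFL deployment would also keep each party's raw data local, which this sketch does not model.

```python
import numpy as np


def fit_transfer(host_rep, guest_rep, lr=0.1, steps=500):
    """Step 1 (assumed form): fit a linear map W by gradient descent so that
    host_rep @ W approximates guest_rep under a mean-squared-error loss."""
    n, d_host = host_rep.shape
    W = np.zeros((d_host, guest_rep.shape[1]))
    for _ in range(steps):
        residual = host_rep @ W - guest_rep      # shape (n, d_guest)
        grad = 2.0 * host_rep.T @ residual / n   # gradient of the MSE w.r.t. W
        W -= lr * grad
    return W


def enrich(host_rep, W):
    """Step 2 (assumed form): append the predicted guest-side representation
    to the host representation of samples that have no aligned guest record."""
    return np.concatenate([host_rep, host_rep @ W], axis=1)


# Synthetic stand-ins for learned representations (hypothetical data).
rng = np.random.default_rng(0)
true_map = rng.normal(size=(4, 3))
aligned_host = rng.normal(size=(200, 4))
aligned_guest = aligned_host @ true_map          # toy assumption: linear relation
unaligned_host = rng.normal(size=(50, 4))

W = fit_transfer(aligned_host, aligned_guest)
enriched = enrich(unaligned_host, W)
distill_mse = float(np.mean((aligned_host @ W - aligned_guest) ** 2))
print(enriched.shape, distill_mse)
```

In this sketch the enriched unaligned samples have the same width as a concatenated aligned sample (host plus guest representation), so a single downstream CTR model could in principle be trained on both kinds of data.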

