IR · Apr 21

CS3: Efficient Online Capability Synergy for Two-Tower Recommendation

arXiv:2604.19269 · 65.9
Predicted impact: top 44% in IR · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners of large-scale recommender systems, CS3 offers a practical, plug-and-play solution to improve two-tower retrieval performance under strict latency constraints.

CS3 improves two-tower recommender systems by introducing three mechanisms (Cycle-Adaptive Structure, Cross-Tower Synchronization, Cascade-Model Sharing) that enhance representation capacity and alignment without increasing latency. In a large-scale advertising system, it achieves up to 8.36% revenue improvement while maintaining ms-level latency.

To balance effectiveness and efficiency in recommender systems, multi-stage pipelines commonly use lightweight two-tower models for large-scale candidate retrieval. However, the isolated two-tower architecture restricts representation capacity, embedding-space alignment, and cross-feature interactions. Existing solutions such as late interaction and knowledge distillation can mitigate these issues, but often increase latency or are difficult to deploy in online learning settings. We propose Capability Synergy (CS3), an efficient online framework that strengthens two-tower retrievers while preserving real-time constraints. CS3 introduces three mechanisms: (1) Cycle-Adaptive Structure for self-revision via adaptive feature denoising within each tower; (2) Cross-Tower Synchronization to improve alignment through lightweight mutual awareness between towers; and (3) Cascade-Model Sharing to enhance cross-stage consistency by reusing knowledge from downstream models. CS3 is plug-and-play with diverse two-tower backbones and compatible with online learning. Experiments on three public datasets show consistent gains over strong baselines, and deployment in a large-scale advertising system yields up to 8.36% revenue improvement across three scenarios while maintaining ms-level latency.
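For readers new to the setting, the following is a minimal sketch of the plain two-tower retrieval baseline that the abstract describes as "isolated". It is not the CS3 method and implements none of its three mechanisms. The names `make_tower` and `retrieve`, the single random linear layer, and all dimensions are hypothetical choices made for illustration. The sketch shows why the baseline is fast: item embeddings can be computed offline, so serving a request costs one user-tower pass plus dot products. It also shows the limitation: the two towers never see each other's features before the final score.

```python
import math
import random


def make_tower(in_dim, out_dim, seed):
    """Build a hypothetical one-layer tower: fixed random linear map + L2 normalization."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def tower(features):
        raw = [sum(w * x for w, x in zip(row, features)) for row in weights]
        norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # guard against a zero vector
        return [v / norm for v in raw]

    return tower


def retrieve(user_vec, item_index, k):
    """Score every cached item embedding by dot product and return the top-k item ids."""
    scored = [
        (sum(u * v for u, v in zip(user_vec, emb)), item_id)
        for item_id, emb in item_index.items()
    ]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Two independent towers: neither one sees the other's input features.
user_tower = make_tower(in_dim=4, out_dim=3, seed=0)
item_tower = make_tower(in_dim=5, out_dim=3, seed=1)

# Offline step: embed the whole (synthetic) item catalogue once and cache it.
rng = random.Random(42)
item_features = {f"item_{i}": [rng.random() for _ in range(5)] for i in range(100)}
item_index = {item_id: item_tower(feats) for item_id, feats in item_features.items()}

# Online step: one user-tower pass, then dot products against the cached index.
user_vec = user_tower([0.2, 0.9, 0.1, 0.5])
top_items = retrieve(user_vec, item_index, k=5)
print(top_items)
```

In a production system the linear scan in `retrieve` would typically be replaced by an approximate nearest-neighbour index, and the towers would be trained networks rather than fixed random maps.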

