IR · May 22

Memento: Personalized RAG-Style Long-Retention Data Scaling for META Ads Recommendation

arXiv:2605.24051 · 18.6
Predicted impact: top 60% in IR · last 90 days · Originality: Incremental advance
AI Analysis

For large-scale ads recommendation systems, Memento addresses long-context modeling with a practical, efficient retrieval-based approach that improves production CTR and CVR.

Memento introduces a personalized retrieval-augmented framework for ads recommendation that scales user history to 365+ days, achieving 5-10x resource efficiency over linear scaling and delivering 1% CTR lift and 1.2% CVR lift in production.

Modeling long user histories suffers from attention dilution in long context windows, poor system efficiency, and catastrophic forgetting, so naive linear scaling approaches such as LastN fail. We introduce Memento, a personalized retrieval-augmented framework that treats historical user engagements as a document corpus and ad requests as queries, retrieving relevant interactions via Maximal Marginal Relevance (MMR) to balance similarity with diversity. We identify two complementary applications: Representation Memento, which retrieves historical embeddings for feature augmentation, and Data Memento, which retrieves past training examples for multipass training. Through infrastructure co-design (temporal chunking, INT8 quantization, and asynchronous serving), Memento achieves 5-10× resource efficiency over linear scaling. Memento processes daily requests with sub-10ms latency, yielding a 0.25-0.3% Normalized Entropy gain on both click-through and conversion prediction. In production, Memento delivers a 1% CTR lift on Facebook Feed and Reels and a 1.2% CVR lift, scaling personalization to 365+ days of history.
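The MMR retrieval step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`cosine`, `mmr_select`), the use of cosine similarity, the trade-off weight `lam`, and the toy embeddings are all assumptions made for illustration. It shows only the generic greedy MMR rule, where each step picks the history item that is most relevant to the query while being least similar to items already picked.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    history: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.5,
) -> List[int]:
    """Greedy Maximal Marginal Relevance over a user's history (hypothetical helper).

    query   : embedding of the incoming ad request (assumed representation)
    history : embeddings of past engagements, i.e. the "document corpus"
    k       : maximum number of items to retrieve
    lam     : 1.0 means pure relevance; lower values favor diversity
    Returns the indices of the selected history items, in selection order.
    """
    relevance = [cosine(query, h) for h in history]
    selected: List[int] = []
    candidates = set(range(len(history)))

    while candidates and len(selected) < k:

        def mmr_score(i: int) -> float:
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (cosine(history[i], history[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return selected
```

With `lam=1.0` the function reduces to plain top-k by similarity, so near-duplicate engagements can fill the retrieved set. Lowering `lam` penalizes candidates that resemble already-selected items, which is the similarity-versus-diversity balance the abstract describes.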

