LG Dec 11, 2025

Inference Time Feature Injection: A Lightweight Approach for Real-Time Recommendation Freshness

arXiv:2512.14734v1
Originality: Incremental advance
AI Analysis

The paper addresses stale recommendations for users of long-form video streaming services. The advance is incremental because it builds on existing batch-trained models.

The paper tackles the problem of stale recommendations in long-form video streaming by introducing a lightweight, model-agnostic approach that injects recent user watch history at inference time, resulting in a statistically significant 0.47% increase in key user engagement metrics.

Many recommender systems in long-form video streaming rely on batch-trained models and batch-updated features, where user features are updated daily and served statically throughout the day. While efficient, this approach fails to incorporate a user's most recent actions, often resulting in stale recommendations. In this work, we present a lightweight, model-agnostic approach for intra-day personalization that selectively injects recent watch history at inference time without requiring model retraining. Our approach selectively overrides stale user features at inference time using the recent watch history, allowing the system to adapt instantly to evolving preferences. By reducing the personalization feedback loop from daily to intra-day, we observed a statistically significant 0.47% increase in key user engagement metrics, which ranked among the most substantial engagement gains observed in recent experimentation cycles. To our knowledge, this is the first published evidence that intra-day personalization can drive meaningful impact in long-form video streaming services, providing a compelling alternative to full real-time architectures where model retraining is required.
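
The override step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the feature schema or the merge rule, so everything here is an assumption made for illustration: the `UserFeatures` and `WatchEvent` records, the `inject_recent_history` function, the "most recent first" ordering, the deduplication, and the `max_len` truncation are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple


@dataclass(frozen=True)
class UserFeatures:
    """Hypothetical batch-computed user features, refreshed once per day."""
    user_id: str
    watch_history: Tuple[str, ...]  # item ids, most recent first, as of the snapshot
    snapshot_ts: int                # epoch seconds when the batch job ran


@dataclass(frozen=True)
class WatchEvent:
    """Hypothetical intra-day watch event from a near-real-time log."""
    item_id: str
    ts: int                         # epoch seconds of the watch


def inject_recent_history(
    batch: UserFeatures,
    events: Sequence[WatchEvent],
    max_len: int = 50,
) -> UserFeatures:
    """Return features whose watch history reflects events after the snapshot.

    Assumed merge rule: fresh events go first (newest first), followed by the
    batch history, with duplicates dropped and the result cut to max_len.
    The model itself is untouched; only its input features change.
    """
    # Keep only events the batch snapshot could not have seen.
    fresh = sorted(
        (e for e in events if e.ts > batch.snapshot_ts),
        key=lambda e: e.ts,
        reverse=True,
    )
    if not fresh:
        return batch  # nothing newer than the snapshot, serve batch features as-is

    merged = []
    seen = set()
    for item_id in [e.item_id for e in fresh] + list(batch.watch_history):
        if item_id not in seen:
            seen.add(item_id)
            merged.append(item_id)

    return replace(batch, watch_history=tuple(merged[:max_len]))
```

As a usage example under the same assumptions, a user whose batch history is `("a", "b", "c")` at snapshot time 100 and who then watches `b` at time 120 and `d` at time 150 would be served a history of `("d", "b", "a", "c")`. An event older than the snapshot is ignored, because the batch features are assumed to already contain it.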
