IR · AI · Apr 21

AgenticRecTune: Multi-Agent with Self-Evolving Skillhub for Recommendation System Optimization

arXiv:2604.26969 · 66.2
Predicted impact: top 43% in IR · last 90 days
Originality: Incremental advance
AI Analysis

For engineers managing large-scale recommendation systems, this automates the tedious and expertise-heavy process of re-tuning configurations after model changes, though the approach is incremental as it applies existing LLM reasoning to a known optimization bottleneck.

AgenticRecTune uses a multi-agent LLM framework to automate system-level configuration optimization in multi-stage recommendation pipelines, achieving efficient tuning without manual intervention. The framework autonomously proposes, tests, and learns from A/B experiments, reducing tuning effort while balancing multiple online metrics.

Modern large-scale recommendation systems are typically built as multi-stage pipelines comprising pre-ranking, ranking, and re-ranking phases. Traditional recommendation research typically focuses on optimizing a specific model, for example by improving the pre-ranking model structure or the ranking model's training algorithm. However, system-level configuration optimization, which integrates the outputs of each model head into the final score at each stage, also plays a crucial role. Because of the system's complexity, configuration optimization is both important and challenging. Any model modification requires new optimal system-level configurations, yet each experimental iteration demands significant tuning effort. Furthermore, the models in each stage operate within a distinct context and optimize for different targets, which requires specialized domain expertise. In addition, successful optimization depends on balancing multiple competing online metrics and staying aligned with shifting production development objectives. To address these challenges, we propose AgenticRecTune, an agentic framework comprising five specialized agents (Actor, Critic, Insight, Skill, and Online) designed to manage the end-to-end configuration optimization workflow. By leveraging the advanced reasoning of Large Language Models (LLMs), specifically Gemini, AgenticRecTune explores the configuration space for optimal settings. The Actor Agent proposes multiple candidates, and the Critic Agent filters out suboptimal proposals. The Online Agent then autonomously prepares A/B tests based on the configuration set approved by the Critic Agent and captures the subsequent experimental results. We also introduce a self-evolving Skillhub, in which the Insight Agent and Skill Agent collaborate to summarize historical results, extract the underlying mechanics of each task in the recommendation system, and update skills.
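The propose, filter, test, and learn loop described in the abstract can be pictured with a small toy sketch. This is not the paper's implementation: every name here (`fuse`, `actor_agent`, `critic_agent`, `online_agent`, `Skillhub`, the head names, the metric definitions, and the step-size rule) is a hypothetical stand-in, the LLM-driven agents are replaced by simple deterministic stubs, and the A/B test is simulated offline. It assumes a system-level configuration is a set of fusion weights that combine per-head model outputs into one stage score.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List

HEADS = ("click", "like", "watch_time")  # hypothetical model heads
Config = Dict[str, float]                # head name -> fusion weight


def fuse(head_scores: Dict[str, float], config: Config) -> float:
    """Combine per-head outputs into one stage-level score (weighted sum)."""
    return sum(config[h] * head_scores[h] for h in HEADS)


@dataclass
class Skillhub:
    """Toy memory: notes from past rounds plus a search step size."""
    notes: List[str] = field(default_factory=list)
    step: float = 0.5


def actor_agent(base: Config, hub: Skillhub, n: int, rng: random.Random) -> List[Config]:
    """Propose n candidates by perturbing the current weights (stub for an LLM)."""
    return [{h: max(0.05, base[h] + rng.uniform(-hub.step, hub.step)) for h in HEADS}
            for _ in range(n)]


def critic_agent(cands: List[Config], base: Config, keep: int) -> List[Config]:
    """Stand-in filter: keep the candidates that change the config the least."""
    return sorted(cands, key=lambda c: sum(abs(c[h] - base[h]) for h in HEADS))[:keep]


def online_agent(config: Config, items: List[Dict[str, float]], k: int = 20) -> Dict[str, float]:
    """Simulated A/B test: rank items by fused score, measure two metrics on the top k."""
    top = sorted(items, key=lambda it: fuse(it, config), reverse=True)[:k]
    return {"engagement": sum(it["click"] + 2 * it["watch_time"] for it in top) / k,
            "satisfaction": sum(it["like"] for it in top) / k}


def objective(metrics: Dict[str, float]) -> float:
    """Balance the competing metrics with an assumed equal weighting."""
    return 0.5 * metrics["engagement"] + 0.5 * metrics["satisfaction"]


def insight_agent(round_no: int, gain: float) -> str:
    """Summarize what the round's experiments showed."""
    return f"round {round_no}: best candidate changed the objective by {gain:+.4f}"


def skill_agent(hub: Skillhub, gain: float, note: str) -> None:
    """Update the Skillhub: record the note, narrow the search if nothing helped."""
    hub.notes.append(note)
    if gain <= 0:
        hub.step *= 0.5


def run(rounds: int = 5, seed: int = 0):
    rng = random.Random(seed)
    items = [{h: rng.random() for h in HEADS} for _ in range(200)]
    hub = Skillhub()
    best = {h: 1.0 for h in HEADS}
    start_obj = best_obj = objective(online_agent(best, items))
    for r in range(rounds):
        cands = critic_agent(actor_agent(best, hub, 8, rng), best, keep=3)
        top_obj, top = max(((objective(online_agent(c, items)), c) for c in cands),
                           key=lambda pair: pair[0])
        gain = top_obj - best_obj
        skill_agent(hub, gain, insight_agent(r, gain))
        if gain > 0:  # adopt a candidate only when it beats the current config
            best, best_obj = top, top_obj
    return best, best_obj, start_obj, hub
```

Calling `run()` returns the final weights, the final and starting objective values, and the accumulated Skillhub notes. Because a candidate is adopted only when it improves the objective, the final value never falls below the starting one in this toy setting. A real deployment would face noisy online metrics and would need statistical testing before adopting a configuration.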

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
