CL | Feb 6, 2025

ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization

Yinjie Wang, Ling Yang, Guohao Li, Mengdi Wang, Bryon Aragam

arXiv:2502.04306v1 | 25.741 citations | h-index: 23 | Has Code

Originality: Highly original

AI Analysis

This work addresses the problem of building efficient, scalable multi-agent systems for complex tasks such as question answering and coding. Its optimization approach builds incrementally on prior work but delivers strong performance gains.

The paper tackles the inflexibility and poor scalability of automated LLM agent workflow optimization by introducing ScoreFlow, a framework that combines gradient-based optimization with a novel Score-DPO method. It achieves an 8.2% improvement over baselines across six benchmarks and enables smaller models to outperform larger ones at lower inference cost.

Recent research has leveraged large language model multi-agent systems for complex problem-solving while trying to reduce the manual effort required to build them, driving the development of automated agent workflow optimization methods. However, existing methods remain inflexible due to representational limitations, a lack of adaptability, and poor scalability when relying on discrete optimization techniques. We address these challenges with ScoreFlow, a simple yet high-performance framework that leverages efficient gradient-based optimization in a continuous space. ScoreFlow incorporates Score-DPO, a novel variant of the direct preference optimization method that accounts for quantitative feedback. Across six benchmarks spanning question answering, coding, and mathematical reasoning, ScoreFlow achieves an 8.2% improvement over existing baselines. Moreover, it empowers smaller models to outperform larger ones with lower inference costs. Project: https://github.com/Gen-Verse/ScoreFlow
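The abstract does not spell out the Score-DPO objective. As a rough illustration of what "accounting for quantitative feedback" could look like, the sketch below computes the standard DPO loss for a single preference pair and then scales it by the score gap between the two samples. The weighting scheme, the function names, and the default beta are assumptions made for illustration; they are not taken from the paper.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_*     : log-probabilities under the policy being trained
    ref_logp_* : log-probabilities under the frozen reference model
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) == softplus(-margin), written in a stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def score_weighted_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l,
                            score_w, score_l, beta=0.1):
    """Hypothetical score-aware variant (an assumption, not the paper's formula).

    The pair's loss is scaled by the evaluation-score gap, so pairs whose
    scores differ more contribute more to the update.
    """
    weight = score_w - score_l  # assumes score_w >= score_l
    return weight * dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta)


if __name__ == "__main__":
    # Policy equals reference: the margin is 0 and the plain loss is log(2).
    print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
    # Same pair, with evaluation scores 0.9 vs 0.4: loss scaled by the gap.
    print(score_weighted_dpo_loss(-5.0, -5.0, -5.0, -5.0, 0.9, 0.4))
```

In this sketch a pair with nearly equal scores has almost no effect on training, while a pair with a large score gap dominates. That is one simple way to use numeric scores rather than only a binary preference.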

