Handoff Debt: The Rediscovery Cost When Coding Agents Take Over Interrupted Tasks
For developers and evaluators of coding agents, this work highlights the need to measure handoff efficiency, not just task completion, in realistic multi-agent workflows.
The paper introduces 'handoff debt', the rediscovery cost a coding agent pays when it resumes an interrupted task left by another agent. Across three successor models, context-bearing handoffs (raw trace, summary notes, or structured notes) reduce median agent events by 20–59% and cumulative prompt tokens by 42–63% relative to repository-only takeover.
Coding-agent benchmarks evaluate whether a single uninterrupted agent can resolve a repository issue. Real software work is messier: tasks are interrupted, reassigned, reviewed, and resumed from partial states left by another agent or engineer. We study this missing dimension through \emph{handoff debt}: the rediscovery cost imposed when a predecessor's work is opaque or incomplete. Our takeover protocol interrupts a coding agent at deterministic handoff points, freezes the repository, and evaluates successor agents under four handoff views: repository state only, raw trace, summary notes, and structured notes. Across 75 source tasks, the protocol generates 181 handoff-point tasks and 724 takeover runs per successor model. Across three successor models, context-bearing handoffs reduce median agent events by 20--59\% and cumulative prompt tokens by 42--63\% relative to repository-only takeover. Solved-rate effects are smaller and model-dependent, but efficiency gains are consistent. These findings suggest that coding-agent evaluation should report not only whether a task is solved, but also how costly that work is for another agent to resume.
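The run counts and the efficiency metric above can be illustrated with a minimal Python sketch. It assumes each handoff-point task is evaluated once under each of the four handoff views, which is consistent with 181 × 4 = 724 runs per successor model. All names (`TakeoverRun`, `enumerate_takeover_runs`, `median_reduction`) are hypothetical, and the event counts are made-up placeholders, not results from the paper. The abstract does not say exactly how the relative reduction is computed; the sketch uses a ratio of medians as one plausible reading.

```python
from dataclasses import dataclass
from itertools import product
from statistics import median

# The four handoff views named in the abstract (identifiers are illustrative).
HANDOFF_VIEWS = ("repo_only", "raw_trace", "summary_notes", "structured_notes")


@dataclass(frozen=True)
class TakeoverRun:
    """One successor-agent run: a frozen handoff point seen through one view."""
    handoff_point_id: int
    view: str


def enumerate_takeover_runs(num_handoff_points, views=HANDOFF_VIEWS):
    """Cross every handoff point with every view (assumed one run per pair)."""
    return [
        TakeoverRun(point, view)
        for point, view in product(range(num_handoff_points), views)
    ]


def median_reduction(baseline, treatment):
    """Relative reduction of the treatment median versus the baseline median.

    Assumption: a ratio of medians; the paper may aggregate differently.
    """
    base = median(baseline)
    if base == 0:
        raise ValueError("baseline median must be non-zero")
    return (base - median(treatment)) / base


# 181 handoff-point tasks under four views give the per-model run count.
runs = enumerate_takeover_runs(181)
print(len(runs))  # 724

# Placeholder event counts for illustration only (not data from the paper).
repo_only_events = [40, 50, 60]
summary_events = [20, 30, 45]
print(median_reduction(repo_only_events, summary_events))  # 0.4
```

The same `median_reduction` helper applies unchanged to cumulative prompt tokens, so both efficiency metrics can be reported next to the solved rate for each view.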