AI | CL | Feb 5

AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction

arXiv:2602.05353v2
h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of making agentic systems more interpretable and controllable for users. It is rated incremental because it builds on existing search and optimization methods.

The paper tackles the problem of interpreting and controlling opaque agentic systems. It introduces Agentic Workflow Reconstruction (AWR), a task that synthesizes explicit, interpretable workflows from black-box systems using only input-output access. Its framework, AgentXRay, achieves higher proxy similarity and lower token consumption than unpruned search.

Large Language Models have shown strong capabilities in complex problem solving, yet many agentic systems remain difficult to interpret and control due to opaque internal workflows. While some frameworks offer explicit architectures for collaboration, many deployed agentic systems operate as black boxes to users. We address this by introducing Agentic Workflow Reconstruction (AWR), a new task aiming to synthesize an explicit, interpretable stand-in workflow that approximates a black-box system using only input-output access. We propose AgentXRay, a search-based framework that formulates AWR as a combinatorial optimization problem over discrete agent roles and tool invocations in a chain-structured workflow space. Unlike model distillation, AgentXRay produces editable white-box workflows that match target outputs under an observable, output-based proxy metric, without accessing model parameters. To navigate the vast search space, AgentXRay employs Monte Carlo Tree Search enhanced by a scoring-based Red-Black Pruning mechanism, which dynamically integrates proxy quality with search depth. Experiments across diverse domains demonstrate that AgentXRay achieves higher proxy similarity and reduces token consumption compared to unpruned search, enabling deeper workflow exploration under fixed iteration budgets.
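
To make the search formulation concrete, the following is a minimal toy sketch of a tree search over chain-structured workflows with a depth-aware pruning rule. It is not the paper's implementation. The primitive names, `proxy_score`, `base_threshold`, and the pruning rule itself (a score threshold that rises linearly with depth) are assumptions made for illustration, since the abstract does not specify how Red-Black Pruning combines proxy quality and depth. The sketch also scores each node's own chain directly instead of running a rollout, and the "proxy metric" compares chains position by position rather than executing any workflow.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical primitives: agent roles and tool invocations (illustrative names).
PRIMITIVES = ["planner", "coder", "web_search", "reviewer"]
MAX_DEPTH = 4


def proxy_score(chain, target):
    """Toy stand-in for an output-based proxy metric, in [0, 1].

    A real setup would run the candidate workflow and compare its output
    with the black-box system's output; here chains are compared directly.
    """
    matches = sum(a == b for a, b in zip(chain, target))
    return matches / max(len(chain), len(target), 1)


@dataclass
class Node:
    chain: tuple
    parent: "Node | None" = None
    children: dict = field(default_factory=dict)
    visits: int = 0
    total: float = 0.0
    pruned: bool = False


def ucb(child, parent_visits, c=1.4):
    """UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.total / child.visits + bonus


def search(target, iterations=300, base_threshold=0.3, seed=0):
    rng = random.Random(seed)
    root = Node(chain=())
    best_chain, best_score = (), 0.0
    for _ in range(iterations):
        node = root
        # Selection and expansion: descend by UCB1 until a node has an
        # untried primitive, then append that primitive to the chain.
        while len(node.chain) < MAX_DEPTH:
            untried = [p for p in PRIMITIVES if p not in node.children]
            if untried:
                step = rng.choice(untried)
                node.children[step] = Node(node.chain + (step,), parent=node)
                node = node.children[step]
                break
            live = [ch for ch in node.children.values() if not ch.pruned]
            if not live:
                node.pruned = True  # every continuation was pruned
                break
            parent_visits = node.visits
            node = max(live, key=lambda ch: ucb(ch, parent_visits))
        # Evaluation: score the chain at this node (no rollout in this sketch).
        score = proxy_score(node.chain, target)
        # Assumed pruning rule: deeper chains must clear a higher threshold.
        if score < base_threshold * len(node.chain) / MAX_DEPTH:
            node.pruned = True
        if score > best_score:
            best_chain, best_score = node.chain, score
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    return best_chain, best_score


# Usage: recover a hidden three-step chain from its proxy score alone.
TARGET = ("planner", "web_search", "coder")
best_chain, best_score = search(TARGET)
print(best_chain, best_score)
```

Pruning here removes whole subtrees from selection, so the fixed iteration budget is spent on fewer, more promising branches. That is the intuition behind the abstract's claim that pruning enables deeper exploration, though the actual scoring rule in AgentXRay may differ.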

