Where Did It All Go Wrong? A Hierarchical Look into Multi-Agent Error Attribution
This work addresses debugging challenges in collaborative AI systems, though it appears incremental because it builds on existing error attribution approaches.
The paper tackles the problem of error attribution in LLM multi-agent systems by introducing ECHO, a novel algorithm that combines hierarchical context representation, objective analysis, and consensus voting to improve accuracy in pinpointing agent- and step-level failures. Experimental results show that ECHO outperforms existing methods across various multi-agent interaction scenarios, particularly in cases with subtle reasoning errors and complex interdependencies.
Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent- and step-level failures in interaction traces - whether using all-at-once evaluation, step-by-step analysis, or binary search - fall short when analyzing complex patterns, struggling with both accuracy and consistency. We present ECHO (Error attribution through Contextual Hierarchy and Objective consensus analysis), a novel algorithm that combines hierarchical context representation, objective analysis-based evaluation, and consensus voting to improve error attribution accuracy. Our approach leverages a position-based leveling of contextual understanding while maintaining objective evaluation criteria, ultimately reaching conclusions through a consensus mechanism. Experimental results demonstrate that ECHO outperforms existing methods across various multi-agent interaction scenarios, showing particular strength in cases involving subtle reasoning errors and complex interdependencies. Our findings suggest that structured, hierarchical context representation combined with consensus-based objective decision-making provides a more robust framework for error attribution in multi-agent systems.
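To make two of the ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each independent analysis returns one (agent, step) pair, that consensus means a plain majority vote with ties broken toward the earliest step, and that "position-based leveling" means steps closer to the step under inspection keep more detail. The names `consensus_attribution` and `context_level`, the level labels, and the distance thresholds are all hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import List, Optional, Tuple

# An attribution is (agent name, step index). This shape is an assumption.
Attribution = Tuple[str, int]


def consensus_attribution(votes: List[Attribution]) -> Optional[Attribution]:
    """Return the most frequent (agent, step) pair among independent analyses.

    Ties are broken toward the earliest step, then by agent name
    (an illustrative choice, not taken from the paper).
    """
    if not votes:
        return None
    counts = Counter(votes)
    best = max(counts.values())
    tied = [vote for vote, count in counts.items() if count == best]
    return min(tied, key=lambda vote: (vote[1], vote[0]))


def context_level(step: int, target: int, near: int = 1, mid: int = 3) -> str:
    """Hypothetical position-based leveling of context.

    Steps within `near` of the target keep full detail, steps within `mid`
    are summarized, and everything farther away is reduced to a headline.
    """
    distance = abs(step - target)
    if distance <= near:
        return "full"
    if distance <= mid:
        return "summary"
    return "headline"


if __name__ == "__main__":
    votes = [("planner", 2), ("coder", 4), ("planner", 2)]
    print(consensus_attribution(votes))
    print([context_level(s, target=5) for s in range(8)])
```

Under these assumptions, three analyses that return `("planner", 2)`, `("coder", 4)`, and `("planner", 2)` would attribute the failure to the planner at step 2. Whether the method described in the abstract weights votes or resolves ties differently is not stated in this text.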