Transparent Visual Reasoning via Object-Centric Agent Collaboration
This work addresses the problem of producing human-understandable explanations for visual AI decisions, though it appears incremental because it builds on existing object-centric and multi-agent approaches.
The paper tackles explainable AI in visual reasoning by introducing OCEAN, a framework that uses object-centric representations and multi-agent negotiation. OCEAN achieves performance competitive with state-of-the-art black-box models, and user-study participants rated its explanations as more intuitive and trustworthy.
A central challenge in explainable AI, particularly in the visual domain, is producing explanations grounded in human-understandable concepts. To tackle this, we introduce OCEAN (Object-Centric Explananda via Agent Negotiation), a novel, inherently interpretable framework built on object-centric representations and a transparent multi-agent reasoning process. The game-theoretic reasoning process drives agents to agree on coherent and discriminative evidence, resulting in a faithful and interpretable decision-making process. We train OCEAN end-to-end and benchmark it against standard visual classifiers and popular post-hoc explanation tools such as Grad-CAM and LIME on two diagnostic multi-object datasets. Our results show that OCEAN performs competitively with state-of-the-art black-box models while maintaining a faithful reasoning process. This is reflected in our user study, in which participants consistently rated OCEAN's explanations as more intuitive and trustworthy.