HCAI · Feb 27, 2025

ACE, Action and Control via Explanations: A Proposal for LLMs to Provide Human-Centered Explainability for Multimodal AI Assistants

arXiv:2503.16466v1 · 3 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses human-centered explainability for multimodal AI assistants in manufacturing domains, though as a conceptual framework it appears incremental.

The paper tackles challenges in participatory design and training of multimodal AI assistants for manufacturing by proposing the ACE paradigm, which uses LLMs to generate human-interpretable semantic frame explanations that enable end users to provide data for aligning multimodal models.

In this short paper we address issues related to building multimodal AI systems for human performance support in manufacturing domains. We make two contributions: first, we identify challenges of participatory design and training of such systems; second, to address these challenges, we propose the ACE paradigm: "Action and Control via Explanations". Specifically, we suggest that LLMs can be used to produce explanations in the form of human-interpretable "semantic frames", which in turn enable end users to provide the data the AI system needs to align its multimodal models and representations, including computer vision, automatic speech recognition, and document inputs. By using LLMs to "explain" via semantic frames, ACE will help the human and the AI system collaborate, together building a more accurate model of human activities and behaviors, and ultimately producing more accurate predictive outputs for better task support and better outcomes for human users performing manual tasks.
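The abstract does not specify what a semantic frame looks like or how user feedback is collected, so the following is only a minimal sketch under assumptions. The `SemanticFrame` slots (`action`, `agent`, `obj`, `tool`), the `AlignmentLog` class, and the example values are all hypothetical and are not taken from the paper. The sketch shows the general idea: the system renders a predicted frame as a readable explanation, the end user corrects a slot, and the correction is recorded as data that could later be used for alignment.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SemanticFrame:
    """Hypothetical frame: who performs which action on which object, with what tool."""
    action: str
    agent: str
    obj: str
    tool: str = ""

    def explain(self) -> str:
        # Render the frame as a sentence an end user can check.
        text = f"{self.agent} performs '{self.action}' on '{self.obj}'"
        return text + (f" using '{self.tool}'" if self.tool else "")


@dataclass
class AlignmentLog:
    """Hypothetical store of user corrections as (slot, predicted, corrected) records."""
    records: list = field(default_factory=list)

    def correct(self, frame: SemanticFrame, **fixes: str) -> SemanticFrame:
        # Record only the slots the user actually changed.
        for slot, value in fixes.items():
            predicted = getattr(frame, slot)
            if predicted != value:
                self.records.append((slot, predicted, value))
        # Return a new frame with the user's values applied.
        return replace(frame, **fixes)


# Assumed example: the system misidentifies the tool and the user fixes it.
predicted = SemanticFrame(action="tighten", agent="operator", obj="bolt", tool="wrench")
log = AlignmentLog()
corrected = log.correct(predicted, tool="torque driver")

print(predicted.explain())
print(corrected.explain())
print(log.records)
```

In this sketch the frame is immutable, so each correction yields a new frame and the log keeps the predicted and corrected values side by side. How such records would feed back into the vision, speech, or document models is left open here, as it is in the abstract.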

