CL · Mar 26, 2024

Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

arXiv:2403.17359v2 · 34 citations · h-index: 10 · ICLR
Originality: Highly original
AI Analysis

The paper addresses unreliable and limited reasoning in QA systems, a known bottleneck for users who need accurate, real-time information from diverse sources, and proposes a novel method for it.

The paper tackles the problems of unfaithful hallucination and weak reasoning in multimodal and retrieval-augmented question answering by introducing the Chain-of-Action (CoA) framework, which uses a novel reasoning-retrieval mechanism and achieves improved performance over other methods on public benchmarks and a Web3 case study.

We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). Compared to the literature, CoA overcomes two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts and (ii) weak reasoning performance over compositional information. Our key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a reasoning chain via systematic prompting and pre-designed actions. Methodologically, we propose three types of domain-adaptable "Plug-and-Play" actions for retrieving real-time information from heterogeneous sources. We also propose a multi-reference faith score (MRFS) to verify and resolve conflicts in the answers. Empirically, we exploit both public benchmarks and a Web3 case study to demonstrate the capability of CoA over other methods.
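
The reasoning-retrieval loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Step` structure, the action names, the hard-coded retrieval results, and above all the token-overlap score, which is a simple stand-in and not the paper's MRFS. The sketch only shows the general shape: each sub-question carries an initial guess, a pluggable action retrieves references, and a score over multiple references decides whether to keep the guess or fall back to retrieved text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    sub_question: str  # one node in the reasoning chain
    action: str        # name of the action used to fetch references
    guess: str         # initial answer (hard-coded here; an LLM would supply it)


def overlap_score(answer: str, reference: str) -> float:
    """Fraction of answer tokens that also appear in the reference."""
    tokens = answer.lower().split()
    ref_tokens = set(reference.lower().split())
    if not tokens:
        return 0.0
    return sum(t in ref_tokens for t in tokens) / len(tokens)


def multi_reference_score(answer: str, references: List[str]) -> float:
    """Best overlap across all references (toy stand-in, NOT the paper's MRFS)."""
    return max((overlap_score(answer, r) for r in references), default=0.0)


def run_chain(
    steps: List[Step],
    actions: Dict[str, Callable[[str], List[str]]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep each guess if the references support it, else use the first reference."""
    answers = []
    for step in steps:
        references = actions[step.action](step.sub_question)
        supported = multi_reference_score(step.guess, references) >= threshold
        if supported or not references:
            answers.append(step.guess)
        else:
            answers.append(references[0])
    return answers


# Hypothetical actions returning canned text in place of real retrieval.
actions = {
    "web": lambda q: ["the capital of france is paris"],
    "kb": lambda q: ["the seine runs through paris"],
}

steps = [
    Step("What is the capital of France?", "web", "Paris"),
    Step("Which river runs through it?", "kb", "the river thames"),
]

answers = run_chain(steps, actions)
print(answers)
```

In this toy run the first guess is fully supported by its reference and is kept, while the second guess shares only one of three tokens with its reference, falls below the threshold, and is replaced by the retrieved text.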

Code Implementations: 1 repo
