CV · Sep 5, 2023

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

arXiv:2309.02155v1 · 14 citations · h-index: 18
Originality: Incremental advance
AI Analysis

This work addresses the challenge of producing trustworthy and consistent explanations for VQA systems, which is crucial for user trust. It is an incremental advance, as it builds on existing semi-supervised and self-critical methods.

The paper tackles the problem of generating natural language explanations for VQA models, targeting two obstacles: logical inconsistency and high annotation costs. Its semi-supervised self-critical learning approach achieves state-of-the-art performance on two VQA-NLE datasets.

The VQA Natural Language Explanation (VQA-NLE) task aims to explain the decision-making process of VQA models in natural language. Unlike traditional attention or gradient analysis, free-text rationales can be easier to understand and can help gain users' trust. Existing methods mostly use post-hoc or self-rationalization models to obtain a plausible explanation. However, these frameworks are bottlenecked by two challenges: 1) the explanations do not faithfully reflect the reasoning process and suffer from logical inconsistency; 2) human-annotated explanations are expensive and time-consuming to collect. In this paper, we propose a new Semi-Supervised VQA-NLE via Self-Critical Learning (S3C), which evaluates candidate explanations with answering rewards to improve the logical consistency between answers and rationales. With a semi-supervised learning framework, S3C can benefit from a tremendous amount of samples without human-annotated explanations. Extensive automatic measures and human evaluations show the effectiveness of our method, and the framework achieves new state-of-the-art performance on two VQA-NLE datasets.
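To make the idea of scoring candidate explanations with an answering reward more concrete, here is a minimal sketch of a generic self-critical (baseline-subtracted REINFORCE) loss. It is not the paper's implementation. The function name, the choice of a greedy-decoded explanation as the baseline, and all reward values are assumptions made for illustration; the abstract does not specify how the reward is computed.

```python
import math
from typing import Sequence


def self_critical_loss(
    sample_token_logprobs: Sequence[float],
    sample_reward: float,
    baseline_reward: float,
) -> float:
    """Generic self-critical loss for one sampled explanation.

    sample_token_logprobs: log-probabilities of the tokens in the sampled
        explanation under the current model.
    sample_reward: answering reward for the sampled explanation
        (hypothetical, e.g. how well it supports the correct answer).
    baseline_reward: answering reward for a baseline explanation
        (assumed here to come from greedy decoding).
    """
    # The advantage is positive when the sample beats the baseline.
    advantage = sample_reward - baseline_reward
    # Minimizing this loss raises the log-probability of samples with a
    # positive advantage and lowers it for samples with a negative one.
    return -advantage * sum(sample_token_logprobs)


# Hypothetical two-token explanation with made-up rewards.
sample_logprobs = [math.log(0.5), math.log(0.25)]
loss = self_critical_loss(sample_logprobs, sample_reward=0.9, baseline_reward=0.6)
```

Because a reward of this kind depends only on the answer and not on a reference explanation, a term like this could in principle be applied to samples that lack human-annotated explanations, which is one plausible reading of how the semi-supervised setting fits in.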

