CL, AI, CV · Apr 7, 2020

e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations

arXiv:2004.03744v3 · 38 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses label-quality issues in SNLI-VE, a multimodal reasoning dataset. It is incremental in that it builds on existing resources rather than introducing a new task.

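The authors address label errors in the SNLI-VE dataset for visual-textual entailment by correcting the class with the highest error rate, producing SNLI-VE-2.0, and then appending human-written explanations, producing e-SNLI-VE. They re-evaluate an existing model on the corrected corpus, compare its performance with that on the non-corrected corpus, and train models that generate explanations at test time.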

The recently proposed SNLI-VE corpus for recognising visual-textual entailment is a large, real-world dataset for fine-grained multimodal reasoning. However, the automatic way in which SNLI-VE has been assembled (via combining parts of two related datasets) gives rise to a large number of errors in the labels of this corpus. In this paper, we first present a data collection effort to correct the class with the highest error rate in SNLI-VE. Secondly, we re-evaluate an existing model on the corrected corpus, which we call SNLI-VE-2.0, and provide a quantitative comparison with its performance on the non-corrected corpus. Thirdly, we introduce e-SNLI-VE, which appends human-written natural language explanations to SNLI-VE-2.0. Finally, we train models that learn from these explanations at training time, and output such explanations at testing time.

Code Implementations: 3 repos