CLIRLGMar 15, 2021

Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence

arXiv:2103.08541v1774 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of robust fact verification for AI systems dealing with evolving information, though it is incremental as it builds on existing benchmarks with a novel design.

The authors tackled the problem of fact verification models being sensitive to subtle changes in evidence over time by creating VitaminC, a benchmark with over 400,000 contrastive claim-evidence pairs based on Wikipedia revisions, which improved model accuracy by 10% on adversarial fact verification and 6% on adversarial NLI.

Typical fact verification models use retrieved written evidence to verify claims. Evidence sources, however, often change over time as more information is gathered and revised. In order to adapt, models must be sensitive to subtle differences in supporting evidence. We present VitaminC, a benchmark infused with challenging cases that require fact verification models to discern and adjust to slight factual changes. We collect over 100,000 Wikipedia revisions that modify an underlying fact, and leverage these revisions, together with additional synthetically constructed ones, to create a total of over 400,000 claim-evidence pairs. Unlike previous resources, the examples in VitaminC are contrastive, i.e., they contain evidence pairs that are nearly identical in language and content, with the exception that one supports a given claim while the other does not. We show that training using this design increases robustness -- improving accuracy by 10% on adversarial fact verification and 6% on adversarial natural language inference (NLI). Moreover, the structure of VitaminC leads us to define additional tasks for fact-checking resources: tagging relevant words in the evidence for verifying the claim, identifying factual revisions, and providing automatic edits via factually consistent text generation.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes