CL, Apr 23, 2018

Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation

arXiv:1804.08207v21180 citations
AI Analysis

This collection gives researchers a resource for assessing how well sentence representations capture distinct reasoning types, though the contribution is incremental because it repurposes existing data.

To evaluate sentence representations, the authors built a large-scale collection of diverse natural language inference (NLI) datasets: 13 existing datasets covering 7 semantic phenomena, recast into a common NLI structure and totaling over half a million labeled context-hypothesis pairs.

We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources.
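To make the idea of recasting concrete, here is a minimal sketch of what a common context-hypothesis record and one recasting rule might look like. Everything in it is an assumption for illustration: the `NLIPair` fields, the `recast_sentiment` helper, the sentence templates, and the two-way label names are hypothetical and are not taken from the DNC release.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NLIPair:
    """One labeled context-hypothesis pair (hypothetical schema)."""
    context: str
    hypothesis: str
    label: str        # assumed label set: "entailed" / "not-entailed"
    phenomenon: str   # semantic phenomenon the pair is meant to probe
    source: str       # name of the dataset the pair was recast from


def recast_sentiment(review: str, is_positive: bool, source: str) -> List[NLIPair]:
    """Recast one sentiment-labeled review into two NLI pairs.

    The templates below are illustrative only, not the authors' templates.
    """
    context = f'When asked about the product, the reviewer said, "{review}"'
    pairs = []
    for hyp_positive, verb in ((True, "liked"), (False, "did not like")):
        # A hypothesis is entailed only if its polarity matches the gold label.
        label = "entailed" if hyp_positive == is_positive else "not-entailed"
        pairs.append(
            NLIPair(
                context=context,
                hypothesis=f"The reviewer {verb} the product.",
                label=label,
                phenomenon="sentiment",
                source=source,
            )
        )
    return pairs


if __name__ == "__main__":
    for pair in recast_sentiment("Works great.", True, "toy-reviews"):
        print(pair.label, "|", pair.hypothesis)
```

Each source example yields one entailed and one not-entailed pair here, so the recast data stays balanced by construction. That is a design choice of this sketch, not a claim about how the collection itself was built.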
