CL, Oct 6, 2020

Modeling Preconditions in Text with a Crowd-sourced Dataset

arXiv:2010.02429v3995 citations
AI Analysis

This paper addresses the lack of large-scale labeled data for preconditions in natural language processing, enabling improved reasoning about how events connect. The contribution is incremental, since it builds on existing annotation efforts.

The paper tackles the problem of modeling preconditions in text by introducing PeKo, a crowd-sourced dataset an order of magnitude larger than prior annotations, and shows that fine-tuning a language model on it yields better conditional relations than training on raw text or temporally-ordered corpora.

Preconditions provide a form of logical connection between events that explains why some events occur together and information that is complementary to the more widely studied relations such as causation, temporal ordering, entailment, and discourse relations. Modeling preconditions in text has been hampered in part due to the lack of large scale labeled data grounded in text. This paper introduces PeKo, a crowd-sourced annotation of preconditions between event pairs in newswire, an order of magnitude larger than prior text annotations. To complement this new corpus, we also introduce two challenge tasks aimed at modeling preconditions: (i) Precondition Identification -- a standard classification task defined over pairs of event mentions, and (ii) Precondition Generation -- a generative task aimed at testing a more general ability to reason about a given event. Evaluation on both tasks shows that modeling preconditions is challenging even for today's large language models (LM). This suggests that precondition knowledge is not easily accessible in LM-derived representations alone. Our generation results show that fine-tuning an LM on PeKo yields better conditional relations than when trained on raw text or temporally-ordered corpora.

Code Implementations: 1 repo
