CL · Oct 23, 2021

PASTRIE: A Corpus of Prepositions Annotated with Supersense Tags in Reddit International English

arXiv:2110.12243v1990 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides a resource for studying how a speaker's L1 affects preposition usage in English, but the contribution is incremental: it focuses on dataset creation and offers no major methodological breakthroughs.

The authors introduce the PASTRIE corpus, a dataset of manually annotated preposition supersenses in English Reddit text from presumed speakers of four L1s (English, French, German, and Spanish), and analyze distributional patterns and the influence of L1 on L2 preposition choice.

We present the Prepositions Annotated with Supersense Tags in Reddit International English ("PASTRIE") corpus, a new dataset containing manually annotated preposition supersenses of English data from presumed speakers of four L1s: English, French, German, and Spanish. The annotations are comprehensive, covering all preposition types and tokens in the sample. Along with the corpus, we provide analysis of distributional patterns across the included L1s and a discussion of the influence of L1s on L2 preposition choice.

Code Implementations: 1 repo
