Generating event descriptions under syntactic and semantic constraints
This work addresses the need for scalable lexical semantic annotation tools for linguists and NLP researchers. Its contribution is incremental: it compares existing generation methods rather than introducing new ones.
The paper tackles the problem of generating event descriptions under syntactic and semantic constraints to support lexical semantic annotation. It finds that manual generation by experts produces more natural, typical, and distinctive descriptions than automated methods such as corpus sampling or language model sampling, but that the automated methods still yield descriptions of sufficient quality for downstream use, provided the downstream annotation and analysis methods are robust to a small amount of degradation.
With the goal of supporting scalable lexical semantic annotation, analysis, and theorizing, we conduct a comprehensive evaluation of different methods for generating event descriptions under both syntactic constraints -- e.g. desired clause structure -- and semantic constraints -- e.g. desired verb sense. We compare three different methods -- (i) manual generation by experts; (ii) sampling from a corpus annotated for syntactic and semantic information; and (iii) sampling from a language model (LM) conditioned on syntactic and semantic information -- along three dimensions of the generated event descriptions: (a) naturalness, (b) typicality, and (c) distinctiveness. We find that all methods reliably produce natural, typical, and distinctive event descriptions, but that manual generation continues to produce event descriptions that are more natural, typical, and distinctive than the automated generation methods. We conclude that the automated methods we consider produce event descriptions of sufficient quality for use in downstream annotation and analysis insofar as the methods used for this annotation and analysis are robust to a small amount of degradation in the resulting event descriptions.
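To make method (ii) concrete, the following is a minimal sketch of sampling event descriptions from an annotated corpus under a syntactic and a semantic constraint. It is an illustration under stated assumptions, not the procedure used in the paper: the `AnnotatedSentence` type, the frame and sense labels, the toy corpus, and the `sample_descriptions` function are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedSentence:
    """One corpus sentence with hypothetical annotations (not from the paper)."""
    text: str
    frame: str  # assumed label for clause structure, e.g. "NP V NP"
    sense: str  # assumed label for verb sense, e.g. "run.operate"


def sample_descriptions(corpus, frame, sense, k, seed=0):
    """Return up to k sentences that satisfy both constraints.

    The syntactic constraint is an exact match on `frame`; the semantic
    constraint is an exact match on `sense`. Sampling is without replacement.
    """
    matches = [s for s in corpus if s.frame == frame and s.sense == sense]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(matches, min(k, len(matches)))


# Toy data for illustration only.
TOY_CORPUS = [
    AnnotatedSentence("She runs the company.", "NP V NP", "run.operate"),
    AnnotatedSentence("He ran the machine.", "NP V NP", "run.operate"),
    AnnotatedSentence("They ran a marathon.", "NP V NP", "run.race"),
    AnnotatedSentence("The dog ran.", "NP V", "run.move"),
]

if __name__ == "__main__":
    for s in sample_descriptions(TOY_CORPUS, "NP V NP", "run.operate", k=2):
        print(s.text)
```

Exact-match filtering is the simplest possible choice here. When no sentence satisfies both constraints, the function returns an empty list, which illustrates one practical limit of corpus sampling: coverage depends on what the annotated corpus happens to contain.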