Refining Implicit Argument Annotation for UCCA
This work, aimed at NLP researchers, addresses the ambiguity that implicit arguments introduce into machine interpretation of text. The contribution is incremental, as it builds on the existing UCCA framework.
The paper tackles the problem of coarse implicit argument annotation in natural language understanding by proposing a fine-grained typology with six categories, and demonstrates its application by creating a new dataset from part of the UCCA EWT corpus.
Predicate-argument structure analysis is a central component of meaning representations of text. Because some arguments are not explicitly mentioned in a sentence, language understanding becomes ambiguous, and it is difficult for machines to interpret text correctly. However, only a few resources represent implicit roles for NLU, and existing studies in NLP make only coarse distinctions between categories of arguments omitted from linguistic form. This paper proposes a typology for fine-grained implicit argument annotation on top of the foundational layer of Universal Conceptual Cognitive Annotation (UCCA). The proposed implicit argument categorisation is driven by theories of implicit role interpretation and consists of six types: Deictic, Generic, Genre-based, Type-identifiable, Non-specific, and Iterated-set. We exemplify our design by revisiting part of the UCCA EWT corpus, providing a new dataset annotated with the refinement layer, and conducting a comparative analysis with other schemes.
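
As an illustration only, the six-way typology could be held in code as a closed label set, which makes it easy to validate annotations and tabulate label frequencies. The sketch below is a minimal Python example under stated assumptions: the class names `ImplicitType` and `ImplicitArgument`, the fields `passage_id` and `unit_id`, and the helper `label_distribution` are hypothetical and are not part of UCCA or of the released dataset. Only the six category names come from the text above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ImplicitType(Enum):
    """The six refinement categories named in the abstract."""
    DEICTIC = "Deictic"
    GENERIC = "Generic"
    GENRE_BASED = "Genre-based"
    TYPE_IDENTIFIABLE = "Type-identifiable"
    NON_SPECIFIC = "Non-specific"
    ITERATED_SET = "Iterated-set"


@dataclass(frozen=True)
class ImplicitArgument:
    """Hypothetical record: one implicit unit plus its refinement label."""
    passage_id: str  # assumed identifier of the passage
    unit_id: str     # assumed identifier of the implicit unit
    label: ImplicitType


def parse_label(name: str) -> ImplicitType:
    """Map a category name to its enum member; raises ValueError if unknown."""
    return ImplicitType(name)


def label_distribution(annotations):
    """Count occurrences of each category; unused categories are reported as 0."""
    counts = Counter(a.label for a in annotations)
    return {t.value: counts.get(t, 0) for t in ImplicitType}


if __name__ == "__main__":
    # Toy records for demonstration only; not drawn from the corpus.
    toy = [
        ImplicitArgument("p1", "1.2", parse_label("Deictic")),
        ImplicitArgument("p1", "1.5", parse_label("Generic")),
        ImplicitArgument("p2", "1.1", parse_label("Deictic")),
    ]
    print(label_distribution(toy))
```

Using an enumeration rather than free-form strings means that a misspelled category name fails at parse time instead of silently creating a seventh label.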