CL, Oct 23, 2018

Object-oriented lexical encoding of multiword expressions: Short and sweet

arXiv:1810.09947v1, 4 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses efficient lexical encoding, a problem relevant to linguists and NLP researchers, but the contribution appears incremental because it builds on existing frameworks and resources.

The paper tackles non-redundant lexical encoding of multiword expressions (MWEs) with a proof of concept in the eXtensible MetaGrammar (XMG) framework: it makes the existing FrenchTAG grammar MWE-aware and evaluates the factorization gain on a dataset extracted from an MWE-annotated reference corpus.

Multiword expressions (MWEs) exhibit both regular and idiosyncratic properties. Their idiosyncrasy requires lexical encoding in parallel with their component words. Their (at times intricate) regularity, on the other hand, calls for means of flexible factorization to avoid redundant descriptions of shared properties. However, so far, non-redundant general-purpose lexical encoding of MWEs has not received a satisfactory solution. We offer a proof of concept that this challenge might be effectively addressed within eXtensible MetaGrammar (XMG), an object-oriented metagrammar framework. We first make an existing metagrammatical resource, the FrenchTAG grammar, MWE-aware. We then evaluate the factorization gain during incremental implementation with XMG on a dataset extracted from an MWE-annotated reference corpus.
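
The idea of factorizing shared MWE properties through inheritance can be sketched in a few lines. The following is a toy illustration in plain Python, not XMG syntax and not the paper's implementation: the class names, the property names, the English example idioms, and the simple property count used to compare a flat encoding with a factorized one are all assumptions made for illustration, and the count is not claimed to be the paper's factorization-gain measure.

```python
class VerbObjectMWE:
    """Properties shared by a hypothetical family of verb-object MWEs."""
    head_pos = "VERB"
    object_pos = "NOUN"
    passivizable = True


class SpillTheBeans(VerbObjectMWE):
    lemmas = ("spill", "bean")


class BreakTheIce(VerbObjectMWE):
    lemmas = ("break", "ice")


class KickTheBucket(VerbObjectMWE):
    lemmas = ("kick", "bucket")
    passivizable = False  # idiosyncratic: overrides the shared default


def own_properties(cls):
    """Properties stated directly on this class (dunder attributes skipped)."""
    return {k: v for k, v in vars(cls).items() if not k.startswith("__")}


def full_properties(cls):
    """Full description of an entry: inherited values, then local overrides."""
    merged = {}
    for klass in reversed(cls.__mro__):  # base classes first
        merged.update(own_properties(klass))
    return merged


def description_sizes(base, entries):
    """Count property statements in a flat and in a factorized encoding."""
    flat = sum(len(full_properties(e)) for e in entries)
    factorized = len(own_properties(base)) + sum(
        len(own_properties(e)) for e in entries
    )
    return flat, factorized


entries = [SpillTheBeans, BreakTheIce, KickTheBucket]
flat, factorized = description_sizes(VerbObjectMWE, entries)
print(f"flat: {flat} statements, factorized: {factorized} statements")
```

In this sketch each entry states only what sets it apart (its lemmas, and for one idiom the blocked passive), while the regular properties are written once on the shared base class.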
