CL · Oct 1, 2022

CGELBank: CGEL as a Framework for English Syntax Annotation

Stanford
arXiv:2210.00394v1 · 1 citation · h-index: 42
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for a more comprehensive syntactic framework for annotating English; its contribution is incremental, since it builds on existing treebanking efforts.

The paper adapts the Cambridge Grammar of the English Language (CGEL) formalism to corpus annotation through the CGELBank project. It reports quantitative and qualitative comparisons with parallel UD and PTB treebanks and argues that CGEL offers a good tradeoff between comprehensiveness of analysis and usability for annotation.

We introduce the syntactic formalism of the Cambridge Grammar of the English Language (CGEL) to the world of treebanking through the CGELBank project. We discuss some issues in linguistic analysis that arose in adapting the formalism to corpus annotation, followed by quantitative and qualitative comparisons with parallel UD and PTB treebanks. We argue that CGEL provides a good tradeoff between comprehensiveness of analysis and usability for annotation, which motivates expanding the treebank with automatic conversion in the future.

Code Implementations: 1 repo