CL · Mar 15, 2021

Generating CCG Categories

arXiv:2103.08139v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in CCG parsing by improving robustness for infrequent categories and out-of-domain texts, though it is incremental as it builds on existing supertagging methods.

The paper tackles the problem of CCG supertagging by generating categories as sequences of atomic tags instead of using multi-class classification, achieving state-of-the-art results with 95.5% tagging accuracy and 89.8% labeled F1 score on CCGBank.

Previous CCG supertaggers usually predict categories using multi-class classification. This approach is simple, but it usually ignores the internal structure of categories. The rich semantics inside these structures may help us better handle relations among categories and make existing supertaggers more robust. In this work, we propose to generate categories rather than classify them: each category is decomposed into a sequence of smaller atomic tags, and the tagger aims to generate the correct sequence. We show that with this finer-grained view of categories, annotations can be shared across different categories and interactions with sentence contexts can be enhanced. The proposed category generator achieves state-of-the-art tagging (95.5% accuracy) and parsing (89.8% labeled F1) performance on the standard CCGBank. Furthermore, its performance on infrequent (even unseen) categories, out-of-domain texts, and a low-resource language suggests that generation models are promising for CCG analysis in general.
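To make the decomposition idea concrete, here is a minimal sketch of splitting a CCG category string into a flat sequence of atomic tags. The tokenization scheme (atoms with optional features, slashes, and parentheses as separate tags) and the names `TOKEN_RE`, `decompose`, and `compose` are assumptions made for illustration; the abstract does not specify the paper's exact tag inventory or ordering, so this should not be read as the authors' method.

```python
import re
from typing import List

# Hypothetical tag inventory: an atom with an optional feature (e.g. S[dcl]),
# a slash, a parenthesis, or a punctuation category.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:\[[A-Za-z]+\])?|[/\\()]|[,.:;]")


def decompose(category: str) -> List[str]:
    """Split a category string into a left-to-right sequence of atomic tags."""
    tokens = TOKEN_RE.findall(category)
    # Reject inputs containing characters the pattern does not cover.
    if "".join(tokens) != category:
        raise ValueError(f"cannot tokenize category: {category!r}")
    return tokens


def compose(tokens: List[str]) -> str:
    """Rebuild the category string from its atomic tags."""
    return "".join(tokens)


if __name__ == "__main__":
    categories = [r"(S[dcl]\NP)/NP", r"(S[b]\NP)/NP", r"S[dcl]\NP", "NP/N"]
    vocabulary = set()
    for cat in categories:
        tags = decompose(cat)
        vocabulary.update(tags)
        print(cat, "->", tags)
    # Different categories reuse the same atomic tags.
    print(sorted(vocabulary))
```

Under this scheme, a classifier would treat `(S[dcl]\NP)/NP` and `(S[b]\NP)/NP` as unrelated labels, whereas a generator sees two sequences that differ in a single tag. That overlap is one way annotations could be shared across categories, including ones rarely or never seen in training.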

Code Implementations: 1 repo
