CL · LG · May 3, 2023

Approximating CKY with Transformers

arXiv:2305.02386v2131 citations
Originality: Incremental advance
AI Analysis

This work targets the cubic-time cost of CKY-based constituency parsing, a practical concern for NLP practitioners. The advance is incremental because it builds on existing transformer methods.

The paper approximates the CKY algorithm for constituency parsing with transformer models, avoiding CKY's cubic dependence on sentence length. On standard benchmarks it reports competitive or better performance than comparable CKY-based parsers while running faster, though performance declines as the grammar becomes more ambiguous.

We investigate the ability of transformer models to approximate the CKY algorithm, using them to directly predict a sentence's parse and thus avoid the CKY algorithm's cubic dependence on sentence length. We find that on standard constituency parsing benchmarks this approach achieves competitive or better performance than comparable parsers that make use of CKY, while being faster. We also evaluate the viability of this approach for parsing under random PCFGs. Here we find that performance declines as the grammar becomes more ambiguous, suggesting that the transformer is not fully capturing the CKY computation. However, we also find that incorporating additional inductive bias is helpful, and we propose a novel approach that makes use of gradients with respect to chart representations in predicting the parse, in analogy with the CKY algorithm being a subgradient of a partition function variant with respect to the chart.
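
For readers unfamiliar with where the cubic dependence comes from, the following is a minimal sketch of textbook Viterbi CKY over a toy PCFG in Chomsky normal form. It is not the authors' code: the grammar, the `BINARY` and `LEXICAL` tables, and the function name `cky_best_logprob` are all hypothetical and chosen only for illustration. The three nested loops over span width, start position, and split point are what the transformer approach described above aims to avoid.

```python
import math
from collections import defaultdict

# Toy PCFG in Chomsky normal form (hypothetical, for illustration only).
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}
LEXICAL = {  # (tag, word) -> rule probability
    ("NP", "she"): 0.5,
    ("NP", "fish"): 0.5,
    ("V", "eats"): 1.0,
}


def cky_best_logprob(words, root="S"):
    """Return the log-probability of the best parse rooted at `root`,
    or -inf if the toy grammar cannot derive the sentence."""
    n = len(words)
    # chart[(i, j)][label] = best log-probability of `label` over words[i:j]
    chart = defaultdict(dict)

    # Width-1 spans come straight from the lexical rules.
    for i, word in enumerate(words):
        for (tag, w), p in LEXICAL.items():
            if w == word:
                chart[(i, i + 1)][tag] = math.log(p)

    # Three nested loops over positions give the O(n^3) dependence.
    for width in range(2, n + 1):          # span widths
        for i in range(n - width + 1):     # span start positions
            j = i + width
            cell = chart[(i, j)]
            for k in range(i + 1, j):      # split points
                left, right = chart[(i, k)], chart[(k, j)]
                for (parent, b, c), p in BINARY.items():
                    if b in left and c in right:
                        score = math.log(p) + left[b] + right[c]
                        if score > cell.get(parent, -math.inf):
                            cell[parent] = score

    return chart[(0, n)].get(root, -math.inf)


best = cky_best_logprob(["she", "eats", "fish"])
```

Under this toy grammar the sentence "she eats fish" has a single parse, so `best` is simply the log of the product of the rule probabilities used. A full parser would also store backpointers in each cell to recover the tree, and the inner loop over rules adds a grammar-size factor on top of the cubic term.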

Code Implementations: 1 repo