CL · Mar 16

Learning Constituent Headedness

arXiv:2603.14755 · 76.8 · h-index: 15
Predicted impact: top 79% in CL · last 90 days
Originality: Incremental advance
AI Analysis

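This work addresses the need for an explicit representation of headedness in syntactic analysis. Predicted heads give comparable parsing accuracy while improving the fidelity of constituency-to-dependency conversion and transferring across resources and languages. The advance is incremental, since it builds on existing annotation frameworks.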

The paper frames constituent headedness as a supervised prediction task over aligned constituency and dependency annotations. On English and Chinese data, the resulting models reach near-ceiling intrinsic accuracy and outperform Collins-style rule-based percolation.

Headedness is widely used as an organizing device in syntactic analysis, yet constituency treebanks rarely encode it explicitly and most processing pipelines recover it procedurally via percolation rules. We treat this notion of constituent headedness as an explicit representational layer and learn it as a supervised prediction task over aligned constituency and dependency annotations, inducing supervision by defining each constituent head as the dependency span head. On aligned English and Chinese data, the resulting models achieve near-ceiling intrinsic accuracy and substantially outperform Collins-style rule-based percolation. Predicted heads yield comparable parsing accuracy under head-driven binarization, consistent with the induced binary training targets being largely equivalent across head choices, while increasing the fidelity of deterministic constituency-to-dependency conversion and transferring across resources and languages under simple label-mapping interfaces.
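The supervision signal described in the abstract, taking each constituent's head to be the head of its dependency span, can be illustrated with a minimal sketch. It assumes a span head is the unique token in the span whose dependency head lies outside the span (or is the root); the function name `span_head`, the 0-based indexing, the `-1` root marker, and the toy sentence are all illustrative assumptions, not the paper's implementation.

```python
from typing import List, Optional


def span_head(heads: List[int], start: int, end: int) -> Optional[int]:
    """Return the head token of the span [start, end), or None.

    heads[i] is the 0-based index of token i's dependency head,
    with -1 marking the root (an assumed encoding).
    A token counts as "external" if its head falls outside the span.
    The span has a well-defined head only if exactly one token is external.
    """
    external = [i for i in range(start, end) if not (start <= heads[i] < end)]
    return external[0] if len(external) == 1 else None


# Toy sentence (hypothetical annotation): "the cat sat on the mat"
#            the cat sat on the mat
toy_heads = [1,  2,  -1, 5, 5,  2]

np_head = span_head(toy_heads, 0, 2)    # "the cat"
pp_head = span_head(toy_heads, 3, 6)    # "on the mat"
s_head = span_head(toy_heads, 0, 6)     # whole sentence
bad_span = span_head(toy_heads, 1, 4)   # "cat sat on" is not a single subtree
```

Under these assumptions, spans that do not correspond to a single dependency subtree return `None`, which is one simple way to flag constituents where the two annotation layers are misaligned.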
