CL · May 13, 2025

Reassessing Graph Linearization for Sequence-to-sequence AMR Parsing: On the Advantages and Limitations of Triple-Based Encoding

arXiv:2505.08504v1
h-index: 13
The Sixth Workshop on Insights from Negative Results in NLP
Originality: Synthesis-oriented
AI Analysis

This work examines graph linearization for AMR parsing, but its contribution is incremental: the proposed triple-based encoding is compared against the standard Penman encoding without achieving superior results.

The paper addresses the problem of linearizing AMR graphs for sequence-to-sequence parsing. It proposes a triple-based encoding to address two limitations of the standard Penman method: closely related nodes can end up far apart in the linearized text, and inverse roles double the number of relation types to predict. The results show, however, that triple encoding still underperforms Penman's concise representation.

Sequence-to-sequence models are widely used to train Abstract Meaning Representation (AMR; Banarescu et al., 2013) parsers. To train such models, AMR graphs have to be linearized into a one-line text format. While Penman encoding is typically used for this purpose, we argue that it has limitations: (1) for deep graphs, some closely related nodes are located far apart in the linearized text; and (2) Penman's tree-based encoding necessitates inverse roles to handle node re-entrancy, doubling the number of relation types to predict. To address these issues, we propose a triple-based linearization method and compare its efficiency with Penman linearization. Although triples are well suited to represent a graph, our results suggest room for improvement in triple encoding to better compete with Penman's concise and explicit representation of a nested graph structure.
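To make the contrast in the abstract concrete, the following is a minimal sketch of the two linearization styles on a toy graph for "The boy wants to go". It is illustrative only: the graph, the function names (`forward_reachable`, `to_penman`, `to_triples`), and in particular the `source role target` triple format separated by semicolons are assumptions made for this example, not the encoding used in the paper.

```python
# Illustrative sketch only. The triple text format below is an assumption,
# not the format proposed in the paper.

# Toy AMR graph for "The boy wants to go": variable -> concept
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-02"}
# Directed edges (source, role, target); "b" is re-entrant (two parents).
EDGES = [("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")]


def forward_reachable(root, edges):
    """Nodes reachable from root by following edges in their own direction."""
    seen, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for src, _, tgt in edges:
            if src == node and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen


def to_penman(root, instances, edges):
    """Depth-first, tree-shaped linearization in Penman style."""
    reachable = forward_reachable(root, edges)
    visited, used = set(), set()

    def visit(var):
        if var in visited:
            return var  # re-entrancy: repeat the variable only
        visited.add(var)
        parts = [f"({var} / {instances[var]}"]
        for i, (src, role, tgt) in enumerate(edges):
            if i in used:
                continue
            if src == var:
                used.add(i)
                parts.append(f"{role} {visit(tgt)}")
            elif tgt == var and src not in reachable:
                # The source cannot be reached going forward from the root,
                # so the tree has to attach it through an inverse role.
                used.add(i)
                parts.append(f"{role}-of {visit(src)}")
        return " ".join(parts) + ")"

    return visit(root)


def to_triples(instances, edges):
    """Flat linearization: one 'source role target' item per triple (assumed format)."""
    triples = [(v, ":instance", c) for v, c in instances.items()] + list(edges)
    return "; ".join(f"{s} {r} {t}" for s, r, t in triples)


print(to_penman("w", INSTANCES, EDGES))
print(to_penman("b", INSTANCES, EDGES))
print(to_triples(INSTANCES, EDGES))
```

Rooting the graph at `b` forces the role `:ARG0-of`, which is the inverse-role effect described in limitation (2). In the flat form every edge keeps its original direction, at the cost of repeating variables and losing the nesting that Penman makes explicit.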

Code Implementations: 1 repo
