CL, CO · Nov 13, 2022

Quantifying syntax similarity with a polynomial representation of dependency trees

arXiv:2211.07005v18 citations · h-index: 78
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for precise syntax comparison in computational linguistics, particularly in multilingual analysis. It appears incremental, since it builds on existing dependency grammar frameworks.

The authors quantify syntax similarity by introducing a graph polynomial that represents dependency trees, together with a similarity measure based on that polynomial. They apply both to sentences in the Parallel Universal Dependencies treebanks in order to compare translations and to study syntactic typology.

We introduce a graph polynomial that distinguishes tree structures to represent dependency grammar and a measure based on the polynomial representation to quantify syntax similarity. The polynomial encodes accurate and comprehensive information about the dependency structure and dependency relations of words in a sentence. We apply the polynomial-based methods to analyze sentences in the Parallel Universal Dependencies treebanks. Specifically, we compare the syntax of sentences and their translations in different languages, and we perform a syntactic typology study of available languages in the Parallel Universal Dependencies treebanks. We also demonstrate and discuss the potential of the methods in measuring syntax diversity of corpora.
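To make the idea of a polynomial representation concrete, here is a minimal sketch. The abstract does not state the polynomial's definition, so the recursion below is an assumption in the spirit of tree-distinguishing polynomials: a leaf attached by relation r maps to the variable x_r, and an internal node attached by relation r maps to y_r plus the product of its children's polynomials. The function names, the tuple-based tree encoding, and the L1 coefficient distance are all hypothetical stand-ins for illustration and should not be read as the paper's actual definitions or measure.

```python
from collections import Counter

# A polynomial is a Counter mapping a monomial to its integer coefficient.
# A monomial is a sorted tuple of (variable_name, exponent) pairs.

def _mul_monomials(a, b):
    """Multiply two monomials by adding exponents of shared variables."""
    exps = Counter(dict(a))
    for name, e in b:
        exps[name] += e
    return tuple(sorted(exps.items()))

def poly_mul(p, q):
    """Multiply two polynomials term by term."""
    out = Counter()
    for ma, ca in p.items():
        for mb, cb in q.items():
            out[_mul_monomials(ma, mb)] += ca * cb
    return out

def var(name):
    """Polynomial consisting of a single variable."""
    return Counter({((name, 1),): 1})

def tree_polynomial(node):
    """node = (relation_label, [children]); ASSUMED recursion, see lead-in."""
    label, children = node
    if not children:
        return var("x_" + label)
    prod = Counter({(): 1})  # the constant polynomial 1
    for child in children:
        prod = poly_mul(prod, tree_polynomial(child))
    return var("y_" + label) + prod

def coefficient_distance(p, q):
    """L1 distance between coefficients; a stand-in, NOT the paper's measure."""
    return sum(abs(p[m] - q[m]) for m in set(p) | set(q))

# "She reads books": root verb with a subject and an object.
t1 = ("root", [("nsubj", []), ("obj", [])])
# Same tree with the children listed in the other order.
t1_reordered = ("root", [("obj", []), ("nsubj", [])])
# "She reads old books": the object now has an adjectival modifier.
t2 = ("root", [("nsubj", []), ("obj", [("amod", [])])])

p1 = tree_polynomial(t1)            # y_root + x_nsubj*x_obj
p1_reordered = tree_polynomial(t1_reordered)
p2 = tree_polynomial(t2)            # y_root + x_nsubj*y_obj + x_amod*x_nsubj
```

Because multiplication is commutative, the order in which children are listed does not affect the result, so this sketch captures tree shape and relation labels but not word order.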

Code Implementations: 1 repo