CL · Apr 17, 2021

Syntactic structures and the general Markov models

arXiv:2104.08462v3 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work validates evolutionary models against syntactic data. The contribution is incremental, since it applies existing methods to a specific domain in linguistics.

The paper tackles the problem of assessing phylogenetic signal in syntactic structures data by testing its consistency with general Markov models and an infinite sites evolutionary model, comparing derived phylogenetic trees against established linguistic trees.

We study phylogenetic signal present in syntactic information by considering the syntactic structures data from Longobardi (2017b), Collins (2010), Ceolin et al. (2020) and Koopman (2011). Focusing first on the general Markov models, we explore how well the syntactic structures data conform to the hypothesis required by these models. We do this by comparing derived phylogenetic trees against trees agreed on by the linguistics community. We then interpret the methods of Ceolin et al. (2020) as an infinite sites evolutionary model and compare the consistency of the data with this alternative. The ideas and methods discussed in the present paper apply more generally than the specific setting of syntactic structures, and can be used in other contexts when analyzing the consistency of data against hypothesized evolutionary models.
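
To make the tree-comparison step concrete, here is a minimal sketch of one standard way to score a derived tree against a reference tree: the rooted Robinson-Foulds distance, which counts the clades present in exactly one of the two trees. The abstract does not say which comparison measure the paper uses, so the choice of metric is an assumption, as are the nested-tuple tree encoding, the function names, and the toy four-language trees.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple of leaf-name strings, e.g. (("a", "b"), "c").
    This encoding is an illustrative assumption, not the paper's data format.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades found in exactly one of the trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical toy trees, for illustration only.
reference = (("Italian", "Spanish"), ("English", "German"))
derived = ((("Italian", "Spanish"), "English"), "German")

print(rf_distance(reference, derived))
```

A distance of zero means the two trees contain the same clades. Larger values mean the derived tree departs further from the reference grouping.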

Code Implementations: 1 repo
