Initial Comparison of Linguistic Networks Measures for Parallel Texts
This work provides incremental insights into linguistic structures for computational linguistics and network science researchers.
The paper analyzed syllable networks from Croatian texts, finding that they have a higher clustering coefficient than random networks and exhibit small-world properties, with similar patterns observed in Portuguese and Chinese syllable networks.
This paper presents preliminary results of an analysis of Croatian syllable networks. A syllable network is a network in which nodes are syllables and links between them are constructed according to their connections within words. We analyze syllable networks generated from texts collected from the Croatian Wikipedia and from blogs. As our main tool we use complex network analysis methods, which provide mechanisms that can reveal new patterns in language structure. We aim to show that syllable networks have a much higher clustering coefficient than Erdős-Rényi random networks. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. Furthermore, we compare Croatian syllable networks with Portuguese and Chinese syllable networks and show that they have similar properties.
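The construction described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: words arrive already split into syllables, two syllables are linked when they are adjacent within a word, and links are undirected and unweighted. The function names and the three-word toy input are hypothetical. The random baseline uses the standard result that the expected clustering coefficient of an Erdős-Rényi graph with the same number of nodes N and links E is approximately the link probability p = 2E / (N(N-1)).

```python
from collections import defaultdict
from itertools import combinations


def build_syllable_network(syllabified_words):
    """Build an undirected adjacency map from pre-syllabified words.

    Assumption: syllables that are adjacent within a word are linked.
    """
    adj = defaultdict(set)
    for syllables in syllabified_words:
        for s in syllables:
            adj[s]  # register the node, even for one-syllable words
        for a, b in zip(syllables, syllables[1:]):
            if a != b:  # ignore self-loops such as "ma-ma"
                adj[a].add(b)
                adj[b].add(a)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count links among the neighbours of this node.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def er_expected_clustering(adj):
    """Expected clustering of an Erdős-Rényi graph with the same N and E."""
    n = len(adj)
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    return 2.0 * m / (n * (n - 1)) if n > 1 else 0.0


# Hand-syllabified toy input (illustrative only, not data from the paper).
toy_words = [["ma", "li", "na"], ["ma", "na"], ["na", "da"]]
network = build_syllable_network(toy_words)
print(average_clustering(network), er_expected_clustering(network))
```

A three-word input is far too small to show the effect reported here; the sketch only demonstrates how the two quantities are computed. On a corpus-sized network the comparison of interest is whether the measured average clustering is much larger than the Erdős-Rényi estimate.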