A preliminary study of Croatian Language Syllable Networks
This study provides incremental insights into language structure for computational linguistics researchers.
The authors analyzed Croatian syllable networks from Wikipedia and blog texts, finding they have much higher clustering coefficients than random networks and exhibit small-world properties, with similar characteristics to Portuguese and Chinese syllable networks.
This paper presents preliminary results of an analysis of Croatian syllable networks. A syllable network is a network in which nodes are syllables and links between them are constructed according to their connections within words. We analyze networks of syllables generated from texts collected from the Croatian Wikipedia and from blogs. Our main tool is complex network analysis, whose methods can reveal new patterns in language structure. We aim to show that syllable networks have a much higher clustering coefficient than Erdős-Rényi random networks. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. Furthermore, we compared Croatian syllable networks with Portuguese and Chinese syllable networks and showed that they have similar properties.
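The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: words arrive already syllabified with hyphens (syllabification itself is language-specific and not shown), links are undirected and join syllables that are adjacent within a word, and the random baseline is a G(n, m) variant of the Erdős-Rényi model with the same number of nodes and links. All function names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_syllable_network(syllabified_words):
    """Build an undirected graph: nodes are syllables, links join
    syllables that are adjacent within a word (assumed input: 'ma-ma-ta')."""
    adj = defaultdict(set)
    for word in syllabified_words:
        syllables = word.split("-")
        for s in syllables:
            adj[s]  # register the node even if it never gets a link
        for a, b in zip(syllables, syllables[1:]):
            if a != b:  # ignore self-loops such as 'ma-ma'
                adj[a].add(b)
                adj[b].add(a)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count links that exist among this node's neighbours.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def random_baseline(adj, seed=0):
    """Random graph with the same node and link counts (G(n, m) variant)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    rand = {n: set() for n in nodes}
    for u, v in rng.sample(list(combinations(nodes, 2)), m):
        rand[u].add(v)
        rand[v].add(u)
    return rand
```

On a real corpus, one would call `average_clustering` on the syllable network and on several random baselines built with different seeds, then compare the values. Enumerating all node pairs in `random_baseline` is quadratic in the number of syllables, so a larger network would need a sampling strategy that avoids materializing the full pair list.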