Evolution of the Modern Phase of Written Bangla: A Statistical Study
This work addresses the need for statistical analysis of language evolution in Bangla, providing foundational insights for linguists and computational linguists, though it is incremental in applying established methods to a new language context.
The paper quantifies the evolution of written Bangla by comparing character-level, syllable-level, morpheme-level, and word-level features across classical, newspaper, and blog corpora. It finds a statistically significant change in word length measured in characters, but little difference in the other features.
Active languages such as Bangla (or Bengali) evolve over time due to a variety of social, cultural, economic, and political factors. In this paper, we quantitatively analyze the change in the written form of the modern phase of Bangla in terms of character-level, syllable-level, morpheme-level, and word-level features. We collect three different types of corpora (classical, newspaper, and blog) and test whether the differences in their features are statistically significant. Results suggest that there are significant changes in the length of a word when measured in characters, but not much difference in the usage of different characters, syllables, and morphemes in a word, or of different words in a sentence. To the best of our knowledge, this is the first work of this kind on Bangla.
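The abstract does not say which statistical test is used, so the following is only a minimal sketch of how a difference in word length between two corpora could be tested. It swaps in a two-sided permutation test on the difference of mean word lengths, which is not necessarily the authors' method. The function names `word_lengths` and `permutation_test` are hypothetical. Whitespace tokenization and counting length in Unicode code points are simplifying assumptions, and the sample values are toy placeholders rather than data from the paper.

```python
import random
from statistics import mean


def word_lengths(text):
    """Length of each whitespace-separated token, in Unicode code points.

    Assumption: code points stand in for "characters"; a real study of Bangla
    might count grapheme clusters or aksharas instead.
    """
    return [len(token) for token in text.split()]


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns (observed difference mean(a) - mean(b), estimated p-value).
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled lengths to the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Toy word-length samples (placeholders, NOT measurements from the paper).
    classical_lengths = [5, 6, 7, 6, 5, 8, 7, 6, 7, 6]
    blog_lengths = [4, 5, 4, 3, 5, 4, 6, 4, 5, 4]
    diff, p_value = permutation_test(classical_lengths, blog_lengths)
    print(f"mean difference = {diff:.2f}, p = {p_value:.4f}")
```

A permutation test is chosen here only because it makes no distributional assumption about word lengths and needs nothing beyond the standard library. With real corpora, each list would hold the lengths returned by `word_lengths` for one corpus.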