CL · May 11, 2023
Unicode Normalization and Grapheme Parsing of Indic Languages
Nazmuddoha Ansary, Quazi Adibur Rahman Adib, Tahsin Reasat et al.
Writing systems of Indic languages have orthographic syllables, also known as complex graphemes, as unique horizontal units. These complex graphemes comprise consonants or consonant conjuncts, vowel diacritics, and consonant diacritics, which together make each language unique. Unicode-based writing schemes for these languages often disregard this feature and encode words as linear sequences of Unicode characters, relying on an intricate scheme of connector characters and font interpreters. Because a few dozen Unicode characters are used to write thousands of unique glyphs (complex graphemes), serious ambiguities arise that lead to malformed words. In this paper, we propose two libraries: i) a normalizer that resolves inconsistencies caused by the Unicode-based encoding scheme for Indic languages and ii) a grapheme parser for Abugida text. The parser deconstructs words into visually distinct orthographic syllables, or complex graphemes, and their constituents. Our proposed normalizer is more efficient and effective than the previously used IndicNLP normalizer. Moreover, our parser and normalizer are also suitable for general Abugida text processing, as they performed well in our robust word-based and NLP experiments. We report the pipeline for the scripts of 7 languages in this work and develop a framework for integrating more scripts.
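To make the encoding ambiguity concrete, here is a minimal Python sketch that uses only the standard `unicodedata` module. It is not the authors' normalizer or parser, and the helper `naive_clusters` is a hypothetical name introduced for illustration. It assumes Bengali text, treats the hasanta (U+09CD) as the only connector character, and groups code points into approximate orthographic syllables.

```python
import unicodedata

VIRAMA = "\u09CD"  # Bengali hasanta, the connector used to form conjuncts

# Two encodings of the same visible letter (Bengali YYA):
precomposed = "\u09DF"        # single code point
decomposed = "\u09AF\u09BC"   # YA followed by the nukta sign

# Standard NFC maps both spellings to one canonical sequence.
same_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)


def naive_clusters(word):
    """Group code points into rough orthographic syllables.

    Simplifying assumption: a character joins the previous cluster if it is
    a combining mark (category Mn or Mc) or if the cluster so far ends with
    the hasanta. Real scripts need more rules than this.
    """
    clusters = []
    for ch in word:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        joins_conjunct = bool(clusters) and clusters[-1].endswith(VIRAMA)
        if clusters and (is_mark or joins_conjunct):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# BA + vowel sign I + SHA + hasanta + BA
word = "\u09AC\u09BF\u09B6\u09CD\u09AC"

print(precomposed == decomposed)  # False: raw code points differ
print(same_after_nfc)             # True: identical after NFC
print(naive_clusters(word))       # two clusters from five code points
```

Standard NFC only resolves canonical equivalences; the malformed sequences described in the abstract (for example, misplaced or repeated diacritics) would need script-specific rules beyond this sketch.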
LG · Nov 22, 2021
Prediction Model for Mortality Analysis of Pregnant Women Affected With COVID-19
Quazi Adibur Rahman Adib, Sidratul Tanzila Tasmi, Md. Shahriar Islam Bhuiyan et al.
The COVID-19 pandemic is an ongoing global pandemic that has caused unprecedented disruptions in the public health sector and the global economy. The virus, SARS-CoV-2, is responsible for the rapid transmission of coronavirus disease. Due to its contagious nature, the virus can easily infect unprotected and exposed individuals, causing mild to severe symptoms. How the virus affects pregnant mothers and neonates is now an issue of global concern among civilians and public health workers. This paper aims to develop a predictive model to estimate the possibility of death for a COVID-diagnosed mother based on documented symptoms: dyspnea, cough, rhinorrhea, arthralgia, and the diagnosis of pneumonia. The machine learning models used in our study are support vector machine, decision tree, random forest, gradient boosting, and artificial neural network (ANN). The models have provided impressive results and can accurately predict the mortality of pregnant mothers for a given input. The precision for 3 models (ANN, Gradient Boosting, Random Forest) is 100%. The highest accuracy score (Gradient Boosting, ANN) is 95%, the highest recall (Support Vector Machine) is 92.75%, and the highest F1 score (Gradient Boosting, ANN) is 94.66%. Given the accuracy of the model, pregnant mothers can expect immediate medical treatment based on their possibility of death due to the virus. The model can be utilized by health workers globally to prioritize emergency patients, which can ultimately reduce the death rate of COVID-19-diagnosed pregnant mothers.
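For readers comparing the reported scores, the sketch below shows how accuracy, precision, recall, and F1 relate to one another for a binary classifier. The confusion-matrix counts are hypothetical values chosen for illustration and are not taken from the paper; the function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts (not the paper's data): no false positives,
# but two positive cases are missed.
metrics = binary_metrics(tp=8, fp=0, fn=2, tn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, precision is perfect because there are no false positives, while recall is lower because some positive cases are missed. This is why a model can report 100% precision without reaching the highest recall.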