Automatic Spell Checker and Correction for Under-represented Spoken Languages: Case Study on Wolof
This work addresses the revitalization and preservation of Wolof, an Indigenous language in Africa, by providing computational tools, though it is incremental as it applies existing methods to a new language.
The paper tackled the problem of spell checking and correction for Wolof, an under-represented spoken language, by developing a tool that achieved 98.31% predictive accuracy and 93.33% suggestion accuracy using novel linguistic resources.
This paper presents a spell checker and correction tool specifically designed for Wolof, an under-represented spoken language in Africa. The proposed spell checker combines a trie data structure, dynamic programming, and the weighted Levenshtein distance to generate suggestions for misspelled words. We created novel linguistic resources for Wolof, such as a lexicon and a corpus of misspelled words, using a semi-automatic approach that combines manual and automatic annotation methods. Despite the limited data available for Wolof, the spell checker achieved a predictive accuracy of 98.31% and a suggestion accuracy of 93.33%. Our primary focus remains the revitalization and preservation of Wolof as an Indigenous and spoken language in Africa, which motivates our efforts to develop novel linguistic resources. This work represents a valuable contribution to the growth of computational tools and resources for the Wolof language and provides a strong foundation for future studies in automatic spell checking and correction.
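
To make the combination of a trie, dynamic programming, and a weighted Levenshtein distance concrete, the following is a minimal Python sketch and not the authors' implementation. It stores a lexicon in a trie and, while walking the trie, computes one dynamic-programming row of the edit-distance table per node, pruning any branch whose best cost already exceeds a threshold. The names `build_trie`, `suggest`, `cheap_pairs`, and `max_cost`, the cost values (0.5 for a "cheap" substitution, 1.0 otherwise), and the small word list are all assumptions chosen for illustration; the abstract does not specify the weights or the lexicon format.

```python
import unicodedata


class TrieNode:
    def __init__(self):
        self.children = {}  # maps a character to the child TrieNode
        self.word = None    # set on the node that ends a lexicon entry


def build_trie(lexicon):
    """Insert every (NFC-normalized) lexicon word into a trie."""
    root = TrieNode()
    for word in lexicon:
        word = unicodedata.normalize("NFC", word)
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def substitution_cost(a, b, cheap_pairs):
    """Hypothetical weighting: confusable pairs cost less than others."""
    if a == b:
        return 0.0
    return 0.5 if frozenset((a, b)) in cheap_pairs else 1.0


def suggest(root, query, max_cost=1.0, cheap_pairs=frozenset()):
    """Return (cost, word) pairs within max_cost of query, cheapest first."""
    query = unicodedata.normalize("NFC", query)
    results = []
    # Row for the empty lexicon prefix: cost of skipping j query characters.
    first_row = [float(j) for j in range(len(query) + 1)]

    def walk(node, ch, prev_row):
        # One DP row per trie node; shared prefixes are computed only once.
        row = [prev_row[0] + 1.0]
        for j in range(1, len(query) + 1):
            row.append(min(
                row[j - 1] + 1.0,   # skip a query character
                prev_row[j] + 1.0,  # skip a lexicon character
                prev_row[j - 1] + substitution_cost(ch, query[j - 1], cheap_pairs),
            ))
        if node.word is not None and row[-1] <= max_cost:
            results.append((row[-1], node.word))
        # Costs never decrease further down, so prune hopeless branches.
        if min(row) <= max_cost:
            for next_ch, child in node.children.items():
                walk(child, next_ch, row)

    for ch, child in root.children.items():
        walk(child, ch, first_row)
    return sorted(results)


# Illustrative word list and confusable pair (assumptions, not the paper's data).
lexicon = ["jàmm", "jàng", "xale", "kër", "ndox"]
root = build_trie(lexicon)
diacritic_pairs = frozenset({frozenset(("a", "à"))})
print(suggest(root, "jamm", cheap_pairs=diacritic_pairs))
```

In this sketch a missing diacritic is treated as a cheaper error than an arbitrary substitution, which is one plausible reason to weight the distance; the actual weighting scheme would need to come from the error patterns observed in a corpus of misspelled words.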