CL, May 13, 2014

Phonetic based SoundEx & ShapeEx algorithm for Sindhi Spell Checker System

arXiv:1405.3033v1, 19 citations
Originality: Incremental advance
AI Analysis

This work addresses spell checking for Sindhi language users. The advance is incremental: it applies existing phonetic and shape-matching methods to a new language.

The paper develops a spell checker for the Sindhi language, which the authors state had not been done before. It proposes a combinational phonetic algorithm that blends the SoundEx and ShapeEx algorithms to generate suggestion lists for misspelled words, with the aim of improving accuracy and efficiency.

This paper presents a novel combinational phonetic algorithm for the Sindhi language, to be used in developing a Sindhi spell checker, which had not been developed prior to this work. The compound textual forms and glyphs of the Sindhi language present a substantial challenge for developing a Sindhi spell checker and for generating a list of similar suggestions for misspelled words. To implement such a system, phonetic-based Sindhi language rules and patterns must be taken into account to increase accuracy and efficiency. The proposed system blends the phonetic-based SoundEx algorithm with the ShapeEx algorithm for pattern or glyph matching, generating an accurate and efficient suggestion list for incorrect or misspelled Sindhi words. A table of phonetically similar-sounding Sindhi characters is generated for the SoundEx algorithm, along with another table of similar glyph- or shape-based character groups for the ShapeEx algorithm. Both are the first attempt at this type of categorization and representation for the Sindhi language.
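The idea of blending a sound-based key with a shape-based key can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the character groups below are small hypothetical examples chosen for this sketch and are not the tables published in the paper, the names `build_table`, `encode`, and `suggest` are invented here, and the scoring rule (one point per matching key) is an assumption about how the two algorithms might be combined. Unlike classic Soundex, the sketch does not drop vowels or truncate the key.

```python
# Hypothetical phonetic groups (NOT the paper's table): letters assumed to sound alike.
PHONETIC_GROUPS = [
    "\u0633\u0635\u062b",        # seen, saad, se
    "\u0632\u0630\u0636\u0638",  # ze, zaal, zaad, zoe
    "\u062a\u0637",              # te, toe
]

# Hypothetical shape groups (NOT the paper's table): same base glyph, different dots.
SHAPE_GROUPS = [
    "\u0628\u067b\u0680\u067e\u062a\u067f\u067d\u067a\u062b",  # be-like forms
    "\u062c\u0684\u0683\u0686\u0687\u062d\u062e",              # jeem-like forms
    "\u062f\u068c\u068f\u068a\u068d\u0630",                    # daal-like forms
    "\u0631\u0699\u0632",                                      # re-like forms
]


def build_table(groups):
    """Map every character to the integer index of its group."""
    return {ch: idx for idx, group in enumerate(groups) for ch in group}


SOUND_TABLE = build_table(PHONETIC_GROUPS)
SHAPE_TABLE = build_table(SHAPE_GROUPS)


def encode(word, table):
    """Replace grouped characters by their group index; keep other characters as-is."""
    return tuple(table.get(ch, ch) for ch in word)


def suggest(word, dictionary):
    """Return dictionary words whose sound key or shape key equals that of `word`.

    Words matching on both keys are listed before words matching on one key.
    """
    sound_key = encode(word, SOUND_TABLE)
    shape_key = encode(word, SHAPE_TABLE)
    scored = []
    for candidate in dictionary:
        score = int(encode(candidate, SOUND_TABLE) == sound_key)
        score += int(encode(candidate, SHAPE_TABLE) == shape_key)
        if score:
            scored.append((score, candidate))
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps dictionary order on ties
    return [candidate for _, candidate in scored]


# Tiny illustrative word list: saaf, saabit, taar.
DICTIONARY = [
    "\u0635\u0627\u0641",
    "\u062b\u0627\u0628\u062a",
    "\u062a\u0627\u0631",
]

sound_hit = suggest("\u0633\u0627\u0641", DICTIONARY)  # seen written in place of saad
shape_hit = suggest("\u0628\u0627\u0631", DICTIONARY)  # be written in place of te
```

In this sketch the first query is recovered through the sound key alone and the second through the shape key alone, which is the motivation for consulting both tables rather than either one.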

