Semi-automatic methods for adding words to the dictionary of VepKar corpus based on inflectional rules extracted from Wiktionary
This work addresses the challenge of resource scarcity for low-resource languages like Veps and Karelian by incrementally improving dictionary coverage through automated rule extraction.
The authors expand the dictionary of the VepKar corpus with semi-automatic methods that use inflectional rules from the English Wiktionary to generate word forms for Veps verbs and nominals. The resulting technique uses Wiktionary templates to construct inflection tables.
The article describes a technique for using English Wiktionary inflection tables to generate word forms for Veps verbs and nominals in the Open corpus of Veps and Karelian languages. Information is given on the Karelian and Veps Wiktionary entries that have inflection tables. The operating principle of Wiktionary static and dynamic templates is explained using the dictionary entry jogi (river) as an example. A method is presented for constructing the inflection table in the dictionary of the VepKar corpus from the data of the dynamic template of the English Wiktionary.
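The idea of a dynamic template, where a few stem parameters are expanded into a full inflection table, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function `expand_template`, its three parameters, and the small set of endings are hypothetical and do not reproduce the actual English Wiktionary template or the VepKar implementation. Only a handful of cells are generated, using the entry jogi as input.

```python
from typing import Dict

# Hypothetical subset of singular case endings attached to the genitive stem.
# A real template would cover many more cases and handle stem alternations.
SINGULAR_ENDINGS: Dict[str, str] = {
    "genitive": "n",
    "inessive": "s",
}


def expand_template(base: str, nom_vowel: str, gen_vowel: str) -> Dict[str, str]:
    """Expand stem parameters into a small inflection table.

    base      -- invariable part of the word (assumed parameter)
    nom_vowel -- final vowel of the nominative singular (assumed parameter)
    gen_vowel -- final vowel of the genitive stem (assumed parameter)
    """
    gen_stem = base + gen_vowel
    table = {"nominative singular": base + nom_vowel}
    for case, ending in SINGULAR_ENDINGS.items():
        table[f"{case} singular"] = gen_stem + ending
    # In this sketch the nominative plural is the genitive stem plus -d.
    table["nominative plural"] = gen_stem + "d"
    return table


if __name__ == "__main__":
    for cell, form in expand_template("jog", "i", "e").items():
        print(f"{cell}: {form}")
```

A static template, by contrast, would store each cell of the table explicitly instead of computing it from the stems; the sketch shows only the dynamic case.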