Dictionary-Based Concept Mining: An Application for Turkish
This work addresses concept extraction for Turkish. The contribution is incremental: it adapts existing methods to a less-studied language.
The study tackled concept mining for Turkish, an agglutinative language, by using a dictionary-based method instead of WordNet, achieving a high success rate in extracting concepts from documents collected from various corpora.
In this study, a dictionary-based method is used to extract expressive concepts from documents. Many studies have addressed concept mining in English, but for Turkish, an agglutinative language, this area of study is still immature. We used a dictionary instead of WordNet, a lexical database that groups words into synsets and is widely used for concept extraction. Dictionaries are rarely used in the domain of concept mining, but because the meaning (definition) texts of dictionary entries contain synonyms, hypernyms, hyponyms and other relationships, the success rate in determining concepts has been high. The concept extraction method is applied to documents collected from different corpora.
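
To make the general idea concrete, the following is a minimal sketch of dictionary-based concept extraction. It is not the authors' algorithm, which the abstract does not specify. It assumes that tokens are already reduced to dictionary headwords (for Turkish this would require morphological analysis, omitted here), that a concept can be approximated by the content word occurring most often in the meaning texts of a document's words, and that a small stop-word list is enough. The names `TOY_DICTIONARY`, `STOP_WORDS` and `extract_concepts`, and the three dictionary entries, are invented for illustration.

```python
from collections import Counter

# Hypothetical toy dictionary: headword -> meaning text.
# These entries are invented for illustration, not taken from a real dictionary.
TOY_DICTIONARY = {
    "elma": "kırmızı veya yeşil bir meyve",   # apple: "a red or green fruit"
    "armut": "tatlı ve sulu bir meyve",       # pear: "a sweet and juicy fruit"
    "kedi": "evcil bir hayvan",               # cat: "a domestic animal"
}

# Assumed stop-word list: function words that should never become concepts.
STOP_WORDS = {"bir", "ve", "veya"}


def extract_concepts(tokens, dictionary, top_n=1):
    """Return the top_n words that occur most often in the meaning texts
    of the given tokens. Tokens without a dictionary entry are skipped."""
    counts = Counter()
    for token in tokens:
        meaning = dictionary.get(token)
        if meaning is None:
            continue
        # Count each meaning-text word once per token occurrence.
        for word in set(meaning.split()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


# Assumed input: a document already tokenized and reduced to headwords.
document_tokens = ["elma", "armut", "elma", "kedi"]
concepts = extract_concepts(document_tokens, TOY_DICTIONARY, top_n=1)
print(concepts)
```

In this toy run, "meyve" (fruit) appears in the meaning texts of both "elma" and "armut", so it is the top candidate even though it never occurs in the document itself. This is the kind of hypernym relationship that meaning texts expose.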