CL, Jan 12, 2014

Dictionary-Based Concept Mining: An Application for Turkish

Cem Rıfkı Aydın, Ali Erkan, Tunga Güngör, Hidayet Takçı

arXiv:1401.2663v1, 1 citation

Originality: Synthesis-oriented

AI Analysis

This work addresses concept extraction for Turkish. The contribution is incremental: it adapts existing methods to a less-studied language.

The study tackles concept mining for Turkish, an agglutinative language, with a dictionary-based method instead of WordNet, and reports a high success rate in extracting concepts from documents collected from various corpora.

In this study, a dictionary-based method is used to extract expressive concepts from documents. Many studies have addressed concept mining in English, but the area is still immature for Turkish, an agglutinative language. We used a dictionary instead of WordNet, a lexical database that groups words into synsets and is widely used for concept extraction. Dictionaries are rarely used in concept mining, but because dictionary entries contain synonyms, hypernyms, hyponyms and other relationships in their meaning texts, the success rate in determining concepts has been high. The method is applied to documents collected from different corpora.
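The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that content words in a dictionary entry's meaning text can serve as candidate concepts, and that a concept shared by many document words is a good label for the document. The toy English dictionary, the stopword list, and the function names are all hypothetical; a real Turkish pipeline would also need morphological analysis to reduce inflected forms to dictionary headwords, which is omitted here.

```python
import re
from collections import Counter

# Hypothetical toy dictionary (headword -> meaning text), for illustration only.
TOY_DICTIONARY = {
    "cat": "a small domesticated feline animal",
    "dog": "a domesticated canine animal",
    "sparrow": "a small brown bird",
    "eagle": "a large predatory bird",
}

# Hypothetical stopword list: words in meaning texts that carry no concept.
STOPWORDS = {"a", "an", "the", "of", "and", "small", "large"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_concepts(document, dictionary, top_n=2):
    """Score candidate concepts by how many document tokens point to them.

    Assumption: every non-stopword in a headword's meaning text is a
    candidate concept (it may be a synonym, hypernym, or related word).
    """
    scores = Counter()
    for token in tokenize(document):
        meaning = dictionary.get(token)
        if meaning is None:
            continue  # token is not a headword in the dictionary
        # dict.fromkeys removes duplicates while keeping a stable order
        for word in dict.fromkeys(tokenize(meaning)):
            if word not in STOPWORDS:
                scores[word] += 1
    return [concept for concept, _ in scores.most_common(top_n)]


concepts = extract_concepts(
    "The cat chased the dog while a sparrow watched.", TOY_DICTIONARY
)
```

In this toy run, "cat" and "dog" both point to "domesticated" and "animal" through their meaning texts, so those two words outrank concepts supported by a single document word. Restricting candidates to a particular position in the meaning text, or weighting them by relationship type, are possible refinements that the abstract neither confirms nor rules out.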
