IR · CL · Jan 6, 2015

Arabic Text Categorization Algorithm using Vector Evaluation Method

arXiv:1501.01318v1 · 22 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for better information retrieval in Arabic text, an under-researched field, but the approach appears incremental, since it applies vector evaluation specifically to Arabic documents.

The paper tackles the problem of improving classification accuracy in Arabic text categorization by proposing a new method using vector evaluation, which calculates word weights to determine document keywords and matches them with corpus categories to assign the best category.

Text categorization is the process of grouping documents into categories based on their contents. This process makes information retrieval easier, and it has become more important because of the huge amount of textual information available online. The main problem in text categorization is how to improve classification accuracy. Although Arabic text categorization is a new and promising field, little research has been done in it. This paper proposes a new method for Arabic text categorization using vector evaluation. The proposed method uses a corpus of categorized Arabic documents: the weights of the tested document's words are calculated to determine the document's keywords, which are then compared with the keywords of the corpus categories to determine the tested document's best category.
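
The pipeline described in the abstract (weight the words, keep the top-weighted ones as keywords, compare them with each category's keywords, pick the best match) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not give the weighting formula or the comparison rule, so the sketch assumes relative term frequency as the weight and a dot product over shared keywords as the score. The function names, the `TOY_CORPUS` data, and the parameter `k` are all hypothetical, and no Arabic-specific preprocessing (stemming, stop-word removal) is attempted.

```python
from collections import Counter

# Hypothetical toy corpus: category -> list of already-tokenized documents.
TOY_CORPUS = {
    "sports": [["كرة", "فريق", "مباراة", "كرة"], ["فريق", "مباراة", "هدف"]],
    "economy": [["سوق", "بنك", "سعر", "سوق"], ["بنك", "سعر", "اقتصاد"]],
}


def word_weights(tokens):
    """Relative frequency of each token (an assumed weighting scheme)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def top_keywords(tokens, k=3):
    """Return the k highest-weighted words as a {word: weight} dict."""
    weights = word_weights(tokens)
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:k])


def best_category(doc_tokens, corpus, k=3):
    """Score each category by the dot product of shared keyword weights."""
    doc_kw = top_keywords(doc_tokens, k)
    scores = {}
    for category, docs in corpus.items():
        # Pool all tokens of the category before extracting its keywords.
        cat_tokens = [tok for doc in docs for tok in doc]
        cat_kw = top_keywords(cat_tokens, k)
        scores[category] = sum(
            weight * cat_kw[word]
            for word, weight in doc_kw.items()
            if word in cat_kw
        )
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    tested_doc = ["فريق", "كرة", "كرة", "سوق"]
    label, scores = best_category(tested_doc, TOY_CORPUS)
    print(label, scores)
```

A real system would likely replace the whitespace-level tokens with normalized or stemmed forms and might use a different weight (for example TF-IDF); the sketch only shows the shape of the keyword-matching step.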

