IR, May 8, 2012

Indexing of Arabic documents automatically based on lexical analysis

Abdulrahman Al Molijy, Ismail Hmeidi, Izzat Alsmadi

arXiv:1205.1602v1, 13 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient information processing in Arabic document retrieval, though it appears incremental as it applies existing text analysis methods to a specific language domain.

The authors tackle the problem of automatically indexing Arabic books by using text summarization and abstraction to extract main topics, and they report that the process can effectively replace manual indexing.

The continuous information explosion through the Internet and all other information sources makes it necessary to perform information processing activities automatically, quickly, and reliably. In this paper, we propose and implement a method to automatically create an index for books written in the Arabic language. The process depends largely on text summarization and abstraction to collect the main topics and statements in the book. The process is developed with attention to accuracy and performance, and the results show that it can effectively replace the effort of manually indexing books and documents, which can be very useful in information processing and retrieval applications.
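
The abstract does not spell out the authors' algorithm, so the following is only a minimal sketch of the general idea of building a book index from extracted topics. It swaps in a plain term-frequency heuristic in place of the summarization and abstraction steps described above. The stop-word list, the `tokenize` and `build_index` names, and the `top_k` parameter are all hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"في", "من", "على", "إلى", "عن", "و", "أن", "هذا", "هذه"}

# Matches runs of characters in the basic Arabic letter range.
TOKEN_RE = re.compile(r"[\u0621-\u064A]+")


def tokenize(text):
    """Split text into Arabic word tokens, dropping stop words."""
    return [t for t in TOKEN_RE.findall(text) if t not in STOP_WORDS]


def build_index(pages, top_k=5):
    """Map the top_k most frequent terms to the pages they appear on.

    pages: list of page texts; page numbers start at 1.
    This is a frequency heuristic, not the paper's method.
    """
    counts = Counter()
    locations = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for token in tokenize(text):
            counts[token] += 1
            locations[token].add(page_number)
    top_terms = [term for term, _ in counts.most_common(top_k)]
    return {term: sorted(locations[term]) for term in top_terms}
```

A real system for Arabic would likely also need stemming or root extraction, since prefixes and suffixes attach to words and produce distinct surface forms of the same term; this sketch treats each surface form separately.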
