Information retrieval system for the Silte language using BM25 weighting
This work addresses information retrieval for users of the Silte language, making an incremental contribution to language-specific information retrieval.
This paper develops a probabilistic information retrieval system for the Silte language, addressing the growing challenge of accessing relevant information from unstructured digital Silte text documents. The system includes indexing and searching modules, incorporating text operations like tokenization, stemming, stop word removal, and synonym handling.
The main aim of an information retrieval system is to extract relevant information from a very large collection of data according to users' needs. When a user submits a query, the system tries to return a list of related documents ranked by their degree of relevance. The volume of digital unstructured Silte text documents keeps growing, and this growth makes it difficult to find and use the right information. Developing an information retrieval system for the Silte language therefore allows users to search for and retrieve relevant documents that satisfy their information needs. In this research, we design a probabilistic information retrieval system for the Silte language. The system comprises an indexing module and a searching module. These modules include text operations such as tokenization, stemming, stop word removal, and synonym handling.
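The indexing and searching modules described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation. The stop word list, the suffix-stripping stemmer, and the synonym map are hypothetical English placeholders, since the Silte resources are not given here. The parameter values k1 = 1.2 and b = 0.75 and the non-negative idf variant are common defaults, assumed rather than taken from this work.

```python
import math
from collections import Counter

# Hypothetical placeholders: the real Silte stop word list, stemmer rules
# and synonym dictionary are not specified in the text.
STOP_WORDS = {"the", "of"}
SYNONYMS = {"car": "automobile"}


def stem(token):
    # Placeholder stemmer: strips one trailing "s"; NOT a Silte stemmer.
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def preprocess(text):
    """Tokenize, remove stop words, stem, then map synonyms to one term."""
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [stem(t) for t in tokens]
    return [SYNONYMS.get(t, t) for t in tokens]


def build_index(docs):
    """Indexing module: inverted index term -> {doc_id: term frequency}."""
    index, doc_len = {}, {}
    for doc_id, text in docs.items():
        terms = preprocess(text)
        doc_len[doc_id] = len(terms)
        for term, tf in Counter(terms).items():
            index.setdefault(term, {})[doc_id] = tf
    return index, doc_len


def bm25_search(query, index, doc_len, k1=1.2, b=0.75):
    """Searching module: rank documents by BM25 score, highest first."""
    if not doc_len:
        return []
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = Counter()
    for term in preprocess(query):
        postings = index.get(term, {})
        if not postings:
            continue
        df = len(postings)
        # Assumed idf variant that stays non-negative for frequent terms.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    # Sort by descending score; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    "d1": "the cars of the city",
    "d2": "city market news",
    "d3": "weather report",
}
index, doc_len = build_index(docs)
ranking = bm25_search("city", index, doc_len)
```

In this toy collection the query "city" matches d1 and d2, and d1 ranks first because BM25 length normalization favors the shorter document when term frequencies are equal. The query "car" also retrieves d1, because both "car" and the stemmed "cars" are mapped to the same synonym term.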