CL, Mar 27, 2024

Evaluation of Semantic Search and its Role in Retrieved-Augmented-Generation (RAG) for Arabic Language

Ali Mahboub, Muhy Eddin Za'ter, Bashar Al-Rfooh, Yazan Estaitia, Adnan Jaljuli, Asma Hakouz

arXiv:2403.18350v24.28 citations, h-index: 6

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of evaluating semantic search for the Arabic language. The contribution is incremental: it builds on existing RAG methods but applies them to a specific language.

The paper tackles the lack of standard benchmarks for semantic search in Arabic by establishing a straightforward yet potent benchmark, and evaluates its effectiveness within the framework of retrieval-augmented generation (RAG).

The latest advancements in machine learning and deep learning have brought forth the concept of semantic similarity, which has proven immensely beneficial in multiple applications and has largely replaced keyword search. However, evaluating semantic similarity and searching for a specific query across various documents remain complicated tasks. This complexity stems from the multifaceted nature of the task and the lack of standard benchmarks, and both challenges are further amplified for the Arabic language. This paper endeavors to establish a straightforward yet potent benchmark for semantic search in Arabic. Moreover, to precisely evaluate the effectiveness of these metrics and the dataset, we conduct our assessment of semantic search within the framework of retrieval-augmented generation (RAG).

