IR, CL | Oct 28, 2024

Semantic Search Evaluation

Chujie Zheng, Jeffrey Wang, Shuqian Albee Zhang, Anand Kishore, Siddharth Singh

arXiv:2410.21549v12.22 citations | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the need for better evaluation metrics in semantic search systems and is aimed at developers and researchers working to improve relevance. It appears incremental because it builds on existing AI tools such as GPT 3.5.

The authors address the evaluation of content search systems by proposing a method that measures the semantic match between queries and results. They introduce an 'on-topic rate' metric to quantify relevance and compute it with a pipeline that uses GPT 3.5 for automated assessment.

We propose a novel method for evaluating the performance of a content search system that measures the semantic match between a query and the results returned by the search system. We introduce a metric called "on-topic rate" to measure the percentage of results that are relevant to the query. To achieve this, we design a pipeline that defines a golden query set, retrieves the top K results for each query, and sends calls to GPT 3.5 with formulated prompts. Our semantic evaluation pipeline helps identify common failure patterns and set goals against the metric for relevance improvements.
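The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_prompt`, `on_topic_rate`, `toy_search`, `toy_judge`), the prompt wording, and the choice to pool all judged results into one overall rate are hypothetical. The judge here is a keyword stub standing in for the GPT 3.5 call so that the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces; the source does not specify them.
SearchFn = Callable[[str, int], List[str]]  # (query, k) -> top-k result texts
JudgeFn = Callable[[str, str], bool]        # (query, result) -> on-topic?


def build_prompt(query: str, result: str) -> str:
    """Illustrative prompt only; the actual prompt wording is not given in the source."""
    return (
        f"Query: {query}\n"
        f"Result: {result}\n"
        "Is the result on-topic for the query? Answer yes or no."
    )


def on_topic_rate(
    golden_queries: List[str], search: SearchFn, judge: JudgeFn, k: int = 10
) -> Tuple[float, Dict[str, float]]:
    """Return (overall rate, per-query rates) over the top-k results of each query."""
    per_query: Dict[str, float] = {}
    total_on_topic = 0
    total_judged = 0
    for query in golden_queries:
        results = search(query, k)[:k]
        verdicts = [judge(query, r) for r in results]
        if verdicts:  # skip queries that returned nothing
            per_query[query] = sum(verdicts) / len(verdicts)
        total_on_topic += sum(verdicts)
        total_judged += len(verdicts)
    # Assumption: the overall rate pools all judged results across queries.
    overall = total_on_topic / total_judged if total_judged else 0.0
    return overall, per_query


def toy_search(query: str, k: int) -> List[str]:
    """Stand-in for the search system under evaluation."""
    corpus = {
        "python jobs": ["Senior Python developer", "Python data engineer", "Barista wanted"],
        "nurse": ["Registered nurse", "Nurse practitioner"],
    }
    return corpus.get(query, [])[:k]


def toy_judge(query: str, result: str) -> bool:
    """Keyword stub; a real judge would send build_prompt(query, result) to an LLM."""
    return any(word in result.lower() for word in query.lower().split())


overall, per_query = on_topic_rate(["python jobs", "nurse"], toy_search, toy_judge, k=3)
print(f"overall on-topic rate: {overall:.2f}")
for q, rate in per_query.items():
    print(f"{q!r}: {rate:.2f}")
```

Keeping per-query rates alongside the overall figure is one way to surface the failure patterns the abstract mentions, since queries with low rates can be inspected individually.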
