IR CL LG NI | Oct 15, 2024

Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications

Thaina Saraiva, Marco Sousa, Pedro Vieira, António Rodrigues

arXiv:2410.19790v15.55 citations | h-index: 8 | WCNC

Originality Synthesis-oriented

AI Analysis

This work addresses the challenge of efficient information retrieval for telecom engineers and researchers. It is incremental, however, in that it applies existing retrieval and QA methods to a new dataset.

This paper tackles the problem of retrieving technical information from 3GPP telecom documents by proposing a QA system and introducing a hybrid dataset, Telco-DPR, with synthetic question/answer pairs. The results show that the Dense Hierarchical Retrieval (DHR) model outperforms traditional methods, achieving a Top-10 accuracy of 86.2%, and the QA system using RAG and GPT-4 improves answer accuracy by 14% over a previous benchmark.

This paper proposes a Question-Answering (QA) system for the telecom domain built on 3rd Generation Partnership Project (3GPP) technical documents. It also presents Telco-DPR, a hybrid dataset consisting of a curated 3GPP corpus that combines text and tables, together with a set of synthetic question/answer pairs designed to evaluate the retrieval performance of QA systems on this type of data. Retrieval models, including the sparse model Best Matching 25 (BM25) and the dense models Dense Passage Retriever (DPR) and Dense Hierarchical Retrieval (DHR), are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR). The results show that DHR, a retriever that performs hierarchical passage selection through fine-tuning at both the document and passage levels, outperforms traditional methods in retrieving relevant technical information, achieving a Top-10 accuracy of 86.2%. The Retrieval-Augmented Generation (RAG) technique used in the proposed QA system is also evaluated to demonstrate the benefits of using the hybrid dataset and DHR. The proposed QA system, using the developed RAG model and Generative Pre-trained Transformer 4 (GPT-4), achieves a 14% improvement in answer accuracy compared to a previous benchmark on the same dataset.
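
The two retrieval metrics named in the abstract, top-K accuracy and MRR, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes each question has exactly one gold passage, and the function names and passage IDs (`p1`, `p2`, ...) are hypothetical toy values chosen for illustration.

```python
from typing import Sequence


def top_k_accuracy(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of questions whose gold passage appears among the first k results."""
    hits = sum(1 for ranked, g in zip(rankings, gold) if g in ranked[:k])
    return hits / len(gold)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1/rank of the gold passage; a question contributes 0 if it is not retrieved."""
    total = 0.0
    for ranked, g in zip(rankings, gold):
        if g in ranked:
            total += 1.0 / (list(ranked).index(g) + 1)  # ranks are 1-based
    return total / len(gold)


# Toy data (assumed): one ranked list of passage IDs per question.
rankings = [
    ["p3", "p1", "p7"],  # gold "p1" is at rank 2
    ["p2", "p9", "p4"],  # gold "p2" is at rank 1
    ["p5", "p6", "p8"],  # gold "p0" is not retrieved
]
gold = ["p1", "p2", "p0"]

top1 = top_k_accuracy(rankings, gold, k=1)
top2 = top_k_accuracy(rankings, gold, k=2)
mrr = mean_reciprocal_rank(rankings, gold)
print(f"Top-1: {top1:.3f}  Top-2: {top2:.3f}  MRR: {mrr:.3f}")
```

In this toy run, one of three questions is answered at rank 1 and two of three within rank 2, while MRR averages the reciprocal ranks 1/2, 1, and 0. A reported Top-10 accuracy such as 86.2% corresponds to the same computation with `k=10` over the full question set.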
