IR AI DL | Dec 10, 2024

Benchmark for Evaluation and Analysis of Citation Recommendation Models

arXiv:2412.07713v1 | h-index: 1

Originality: Synthesis-oriented

AI Analysis

The proposed benchmark gives researchers and practitioners in citation recommendation a common platform for comparing models. The contribution is incremental: it standardizes existing practices rather than introducing new methods.

The paper tackles the challenge of inconsistent evaluation in citation recommendation systems by proposing a standardized benchmark with common datasets and metrics, aiming to enable consistent assessment and comparison of models.

Citation recommendation systems have attracted much academic interest, resulting in many studies and implementations. These systems help authors automatically generate proper citations by suggesting relevant references based on the text they have written. However, the methods used in citation recommendation differ across various studies and implementations. Some approaches focus on the overall content of papers, while others consider the context of the citation text. Additionally, the datasets used in these studies include different aspects of papers, such as metadata, citation context, or even the full text of the paper in various formats and structures. The diversity in models, datasets, and evaluation metrics makes it challenging to assess and compare citation recommendation methods effectively. To address this issue, a standardized dataset and evaluation metrics are needed to evaluate these models consistently. Therefore, we propose developing a benchmark specifically designed to analyze and compare citation recommendation models. This benchmark will evaluate the performance of models on different features of the citation context and provide a comprehensive evaluation of the models across all these tasks, presenting the results in a standardized way. By creating a benchmark with standardized evaluation metrics, researchers and practitioners in the field of citation recommendation will have a common platform to assess and compare different models. This will enable meaningful comparisons and help identify promising approaches for further research and development in the field.
