CL IR LG ML Oct 9, 2020

Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models

Pierangelo Lombardo, Alessio Boiardi, Luca Colombo, Angelo Schiavone, Nicolò Tamagnone

arXiv:2010.04486v131.1995 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the evaluation of domain-specific semantic models for applications such as content-based recommenders. Its contribution is incremental: it builds on existing ranking correlation coefficients and pairwise comparison methods.

The paper tackles the need for domain-specific evaluation datasets for semantic models, with a particular focus on top-rank accuracy. It introduces a protocol for constructing relatedness-based datasets optimized for top-rank evaluation and defines metrics for assessing model performance against them.

The growth of domain-specific applications of semantic models, boosted by the recent achievements of unsupervised embedding learning algorithms, demands domain-specific evaluation datasets. In many cases, content-based recommenders being a prime example, these models are required to rank words or texts according to their semantic relatedness to a given concept, with particular focus on top ranks. In this work, we make a threefold contribution to address these requirements: (i) we define a protocol for the construction, based on adaptive pairwise comparisons, of a relatedness-based evaluation dataset tailored to the available resources and optimized to be particularly accurate in top-rank evaluation; (ii) we define appropriate metrics, extensions of well-known ranking correlation coefficients, to evaluate a semantic model via the aforementioned dataset by taking into account the greater significance of top ranks; and (iii) we define a stochastic transitivity model to simulate semantic-driven pairwise comparisons, which confirms the effectiveness of the proposed dataset construction protocol.
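To make the idea of a top-rank-aware ranking correlation concrete, the following is a minimal sketch of a weighted Kendall-style coefficient. It is not the metric defined in the paper: the function name `top_weighted_kendall_tau` and the additive hyperbolic pair weight `1/(r_a + 1) + 1/(r_b + 1)` are assumptions chosen for illustration, so that disagreements involving items near the top of the reference ranking cost more than disagreements near the bottom.

```python
from itertools import combinations


def top_weighted_kendall_tau(reference, predicted):
    """Weighted Kendall-style correlation between two score dictionaries.

    reference, predicted: dicts mapping item -> relatedness score
    (higher means more related). The pair weighting is an illustrative
    assumption, not the scheme used in the paper.
    """
    # Rank items by the reference scores; rank 0 is the top item.
    items = sorted(reference, key=reference.get, reverse=True)
    if len(items) < 2:
        raise ValueError("need at least two items to compare rankings")
    rank = {item: r for r, item in enumerate(items)}

    weighted_agreement = 0.0
    total_weight = 0.0
    for a, b in combinations(items, 2):
        # Pairs that involve top-ranked items receive a larger weight.
        weight = 1.0 / (rank[a] + 1) + 1.0 / (rank[b] + 1)
        ref_diff = reference[a] - reference[b]
        pred_diff = predicted[a] - predicted[b]
        ref_sign = (ref_diff > 0) - (ref_diff < 0)
        pred_sign = (pred_diff > 0) - (pred_diff < 0)
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
        weighted_agreement += weight * ref_sign * pred_sign
        total_weight += weight
    return weighted_agreement / total_weight


# Hypothetical relatedness scores for four candidate items.
reference = {"a": 4, "b": 3, "c": 2, "d": 1}
swap_top = {"a": 3, "b": 4, "c": 2, "d": 1}     # model confuses the top two
swap_bottom = {"a": 4, "b": 3, "c": 1, "d": 2}  # model confuses the bottom two

tau_top = top_weighted_kendall_tau(reference, swap_top)
tau_bottom = top_weighted_kendall_tau(reference, swap_bottom)
```

Under this weighting, `tau_top` is lower than `tau_bottom`: a single swap at the top of the ranking is penalized more than a single swap at the bottom, whereas the unweighted Kendall coefficient would score both cases identically.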

