New Datasets and a Benchmark of Document Network Embedding Methods for Scientific Expert Finding
This work addresses expert finding for researchers and professionals in scientific domains. Its contribution is incremental, as it focuses on benchmarking existing methods on new datasets.
The paper tackles the challenge of finding scientific experts by proposing a benchmark built on data from a scientific citation network and three scientific Q&A websites. It compares several algorithms on these data sources and studies how well embedding methods apply to the task.
The scientific literature is growing faster than ever. Finding an expert in a particular scientific domain has never been harder, owing to the increasing number of publications and the ever-growing diversity of fields of expertise. To tackle this challenge, automatic expert finding algorithms rely on the vast heterogeneous scientific network to match textual queries with potential expert candidates. In this direction, document network embedding methods seem to be an ideal choice for building representations of the scientific literature, since citation and authorship links carry important information that complements the textual content of the publications. In this paper, we propose a benchmark for expert finding in document networks by leveraging data extracted from a scientific citation network and three scientific question & answer websites. We compare the performance of several algorithms on these different sources of data and further study the applicability of embedding methods to the expert finding task.