Mapping Researcher Activity based on Publication Data by means of Transformers
This work addresses the need for automated analysis of publication data to identify research trends and author collaborations, but it is incremental as it applies existing NLP methods to a new dataset.
The paper tackles the problem of mapping researcher activity by using BERT to encode and cluster research papers from a local database. The result is a landscape view of scientific topics and a similarity metric between authors based on the similarity of their papers.
The Transformer-based pre-trained language model BERT has improved performance on several natural language processing (NLP) tasks. We apply this model to investigate a local publication database. Research papers are encoded and clustered to form a landscape view of the scientific topics in which research is active. Authors working on similar topics can be identified by calculating the similarity between their papers. Based on this, we define a similarity metric between authors. Additionally, we introduce the concept of self-similarity to indicate the topical variety of authors.
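
The abstract does not give formulas for the author similarity metric or for self-similarity, so the following is only a minimal sketch under stated assumptions. It assumes each paper has already been encoded as a fixed-length vector (for example by a BERT-style encoder), that paper similarity is cosine similarity, that author similarity is the mean pairwise similarity between two authors' papers, and that self-similarity is the mean pairwise similarity among one author's own papers. The function names and the toy vectors are hypothetical and chosen for illustration; the paper's actual definitions may differ.

```python
import numpy as np


def cosine_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def author_similarity(papers_a: np.ndarray, papers_b: np.ndarray) -> float:
    """Assumed metric: mean cosine similarity over all cross-author paper pairs."""
    return float(cosine_matrix(papers_a, papers_b).mean())


def self_similarity(papers: np.ndarray) -> float:
    """Assumed metric: mean cosine similarity over distinct pairs of one author's papers.

    A high value suggests a narrow topical focus, a low value a broad one.
    """
    n = len(papers)
    if n < 2:
        raise ValueError("self-similarity needs at least two papers")
    sims = cosine_matrix(papers, papers)
    off_diagonal = sims[~np.eye(n, dtype=bool)]  # drop each paper's match with itself
    return float(off_diagonal.mean())


# Toy 2-D "embeddings" standing in for real paper encodings (illustrative only).
alice = np.array([[1.0, 0.0], [1.0, 0.0]])  # two papers on the same topic
bob = np.array([[0.0, 1.0], [0.0, 1.0]])    # two papers on a different topic
carol = np.array([[1.0, 0.0], [0.0, 1.0]])  # one paper on each topic

print(author_similarity(alice, bob))
print(author_similarity(alice, carol))
print(self_similarity(alice))
print(self_similarity(carol))
```

In this toy setup, an author whose papers all point in the same direction has the highest self-similarity, while an author spread across unrelated topics scores lower. Averaging over pairs is one of several plausible aggregation choices; a maximum or a centroid-based comparison would be equally reasonable alternatives.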