SIIR Apr 14, 2020

Author Name Disambiguation in Bibliographic Databases: A Survey

arXiv:2004.06391v1, 12 citations
AI Analysis

This is an incremental survey that organizes existing knowledge for researchers in information systems and bibliometrics.

The paper surveys the problem of Author Name Disambiguation in bibliographic databases, which involves clustering citations to identify authors, and provides a five-step framework and categorization of methods without presenting new experimental results.

Entity resolution has been a challenging and active research area in Information Systems over the last decade. Author Name Disambiguation (AND) in Bibliographic Databases (BD) such as DBLP, Citeseer, and Scopus is a specialized form of entity resolution. Given many citations by the underlying authors, the AND task is to determine which citations belong to the same author. In this survey, we start with three basic AND problems, followed by the need for a solution and the associated challenges. A generic five-step framework is provided for handling AND issues. The steps are: (1) preparation of the dataset, (2) selection of publication attributes, (3) selection of similarity metrics, (4) selection of models, and (5) evaluation of clustering performance. A categorization and elaboration of similarity metrics and methods is also provided. Finally, future directions and recommendations are given for this dynamic area of research.
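The five steps can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the survey: the toy records, the choice of coauthors and venue as attributes, the weighted Jaccard similarity, the 0.4 threshold, single-link clustering, and pairwise F1 as the evaluation measure.

```python
from itertools import combinations

# (1) Dataset preparation: toy citations that all carry one ambiguous name.
# "author_id" is a hypothetical ground-truth label, used only for evaluation.
records = [
    {"coauthors": {"a. kumar", "l. chen"}, "venue": "sigir", "author_id": "A"},
    {"coauthors": {"a. kumar"}, "venue": "sigir", "author_id": "A"},
    {"coauthors": {"l. chen", "a. kumar"}, "venue": "cikm", "author_id": "A"},
    {"coauthors": {"m. rossi"}, "venue": "icra", "author_id": "B"},
    {"coauthors": {"m. rossi", "p. novak"}, "venue": "icra", "author_id": "B"},
]


# (2) Attribute selection: coauthor sets and venue (an illustrative choice).
# (3) Similarity metric: weighted mix of coauthor Jaccard and venue match.
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(r1, r2):
    venue_match = 1.0 if r1["venue"] == r2["venue"] else 0.0
    return 0.7 * jaccard(r1["coauthors"], r2["coauthors"]) + 0.3 * venue_match


# (4) Model: single-link clustering, linking pairs above a threshold.
def cluster(recs, threshold=0.4):
    parent = list(range(len(recs)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(recs)), 2):
        if similarity(recs[i], recs[j]) >= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(recs))]


# (5) Evaluation: pairwise precision, recall, and F1 against the true labels.
def pairwise_f1(labels, truth):
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels)), 2):
        same_pred = labels[i] == labels[j]
        same_true = truth[i] == truth[j]
        if same_pred and same_true:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


labels = cluster(records)
truth = [r["author_id"] for r in records]
print(pairwise_f1(labels, truth))
```

Single-link clustering is used here only because it is short to write; it merges records transitively, so one spurious high-similarity pair can join two authors. Any other clustering or classification model could take its place in step (4).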

