DL | CL | Dec 23, 2024

Recent Developments in Deep Learning-based Author Name Disambiguation

Francesca Cappelli, Giovanni Colavizza, Silvio Peroni

arXiv:2503.13448v1 | 7 citations | h-index: 2 | IRCDL

Originality: Synthesis-oriented

AI Analysis

This is an incremental review that addresses the lack of recent surveys on deep learning techniques for author name disambiguation, primarily benefiting researchers and digital library developers.

The paper tackles the problem of author name disambiguation in digital libraries by providing a systematic review of deep learning-based methods published from 2016 to 2024. It finds that these methods enable the integration of structured and unstructured data, and that hybrid approaches balance supervised and unsupervised learning.

Author Name Disambiguation (AND) is a critical task for digital libraries aiming to link existing authors with their respective publications. Due to the lack of persistent identifiers used by researchers and the presence of intrinsic linguistic challenges, such as homonymy, the development of Deep Learning (DL) algorithms to address this issue has become widespread. Many DL-based AND methods have been developed, and surveys exist comparing the approaches in terms of techniques, complexity, and performance. However, none explicitly addresses AND methods in the context of deep learning in recent years (i.e., 2016-2024). In this paper, we provide a systematic review of state-of-the-art AND techniques based on deep learning, highlighting recent improvements, challenges, and open issues in the field. We find that DL methods have significantly impacted AND by enabling the integration of structured and unstructured data, and that hybrid approaches effectively balance supervised and unsupervised learning.
