CL · Jan 17, 2023

The Recent Advances in Automatic Term Extraction: A survey

Hanh Thi Hong Tran, Matej Martinc, Jaya Caporusso, Antoine Doucet, Senja Pollak

arXiv:2301.06767v12.122 citations · h-index: 16

Originality: Synthesis-oriented

AI Analysis

The survey provides a comprehensive overview for NLP researchers and practitioners, but its contribution is incremental: it synthesizes existing work rather than introducing new methods.

This survey addresses the lack of systematic reviews on neural approaches to automatic term extraction (ATE), focusing on Transformer-based models and comparing them with traditional feature engineering and non-neural methods.

Automatic term extraction (ATE) is a Natural Language Processing (NLP) task that eases the effort of manually identifying terms from domain-specific corpora by providing a list of candidate terms. As units of knowledge in a specific field of expertise, extracted terms are not only beneficial for several terminographical tasks, but also support and improve several complex downstream tasks, e.g., information retrieval, machine translation, topic detection, and sentiment analysis. ATE systems, along with annotated datasets, have been studied and developed widely for decades, but recently we observed a surge in novel neural systems for the task at hand. Despite a large amount of new research on ATE, systematic survey studies covering novel neural approaches are lacking. We present a comprehensive survey of deep learning-based approaches to ATE, with a focus on Transformer-based neural models. The study also offers a comparison between these systems and previous ATE approaches, which were based on feature engineering and non-neural supervised learning algorithms.
