CL · May 29, 2020

A Comparative Study of Lexical Substitution Approaches based on Neural Language Models

Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

arXiv:2006.00031v11.37 citations

Originality: Incremental advance

AI Analysis

This incremental study addresses lexical substitution for NLP applications like data augmentation.

The paper compares neural language models for lexical substitution and shows that injecting target-word information improves results, with BERT achieving the best performance at an F1 score of 0.62.

Lexical substitution in context is an extremely powerful technology that can be used as a backbone of various NLP applications, such as word sense induction, lexical relation extraction, data augmentation, etc. In this paper, we present a large-scale comparative study of popular neural language and masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, XLNet, applied to the task of lexical substitution. We show that already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly, and compare several target injection methods. In addition, we provide analysis of the types of semantic relations between the target and substitutes generated by different models providing insights into what kind of words are really generated or given by annotators as substitutes.
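
The abstract does not spell out how target-word information is injected, so the following is only a minimal sketch of one plausible scheme: multiply the substitute probabilities a language model assigns from context alone by a softmax over each candidate's embedding similarity to the target word. The function names, the `temperature` parameter, the toy probabilities, and the two-dimensional embeddings are all assumptions made for illustration; none of them come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over a dict of word -> score."""
    top = max(scores.values())
    exps = {w: math.exp((s - top) / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def rank_substitutes(context_probs, embeddings, target, temperature=0.1):
    """Rank candidates by P(word | context) * proximity(word, target).

    context_probs: word -> probability from a (hypothetical) LM that sees
                   only the context, with the target position masked.
    embeddings:    word -> vector, used to measure closeness to the target.
    The target itself is excluded, since it is not a useful substitute.
    """
    sims = {
        w: cosine(embeddings[w], embeddings[target])
        for w in context_probs
        if w != target
    }
    proximity = softmax(sims, temperature)
    combined = {w: context_probs[w] * proximity[w] for w in sims}
    total = sum(combined.values())
    ranked = [(w, p / total) for w, p in combined.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy inputs (assumed values) for "She sat on the [bank] of the river".
TOY_CONTEXT_PROBS = {"floor": 0.35, "edge": 0.30, "shore": 0.25, "bank": 0.10}
TOY_EMBEDDINGS = {
    "bank": [1.0, 0.2],
    "shore": [0.9, 0.3],
    "edge": [0.6, 0.6],
    "floor": [0.1, 1.0],
}

if __name__ == "__main__":
    for word, score in rank_substitutes(TOY_CONTEXT_PROBS, TOY_EMBEDDINGS, "bank"):
        print(f"{word}\t{score:.3f}")
```

In this toy setup the context-only distribution prefers "floor", which fits the sentence but is unrelated to the target; once target proximity is factored in, "shore" moves to the top. A lower `temperature` makes the target similarity dominate, while a higher one leaves the ranking closer to the context-only probabilities.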
