CL · Feb 6, 2025

Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes

arXiv:2502.04173v1 · h-index: 10 · ICAART
Originality: Incremental advance
AI Analysis

This work aims to make word substitutions more contextually relevant, a problem of interest to NLP researchers and practitioners. The advance is incremental, since it builds on existing masked token prediction techniques.

The paper tackles the lexical substitution task by proposing ConCat, a method that gives a pre-trained language model more contextual information so that it generates more contextually relevant word substitutes. The authors support the method with quantitative evaluations and a human preference analysis.

Lexical Substitution is the task of replacing a single word in a sentence with a similar one. Ideally, the replacement is not merely synonymous but also fits well into the context surrounding the target word, while preserving the sentence's grammatical structure. Recent advances in Lexical Substitution have leveraged the masked token prediction task of Pre-trained Language Models to generate replacements for a given word in a sentence. Building on this technique, we introduce ConCat, a simple augmented approach which utilizes the original sentence to bolster the contextual information sent to the model. Compared to existing approaches, it proves to be very effective in guiding the model to make contextually relevant predictions for the target word. Our study includes a quantitative evaluation, measured via sentence similarity and task performance. In addition, we conduct a qualitative human analysis to validate that users prefer the substitutions proposed by our method over those of previous methods. Finally, we test our approach on the prevailing benchmark for Lexical Substitution, CoInCo, revealing potential pitfalls of the benchmark. These insights serve as the foundation for a critical discussion of the way in which Lexical Substitution is evaluated.
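
The abstract does not spell out how the original sentence is supplied to the model. The sketch below illustrates one plausible reading, suggested only by the name ConCat: the unmasked sentence is concatenated with a copy in which the target word is replaced by a mask token. The function name `build_concat_input`, the `[SEP]` separator, the `[MASK]` token, and the ordering of the two sentences are all assumptions made for illustration, not details taken from the paper.

```python
import re


def build_concat_input(sentence: str, target: str,
                       mask_token: str = "[MASK]",
                       sep_token: str = "[SEP]") -> str:
    """Pair the original sentence with a copy whose target word is masked.

    Hypothetical helper: the layout (original, separator, masked copy)
    is an assumed reading of "ConCat", not the authors' confirmed format.
    """
    # Match the target as a whole word and mask only its first occurrence.
    pattern = r"\b" + re.escape(target) + r"\b"
    masked, n_replaced = re.subn(pattern, lambda m: mask_token,
                                 sentence, count=1)
    if n_replaced == 0:
        raise ValueError(f"target {target!r} not found in sentence")
    # The unmasked sentence comes first so the model can still see the target.
    return f"{sentence} {sep_token} {masked}"


example = build_concat_input("The bright student solved it.", "bright")
print(example)
# The bright student solved it. [SEP] The [MASK] student solved it.
```

The resulting string could then be passed to a masked language model, whose predictions for the mask position would serve as substitute candidates. Because the model sees the target word in the first sentence, it will often predict that word itself, so a practical pipeline would likely filter it out of the candidate list.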

