CL, AI, LG | Jun 2, 2023

Unsupervised Paraphrasing of Multiword Expressions

arXiv:2306.01443v1223 citations | h-index: 69
Originality: Incremental advance
AI Analysis

This work addresses the challenge of handling idiomatic language in NLP applications, offering a resource-light solution that is incremental but competitive with supervised methods.

The paper tackles the problem of paraphrasing multiword expressions in context using an unsupervised approach that relies solely on monolingual corpus data and pre-trained language models without fine-tuning or external resources. It demonstrates that this method outperforms all unsupervised systems and rivals supervised ones on the SemEval 2022 idiomatic semantic text similarity task.

We propose an unsupervised approach to paraphrasing multiword expressions (MWEs) in context. Our model employs only monolingual corpus data and pre-trained language models (without fine-tuning), and does not make use of any external resources such as dictionaries. We evaluate our method on the SemEval 2022 idiomatic semantic text similarity task, and show that it outperforms all unsupervised systems and rivals supervised systems.

Code Implementations: 1 repo