CL · May 27, 2022

HiJoNLP at SemEval-2022 Task 2: Detecting Idiomaticity of Multiword Expressions using Multilingual Pretrained Language Models

arXiv:2205.13708
Originality: Synthesis-oriented
AI Analysis

This work addresses idiomaticity detection, a natural language processing task that matters for machine understanding of non-literal language. The contribution is incremental: it builds on existing multilingual models and does not introduce a new paradigm.

The paper tackles idiomaticity detection for multiword expressions (MWEs) using multilingual pretrained language models. It finds that larger models generally perform better, that higher layers do not consistently improve results, and that rich-resource languages outperform the others in multilingual scenarios.

This paper describes an approach to detecting idiomaticity using only the contextualized representation of an MWE from multilingual pretrained language models. Our experiments find that larger models are usually more effective at idiomaticity detection. However, using a higher layer of the model does not guarantee better performance. In multilingual scenarios, convergence is not consistent across languages, and rich-resource languages have a large advantage over the others.
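To make "the contextualized representation of an MWE" at a chosen layer concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state how the MWE tokens are pooled or which classifier is used, so mean pooling and a logistic head are assumptions made for illustration. The hidden states are toy values standing in for the per-layer outputs a multilingual pretrained model would produce, and the names `mwe_representation` and `idiomatic_probability` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]
LayerStates = List[Vector]      # one vector per token
AllLayers = List[LayerStates]   # one entry per model layer


def mwe_representation(hidden_states: AllLayers, layer: int, span: Sequence[int]) -> Vector:
    """Mean-pool the vectors of the MWE tokens at the chosen layer (assumed pooling)."""
    if not span:
        raise ValueError("span must contain at least one token index")
    tokens = [hidden_states[layer][i] for i in span]
    dim = len(tokens[0])
    return [sum(tok[d] for tok in tokens) / len(tokens) for d in range(dim)]


def idiomatic_probability(rep: Vector, weights: Vector, bias: float) -> float:
    """Logistic classifier over the pooled MWE vector (hypothetical head)."""
    score = sum(w * x for w, x in zip(weights, rep)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Toy stand-in for model output: 2 layers, 4 tokens, 3 dimensions.
hidden_states = [
    [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0, 0.0]],
    [[0.5, 0.5, 0.5], [2.0, 0.0, -2.0], [0.0, 2.0, 2.0], [0.5, 0.5, 0.5]],
]
span = [1, 2]  # token positions of the MWE in the sentence

# The same span yields a different vector at each layer, which is why
# the choice of layer is a hyperparameter worth comparing.
rep_layer0 = mwe_representation(hidden_states, layer=0, span=span)
rep_layer1 = mwe_representation(hidden_states, layer=1, span=span)

# Untrained head (all-zero weights), used only to show the call shape.
prob = idiomatic_probability(rep_layer1, weights=[0.0, 0.0, 0.0], bias=0.0)
```

In a real setup the toy lists would be replaced by the hidden states returned by the model for each layer, and the head would be trained on labeled examples. Comparing layers then amounts to repeating the same pipeline with a different `layer` value.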
