HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection
This work addresses idiom detection in natural language processing, which matters for improving language understanding in AI systems. The approach is incremental, however, because it applies an existing method to a specific task.
The paper tackles the detection of idiomatic usage of multi-word expressions in sentences, where the same expression can carry a literal or an idiomatic meaning, and reports results obtained with a pre-trained language model that supplies context-aware embeddings.
The same multi-word expression may have different meanings in different sentences. These meanings fall mainly into two categories: literal and idiomatic. Non-contextual methods perform poorly on this problem, and contextual embeddings are needed to understand the idiomatic meaning of multi-word expressions correctly. We use a pre-trained language model, which provides a context-aware sentence embedding, to detect whether a multi-word expression in a sentence is used idiomatically.
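The contrast between non-contextual and context-aware representations can be illustrated with a small toy sketch. It is not the system described here: the word vectors are hand-made and hypothetical, a plain average over the sentence stands in for the pre-trained language model's sentence embedding, and the names `STATIC_VECTORS`, `mean_vector`, and `is_idiomatic` are assumptions introduced only for illustration.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float]

# Hypothetical hand-made 2-d word vectors (not taken from any real model).
# Dimension 0 loosely marks "physical action" cues, dimension 1 "death" cues.
STATIC_VECTORS: Dict[str, Vector] = {
    "kicked": (0.5, 0.5),
    "bucket": (0.5, 0.5),
    "across": (0.5, 0.0),
    "yard": (1.0, 0.0),
    "illness": (0.0, 1.0),
}
ZERO: Vector = (0.0, 0.0)  # function words and unknown tokens carry no cue


def mean_vector(tokens: List[str]) -> Vector:
    """Average the static vectors of the given tokens."""
    vectors = [STATIC_VECTORS.get(token, ZERO) for token in tokens]
    count = len(vectors)
    return (
        sum(v[0] for v in vectors) / count,
        sum(v[1] for v in vectors) / count,
    )


def is_idiomatic(embedding: Vector) -> bool:
    """Toy decision rule: idiomatic if the 'death' cue outweighs the 'physical' cue."""
    return embedding[1] > embedding[0]


MWE = ["kicked", "the", "bucket"]
idiomatic_sentence = "he kicked the bucket after a long illness".split()
literal_sentence = "he kicked the bucket across the yard".split()

# Non-contextual view: only the expression is embedded, so the result is the
# same no matter which sentence the expression appears in.
static_embedding = mean_vector(MWE)

# Context-dependent stand-in: the surrounding words shift the representation.
idiomatic_embedding = mean_vector(idiomatic_sentence)
literal_embedding = mean_vector(literal_sentence)

print(is_idiomatic(static_embedding))
print(is_idiomatic(idiomatic_embedding))
print(is_idiomatic(literal_embedding))
```

In a realistic setup, `mean_vector` would be replaced by the sentence embedding of a pre-trained language model and `is_idiomatic` by a classifier trained on labeled examples. The sketch only shows that a representation built from the expression alone cannot separate the two usages, while one that includes the context can.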