CL, Jun 8, 2022

1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task

Zhiyong Wang, Ge Zhang, Nineli Lashkarashvili

arXiv:2206.03702v1

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of reverse dictionary lookup for multilingual natural language processing applications, but it is incremental, building on existing methods with specific optimizations.

The paper tackles the SemEval-2022 Reverse Dictionary task by mapping multilingual glosses to word embeddings. An ELMo-based monolingual model achieves the best performance, while the multilingual and multitask variants also show competitive results.

This paper describes our system for the SemEval-2022 task of matching dictionary glosses to word embeddings. We focus on the Reverse Dictionary Track of the competition, which maps multilingual glosses to reconstructed vector representations. More specifically, models convert input sentences into three types of embeddings: SGNS, Char, and Electra. We propose several experiments for applying neural network cells, general multilingual and multitask structures, and language-agnostic tricks to the task. We also provide comparisons over different types of word embeddings and ablation studies to suggest helpful strategies. Our initial transformer-based model achieves relatively low performance. However, trials with different retokenization methodologies show improved performance. Our proposed ELMo-based monolingual model achieves the best results, and its multitask and multilingual variants show competitive results as well.
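To make the reverse dictionary setup concrete, here is a minimal sketch in plain Python of the input/output shape the abstract describes: a gloss goes in, a vector comes out, and the vector is compared with a target word embedding. It is not the authors' model. The averaging encoder, the toy token vectors, and the choice of mean squared error and cosine similarity as comparison scores are all assumptions made for illustration.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cosine(pred, target):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return dot / norm if norm else 0.0


def encode_gloss(gloss, token_vectors, dim):
    """Hypothetical stand-in for a trained encoder: average the vectors
    of the gloss tokens that appear in token_vectors."""
    vecs = [token_vectors[tok] for tok in gloss.lower().split() if tok in token_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(column) / len(vecs) for column in zip(*vecs)]


# Toy 2-dimensional token vectors (illustrative values, not real embeddings).
TOKEN_VECTORS = {
    "small": [1.0, 0.0],
    "domesticated": [0.0, 1.0],
    "feline": [1.0, 1.0],
}

# Hypothetical target embedding for the word being defined.
TARGET = [0.6, 0.7]

pred = encode_gloss("small domesticated feline", TOKEN_VECTORS, dim=2)
print("predicted vector:", pred)
print("MSE vs target:", mse(pred, TARGET))
print("cosine vs target:", cosine(pred, TARGET))
```

In a real system the averaging step would be replaced by a trained sequence encoder, and one such model (or output head) would be needed per target embedding type (SGNS, Char, Electra).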
