MULTISEM at SemEval-2020 Task 3: Fine-tuning BERT for Lexical Meaning
This work addresses lexical meaning disambiguation for natural language processing applications. Its contribution is incremental, since it builds on existing BERT fine-tuning approaches.
The paper tackles graded word similarity in context by injecting semantic knowledge into BERT models through fine-tuning. Its best English models rank third and fourth in the two subtasks, while the Finnish models are only mid-ranked, a gap the authors attribute to limited data for fine-tuning.
We present the MULTISEM systems submitted to SemEval-2020 Task 3: Graded Word Similarity in Context (GWSC). We experiment with injecting semantic knowledge into pre-trained BERT models through fine-tuning on lexical semantic tasks related to GWSC. We use existing semantically annotated datasets and propose to approximate similarity through automatically generated lexical substitutes in context. We participate in both GWSC subtasks and address two languages, English and Finnish. Our best English models occupy the third and fourth positions in the ranking for the two subtasks. Performance is lower for the Finnish models, which are mid-ranked in the respective subtasks, highlighting the important role of data availability for fine-tuning.
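The idea of approximating similarity through lexical substitutes can be illustrated with a minimal sketch. It is not the MULTISEM implementation: it assumes that the similarity of two words in a context can be proxied by the Jaccard overlap of their substitute sets, and that the change between two contexts is the difference of those overlaps. The function names and the substitute lists are hypothetical and chosen only for illustration.

```python
def substitute_overlap(subs_a, subs_b):
    """Jaccard overlap between two collections of substitutes.

    Assumption: overlap of in-context substitutes stands in for the
    similarity of the two target words in that context.
    """
    set_a = {s.lower() for s in subs_a}
    set_b = {s.lower() for s in subs_b}
    if not set_a and not set_b:
        return 0.0  # no substitutes at all: treat as no evidence of similarity
    return len(set_a & set_b) / len(set_a | set_b)


def similarity_change(context1, context2):
    """Difference in substitute overlap between two contexts.

    Each argument is a pair (substitutes_for_word1, substitutes_for_word2).
    A negative value means the words are less similar in the second context.
    """
    return substitute_overlap(*context2) - substitute_overlap(*context1)


# Hypothetical substitutes for the pair ("room", "cell") in two contexts.
# Context 1: "cell" is used in its prison sense.
ctx1 = (["chamber", "space", "area", "hall"],
        ["chamber", "space", "prison", "cage"])
# Context 2: "cell" is used in its biological or battery sense.
ctx2 = (["chamber", "space", "area", "hall"],
        ["battery", "unit", "module", "organism"])

score_ctx1 = substitute_overlap(*ctx1)     # 2 shared out of 6 distinct substitutes
score_ctx2 = substitute_overlap(*ctx2)     # no shared substitutes
change = similarity_change(ctx1, ctx2)     # negative: similarity drops in context 2
```

In practice the substitute lists would come from an automatic substitution model run on each context, and a graded overlap that weights substitutes by their scores may be preferable to plain set overlap.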