ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction
This work addresses a specific NLP challenge in abstract meaning comprehension for the SemEval competition, presenting an incremental improvement using existing methods on new data.
The paper tackled the problem of predicting missing abstract words in statements for SemEval-2021 Task 4 by fine-tuning BERT and ALBERT models, using an ensemble for two subtasks and ALBERT alone for another, achieving best results with a masked language modeling approach.
This paper describes our system for Task 4 of SemEval-2021: Reading Comprehension of Abstract Meaning (ReCAM). We participated in all subtasks, where the main goal was to predict an abstract word missing from a statement. We fine-tuned two pre-trained masked language models, BERT and ALBERT, and submitted an ensemble of the two for Subtask 1 (ReCAM-Imperceptibility) and Subtask 2 (ReCAM-Nonspecificity). For Subtask 3 (ReCAM-Intersection), we submitted the ALBERT model alone, as it gave the best results. We tried multiple approaches and found that the Masked Language Modeling (MLM) based approach worked best.
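To make the ensemble idea concrete, the following is a minimal sketch of one common way to combine two models that each score the same candidate words for the missing slot. The abstract does not state how the BERT and ALBERT outputs are combined, so the probability averaging used here is an assumption, and the function names (`softmax`, `ensemble_predict`) and the example scores are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw candidate scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_scores: Sequence[Sequence[float]]) -> int:
    """Average per-model probabilities and return the best candidate index.

    Assumption: simple probability averaging; the source does not
    specify the combination rule used in the submitted system.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    n_candidates = len(per_model_scores[0])
    if any(len(s) != n_candidates for s in per_model_scores):
        raise ValueError("all models must score the same candidates")
    probs = [softmax(s) for s in per_model_scores]
    averaged = [
        sum(p[i] for p in probs) / len(probs) for i in range(n_candidates)
    ]
    return max(range(n_candidates), key=averaged.__getitem__)


# Hypothetical scores for the masked slot over four candidate words,
# e.g. the logit each fine-tuned model assigns to each candidate.
bert_scores = [2.0, 0.5, 1.0, -1.0]
albert_scores = [0.1, 0.2, 3.0, 0.0]
choice = ensemble_predict([bert_scores, albert_scores])
```

In this made-up example the first model alone would pick candidate 0, while the more confident second model shifts the averaged distribution toward candidate 2. Averaging probabilities rather than raw scores keeps a model with a larger logit scale from dominating purely because of that scale.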