CL · Sep 27, 2023

Question answering using deep learning in low resource Indian language Marathi

Dhiraj Amin, Sharvari Govilkar, Sagar Kulkarni

arXiv:2309.15779v1 · 7 citations · h-index: 11

Originality: Synthesis-oriented

AI Analysis

This work addresses the limited NLP resources available for Marathi, but it is incremental: it applies existing transformer methods to a new dataset.

The paper tackles question answering for Marathi, a low-resource language, by fine-tuning transformer models on a reading comprehension dataset. The MuRIL model achieves the best results, with an EM score of 0.64 and an F1 score of 0.74.

A question answering system extracts precise answers from a text for a given input question. Recent studies have built Marathi question answering systems using ontology-based, rule-based, and machine learning-based approaches. More recently, transformer models and transfer learning have been applied to question answering. In this paper we investigate different transformer models for building a reading comprehension-based Marathi question answering system. We experimented with pretrained multilingual and monolingual models that support Marathi, namely Multilingual Representations for Indian Languages (MuRIL), MahaBERT, and Indic Bidirectional Encoder Representations from Transformers (IndicBERT), and fine-tuned them on a Marathi reading comprehension dataset. The multilingual MuRIL model, fine-tuned on the Marathi dataset, performed best, with an exact match (EM) score of 0.64 and an F1 score of 0.74.
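The abstract reports EM and F1 but does not say how they were computed. The sketch below shows one common way to score extractive answers, assuming SQuAD-style conventions: exact match after normalization, and token-overlap F1. The function names and the normalization steps (Unicode NFC, punctuation stripping including the Devanagari danda, whitespace tokenization) are illustrative assumptions, not the authors' evaluation code.

```python
import unicodedata
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Normalize an answer string and split it into whitespace tokens.

    Assumption: punctuation (any Unicode category starting with "P",
    which covers the Devanagari danda) is not part of the answer.
    """
    text = unicodedata.normalize("NFC", text).lower()
    text = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A corpus-level score under these assumptions is the mean of the per-question values. When a question has several reference answers, the usual convention is to take the maximum score over the references.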
