CLAIMar 3, 2021

An Attention Based Neural Network for Code Switching Detection: English & Roman Urdu

arXiv:2103.02252v1
Originality Incremental advance
AI Analysis

This addresses language identification in code-switched data for multilingual communication, but it is incremental as it applies an existing attention mechanism to a specific low-resource language pair.

The paper tackled code-switching detection between English and low-resource Roman Urdu by proposing a recurrent neural network with an attention model, which improved precision and accuracy compared to state-of-the-art models like Hidden Markov Models and Bidirectional LSTM.

Code-switching is a common phenomenon among people with diverse lingual background and is widely used on the internet for communication purposes. In this paper, we present a Recurrent Neural Network combined with the Attention Model for Language Identification in Code-Switched Data in English and low resource Roman Urdu. The attention model enables the architecture to learn the important features of the languages hence classifying the code switched data. We demonstrated our approach by comparing the results with state of the art models i.e. Hidden Markov Models, Conditional Random Field and Bidirectional LSTM. The models evaluation, using confusion matrix metrics, showed that the attention mechanism provides improved the precision and accuracy as compared to the other models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes