CL, AI, LG | Jul 12, 2021

End-to-End Natural Language Understanding Pipeline for Bangla Conversational Agents

arXiv:2107.05541v6 | 1 citation | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of building chatbots for Bangla speakers, but it is incremental as it adapts existing tools like Rasa and fastText to a new language context.

The paper tackles the lack of support for low-resource languages like Bangla and Bangla transliteration in conversational agents by developing an end-to-end natural language understanding pipeline, achieving an F1-score of 80% for intent classification and entity extraction.

Chatbots are intelligent software built to stand in for human interaction. Existing studies typically provide little support for low-resource languages like Bangla. With the growing popularity of social media, native Bangla speakers also increasingly interact in Bangla transliteration (mostly in English). In this paper, we propose a novel approach to building a Bangla chatbot, intended for use as a business assistant, that can communicate in low-resource languages like Bangla and Bangla transliterated in English with consistently high confidence. Since annotated data was not available for this purpose, we had to work on the whole machine learning life cycle (data preparation, machine learning modeling, and model deployment), using the Rasa Open Source framework, fastText embeddings, Polyglot embeddings, Flask, and other systems as building blocks. Working with the skewed annotated dataset, we try out different components and pipelines to evaluate which works best and offer possible reasoning behind the observed results. Finally, we present a pipeline for intent classification and entity extraction that achieves reasonable performance (accuracy: 83.02%, precision: 80.82%, recall: 83.02%, F1-score: 80%).
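A note on reading the reported metrics: recall (83.02%) is identical to accuracy (83.02%). That is what support-weighted averaging over classes produces, since weighted recall always equals accuracy. The abstract does not state the averaging method, so treating the figures as weighted averages is an assumption. The sketch below uses hypothetical, deliberately skewed intent labels (not the paper's data) and a hypothetical helper, `weighted_scores`, to show how such numbers would be computed.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0  # 0 when class is never predicted
        r_c = tp / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total  # class share of the dataset
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical skewed intent labels, for illustration only.
y_true = ["greet"] * 6 + ["order_status"] * 3 + ["refund"] * 1
y_pred = ["greet"] * 6 + ["order_status", "order_status", "greet"] + ["greet"]

scores = weighted_scores(y_true, y_pred)
print(scores)
```

On a skewed dataset the rare class (`refund` here) contributes little to the weighted figures even when it is never predicted correctly, so weighted scores can look reasonable while minority intents perform poorly. Weighted F1 is averaged per class, so it need not equal the harmonic mean of the weighted precision and recall.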

