CL Jan 15, 2024

Milestones in Bengali Sentiment Analysis leveraging Transformer-models: Fundamentals, Challenges and Future Directions

Saptarshi Sengupta, Shreya Ghosh, Prasenjit Mitra, Tarikul Islam Tamiti

arXiv:2401.07847v11.94 citations h-index: 5

Originality: Synthesis-oriented

AI Analysis

The paper addresses the lack of sentiment analysis technology for Bengali, a resource-poor language spoken by about 300 million people, but its contribution is incremental because it primarily surveys existing work.

The paper reviews the state-of-the-art in Bengali sentiment analysis using Transformer models, highlighting challenges like limited datasets and language-specific nuances, and suggests future directions to address these gaps.

Sentiment Analysis (SA) is the task of assigning a sentiment polarity (usually positive, negative, or neutral, or a finer-grained label such as slightly angry or sad) to a given text. Because the labels are known a priori, this reduces to a supervised classification task. SA has been studied heavily in resource-rich languages such as English, where the arrival of the Transformer architecture has pushed the SOTA forward by leaps and bounds, but the same cannot be said for resource-poor languages such as Bengali (BN). For a language spoken by roughly 300 million people, the technology that would let them run experiments in their preferred language is severely lacking. In this paper, we analyze the SOTA for SA in Bengali, particularly Transformer-based models. We discuss the available datasets, their drawbacks, and the nuances of Bengali, i.e., what makes it a challenging language for SA, and finally provide insights into future directions for mitigating the limitations of the field.
