CL, LG | Jun 15, 2021

Challenges and Considerations with Code-Mixed NLP for Multilingual Societies

arXiv:2106.07823v1 | 7 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses NLP problems for multilingual societies but is incremental: it reviews existing issues without presenting new experimental results.

The paper examines the challenges and limitations in NLP for code-mixed languages, focusing on applications like crisis management and healthcare, and proposes future datasets and models to advance research in this area.

Multilingualism refers to a high degree of proficiency in two or more languages in both written and oral communication. It often results in language mixing, a.k.a. code-mixing, in which a multilingual speaker switches between languages within a single utterance of text or speech. This paper discusses the current state of NLP research, its limitations, and foreseeable pitfalls in addressing five real-world applications for social good in multilingual societies: crisis management, healthcare, political campaigning, fake news, and hate speech. We also propose futuristic datasets, models, and tools that can significantly advance current research in multilingual NLP applications for societal good. As a representative example, we consider English-Hindi code-mixing but draw similar inferences for other language pairs.

