CL · Sep 30, 2024

Disentangling Singlish Discourse Particles with Task-Driven Representation

arXiv:2409.20366v4 · 8 citations · h-index: 3
AI Analysis

This work gives NLP researchers a computational method for understanding Singlish discourse particles, a foundational step toward deeper comprehension and processing of Singlish.

This paper addresses the challenge of understanding Singlish discourse particles (lah, meh, hor) by applying task-driven representation learning. The authors disentangle and cluster these particles to differentiate their pragmatic functions, and then use this understanding to improve Singlish-to-English machine translation.

Singlish, or formally Colloquial Singapore English, is an English-based creole language originating from the Southeast Asian country of Singapore. The language contains influences from Sinitic languages (such as Chinese dialects) as well as Malay, Tamil and others. A fundamental step toward understanding Singlish is to first understand the pragmatic functions of its discourse particles, upon which Singlish relies heavily to convey meaning. This work offers a preliminary effort to disentangle the Singlish discourse particles (lah, meh and hor) with task-driven representation learning. After disentanglement, we cluster these discourse particles to differentiate their pragmatic functions, and perform Singlish-to-English machine translation. Our work provides a computational method for understanding Singlish discourse particles, and opens avenues towards a deeper comprehension of the language and its usage.
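
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name a clustering algorithm, so plain k-means is swapped in, and the 2-D vectors are invented toy stand-ins for learned representations of individual occurrences of "lah" (the function name `kmeans` and the variable `occurrences` are hypothetical, not from the paper).

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic initialisation (evenly spaced points)."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Invented toy vectors: six occurrences of "lah", assumed to fall into
# two usage groups. Real representations would come from a trained model.
occurrences = [
    (0.9, 0.1), (1.0, 0.2), (0.8, 0.0),
    (0.1, 0.9), (0.0, 1.0), (0.2, 0.8),
]

labels, centroids = kmeans(occurrences, k=2)
print(labels)
```

In this toy setting each resulting cluster would be read as one candidate pragmatic function of the particle; how the paper actually assigns functions to clusters is not stated in the abstract.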
