CL | AI | Aug 31, 2024

Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models

arXiv:2409.00358v2 | 12 citations | h-index: 7
AI Analysis

This work extends dialect adapters from encoder models to decoder models, an incremental advance over existing encoder-based methods.

The paper adapts decoder models to dialects for target word prediction in game-playing conversations. Its LoRDD method outperforms four baselines and narrows the performance gap with American English to 25% and 4.5% for accuracy (12% and 5.8% for word similarity).

Dialect adapters that improve the performance of LLMs for NLU tasks on certain sociolects/dialects/national varieties ('dialects' for the sake of brevity) have been reported for encoder models. In this paper, we extend the idea of dialect adapters to decoder models in our architecture called LoRDD. Using MD-3, a publicly available dataset of word game-playing conversations between dialectal speakers, our task is Target Word Prediction (TWP) from a masked conversation. LoRDD combines task adapters and dialect adapters where the latter employ contrastive learning on pseudo-parallel conversations from MD-3. Our experiments on Indian English and Nigerian English conversations with two models (Mistral and Gemma) demonstrate that LoRDD outperforms four baselines on TWP. Additionally, it significantly reduces the performance gap with American English, narrowing it to 12% and 5.8% for word similarity, and 25% and 4.5% for accuracy, respectively. The focused contribution of LoRDD is in its promise for dialect adaptation of decoder models using TWP, a simplified version of the commonly used next-word prediction task.
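The contrastive objective behind the dialect adapter can be illustrated with a minimal sketch. The abstract does not give the loss, so everything here is an assumption made for illustration: a cosine-based pairwise loss, a hypothetical `margin` parameter, and the reading that a pseudo-parallel pair (a positive) is two conversations from different dialects that share a target word. It is not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(emb_a, emb_b, is_positive, margin=0.5):
    """Pairwise cosine contrastive loss (assumed form, not taken from the paper).

    Positive pair: loss = 1 - cos(a, b), which pulls the embeddings together.
    Negative pair: loss = max(0, cos(a, b) - margin), which pushes them apart
    until their similarity drops below the margin.
    """
    sim = cosine_similarity(emb_a, emb_b)
    if is_positive:
        return 1.0 - sim
    return max(0.0, sim - margin)


# Toy conversation embeddings (hypothetical values, 2-D for readability).
dialect_conv = [1.0, 0.0]
american_conv = [1.0, 0.0]
unrelated_conv = [0.0, 1.0]

loss_aligned = contrastive_loss(dialect_conv, american_conv, is_positive=True)
loss_separated = contrastive_loss(dialect_conv, unrelated_conv, is_positive=False)
loss_collapsed = contrastive_loss(dialect_conv, american_conv, is_positive=False)
```

In this toy setting, an aligned positive pair and a well-separated negative pair both incur zero loss, while a negative pair with identical embeddings is penalized by `1 - margin`. In a real adapter, the embeddings would come from the decoder model and the loss would be minimized by updating only the low-rank adapter weights.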

