CL, Jun 2

SEA-NLI: Natural Language Inference as a Lens into Southeast Asian Cultural Understanding

arXiv:2606.03284
AI Analysis

For NLP researchers and practitioners, this benchmark exposes the cultural blind spots of LLMs in underrepresented Southeast Asian contexts, highlighting the need for culturally adapted models.

SEA-NLI is a culturally grounded NLI benchmark for Southeast Asia, revealing that frontier LLMs perform poorly on knowledge-intensive categories due to missing cultural knowledge; culture-aware prompting improves performance, while CoT offers limited gains.

Frontier LLMs perform well in Western contexts, but remain poorly tested on underrepresented cultures such as those in Southeast Asia (SEA). Existing NLI benchmarks are largely Western-centric, translation-derived, or monolingual, limiting their ability to measure culturally grounded reasoning. We introduce SEA-NLI, a native, culturally grounded NLI benchmark covering eight SEA countries in English and native regional languages, verified by native speakers. Across 17 encoder and decoder models, we observe low performance from all models, especially on knowledge-intensive categories such as Languages and Science and Technology. Our analysis shows that failure cases mainly stem from missing SEA cultural knowledge: SEA-adapted models and culture-aware prompting improve performance, while CoT prompting offers limited gains.

