Debiasing Large Language Models in Thai Political Stance Detection via Counterfactual Calibration
This work addresses fairness and reliability issues in political stance detection for low-resource, culturally complex settings and underrepresented languages such as Thai. It represents a domain-specific, incremental improvement.
The paper tackles political stance detection in Thai, where LLMs exhibit systematic biases such as sentiment leakage and entity favoritism. It introduces ThaiFACTUAL, a lightweight calibration framework that reduces spurious correlations and improves fairness across multiple LLMs without fine-tuning.
Political stance detection in low-resource and culturally complex settings poses a critical challenge for large language models (LLMs). In the Thai political landscape, which is marked by indirect language, polarized figures, and entangled sentiment and stance, LLMs often display systematic biases such as sentiment leakage (predicting stance from the sentiment of the text) and favoritism toward particular entities. These biases undermine fairness and reliability. We present ThaiFACTUAL, a lightweight, model-agnostic calibration framework that mitigates political bias without requiring fine-tuning. ThaiFACTUAL uses counterfactual data augmentation and rationale-based supervision to disentangle sentiment from stance and reduce bias. We also release the first high-quality Thai political stance dataset, annotated with stance, sentiment, rationales, and bias markers across diverse entities and events. Experimental results show that ThaiFACTUAL significantly reduces spurious correlations, enhances zero-shot generalization, and improves fairness across multiple LLMs. This work highlights the importance of culturally grounded debiasing techniques for underrepresented languages.
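To make the idea of counterfactual augmentation concrete, the following is a minimal sketch of one possible probe for entity favoritism: swap the named entity in an input and check whether the predicted stance changes. It is not the ThaiFACTUAL implementation. The function names (`entity_counterfactuals`, `flip_rate`), the toy predictors, and the English placeholder entities ("Party A", "Party B", "Party C") are all assumptions introduced here for illustration. In practice `predict` would wrap an LLM call on Thai text.

```python
from typing import Callable, List


def entity_counterfactuals(text: str, entity: str, alternatives: List[str]) -> List[str]:
    """Return copies of `text` with `entity` replaced by each other alternative."""
    return [text.replace(entity, alt) for alt in alternatives if alt != entity]


def flip_rate(
    predict: Callable[[str], str],
    text: str,
    entity: str,
    alternatives: List[str],
) -> float:
    """Fraction of entity-swapped counterfactuals whose predicted stance
    differs from the prediction on the original text.

    A high value suggests the predictor keys on the entity name rather
    than on the stance expressed in the text.
    """
    original = predict(text)
    variants = entity_counterfactuals(text, entity, alternatives)
    if not variants:
        return 0.0
    flips = sum(predict(v) != original for v in variants)
    return flips / len(variants)


# Hypothetical toy predictors standing in for an LLM.
def biased_predict(text: str) -> str:
    # Favors one entity regardless of what the text says about it.
    return "favor" if "Party A" in text else "against"


def content_predict(text: str) -> str:
    # Uses the stance-bearing verb and ignores the entity name.
    return "favor" if "supports" in text else "against"


example = "The speaker supports the policy of Party A"
parties = ["Party A", "Party B", "Party C"]

biased_rate = flip_rate(biased_predict, example, "Party A", parties)
content_rate = flip_rate(content_predict, example, "Party A", parties)
```

Under these assumptions, the entity-keyed predictor changes its label on every swapped variant, while the content-based predictor stays stable. A sentiment-leakage probe could follow the same pattern, rewriting the sentiment of the text while holding the stance fixed.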