Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication
PAIR addresses the misalignment between NLP models and the perspectives of a target population, a problem relevant to both researchers and practitioners. It offers an incremental improvement: it adjusts existing training data rather than collecting new annotations.
The paper addresses the problem that NLP models trained on crowdsourced annotations may not reflect broader population views when annotator pools are unrepresentative. It proposes PAIR (Population-Aligned Instance Replication), a post-processing method that adjusts training data without requiring new annotations. In simulation studies, non-representative pools degrade model calibration while leaving accuracy largely unchanged, and PAIR corrects these calibration problems by replicating annotations from underrepresented groups to match population proportions.
Models trained on crowdsourced annotations may not reflect population views if those who work as annotators do not represent the broader population. In this paper, we propose PAIR: Population-Aligned Instance Replication, a post-processing method that adjusts training data to better reflect target population characteristics without collecting additional annotations. Using simulation studies on offensive language and hate speech detection with varying annotator compositions, we show that non-representative pools degrade model calibration while leaving accuracy largely unchanged. PAIR corrects these calibration problems by replicating annotations from underrepresented annotator groups to match population proportions. We conclude with recommendations for improving the representativity of training data and model performance.
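To make the replication idea concrete, the following is a minimal sketch of how annotations from underrepresented groups might be replicated to match population proportions. It is an illustration under stated assumptions, not the implementation used in the paper: the function name `pair_replicate`, the `group` field, the dictionary of target proportions, and the deterministic cycling through rows are all hypothetical choices made here. The abstract does not specify how replicas are selected or how fractional counts are handled.

```python
from collections import Counter


def pair_replicate(annotations, target_props, group_key="group"):
    """Replicate annotation rows so group shares match target_props.

    Assumption (not from the paper): the most overrepresented group keeps
    its original rows, and every other group is replicated by cycling
    through its rows until its share matches the target proportion.
    """
    counts = Counter(row[group_key] for row in annotations)
    missing = set(counts) - set(target_props)
    if missing:
        raise ValueError(f"no target proportion for groups: {sorted(missing)}")

    # Smallest total size at which no group needs fewer rows than it has.
    scale = max(counts[g] / p for g, p in target_props.items() if p > 0)

    replicated = []
    for group, prop in target_props.items():
        rows = [r for r in annotations if r[group_key] == group]
        needed = round(scale * prop)
        if needed and not rows:
            raise ValueError(f"group {group!r} has no annotations to replicate")
        # Cycle through the existing rows; no new annotations are created.
        replicated.extend(rows[i % len(rows)] for i in range(needed))
    return replicated


# Hypothetical pool: 8 annotations from group A, 2 from group B.
pool = [{"group": "A", "label": 1}] * 8 + [{"group": "B", "label": 0}] * 2
balanced = pair_replicate(pool, {"A": 0.5, "B": 0.5})
print(Counter(row["group"] for row in balanced))  # Counter({'A': 8, 'B': 8})
```

In this toy case each of the two group B annotations appears four times in the adjusted data, so the training set reflects the assumed 50/50 population split while containing only labels that were actually collected.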