CL | Jul 1, 2025

Improving the Distributional Alignment of LLMs using Supervision

Gauri Kambhatla, Sanjana Gautam, Angela Zhang, Alex Liu, Ravi Srinivasan, Junyi Jessy Li, Matthew Lease

arXiv:2507.00439v2 | 1 citation | h-index: 4

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of making LLMs better reflect diverse human perspectives, though it appears incremental because it builds on existing alignment methods.

The paper tackles the problem of aligning language models with diverse human population groups on subjective questions, showing that simple supervision consistently improves alignment across three datasets.

The ability to accurately align LLMs with human population groups on subjective questions would have great value. In this work, we show that use of simple supervision can greatly improve language model alignment with diverse population groups more consistently, as measured over three datasets spanning various topics. Beyond evaluating average alignment, we also report how alignment varies across specific groups. Our broad findings provide insights into the distributional alignment of LLMs with diverse population groups. By conducting evaluation over many LLMs and prompting strategies, along with open-sourcing our work, we provide a benchmark to stimulate future research.
