SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment
This work addresses the challenge of pluralistic alignment for diverse demographic groups, offering an incremental improvement over similarity-based retrieval methods.
The paper tackles the problem of aligning AI models to different groups' values by proposing SPICA, a framework that accounts for group-level differences when retrieving in-context examples. Compared with similarity-based retrieval, groups saw up to a +0.16 point improvement on a 5-point scale, and the benefits were more uniform across groups.
When different groups' values differ, one approach to model alignment is to steer models at inference time towards each group's preferences. However, techniques like in-context learning consider only similarity when drawing few-shot examples, not cross-group differences in values. We propose SPICA, a framework that accounts for group-level differences during in-context example retrieval. SPICA introduces three designs: scenario banks, group-informed retrieval metrics, and in-context alignment prompts. In an evaluation of SPICA on an alignment task collecting inputs from four demographic groups ($n = 544$), our metrics retrieve in-context examples that more closely match observed preferences, with the best prompt configuration using multiple contrastive responses to demonstrate examples. In an end-to-end evaluation ($n = 120$), we observe that SPICA is rated higher than similarity-based retrieval, with groups seeing up to a +0.16 point improvement on a 5-point scale. Additionally, gains from SPICA were more uniform, with all groups benefiting from alignment rather than only some. Finally, we find that while a group-agnostic approach can align to aggregated values, it is not the best suited to divergent groups.
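To make the contrast between similarity-only and group-informed retrieval concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not define the retrieval metrics, so the `Scenario` structure, the `contrast` measure (the spread of one group's ratings across candidate responses), and the `alpha` mixing weight are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Scenario:
    """One entry in a hypothetical scenario bank."""
    name: str
    embedding: List[float]
    # Assumed layout: per group, the mean 1-5 rating of each candidate response.
    group_ratings: Dict[str, List[float]]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def contrast(ratings: List[float]) -> float:
    """Illustrative group signal: gap between a group's best- and worst-rated
    responses, scaled to [0, 1] on a 1-5 scale."""
    return (max(ratings) - min(ratings)) / 4.0


def retrieve(query: List[float], bank: List[Scenario], group: str,
             k: int = 1, alpha: float = 0.5) -> List[Scenario]:
    """Rank scenarios by a mix of similarity and the group signal.
    alpha=0 reduces to plain similarity-based retrieval."""
    def score(s: Scenario) -> float:
        sim = cosine(query, s.embedding)
        signal = contrast(s.group_ratings[group]) if group in s.group_ratings else 0.0
        return (1 - alpha) * sim + alpha * signal
    return sorted(bank, key=score, reverse=True)[:k]


# Toy bank: A is closest to the query, but only group g2 has a strong
# preference between A's responses; g1 has a strong preference on B.
bank = [
    Scenario("A", [1.0, 0.0], {"g1": [3.0, 3.0], "g2": [5.0, 1.0]}),
    Scenario("B", [0.8, 0.6], {"g1": [5.0, 1.0], "g2": [3.0, 3.0]}),
]
query = [1.0, 0.0]

baseline = [s.name for s in retrieve(query, bank, "g1", alpha=0.0)]
for_g1 = [s.name for s in retrieve(query, bank, "g1")]
for_g2 = [s.name for s in retrieve(query, bank, "g2")]
print(baseline, for_g1, for_g2)
```

In this toy setup, similarity alone returns scenario A for every group, while the mixed score returns B for `g1` and A for `g2`, so each group is shown the example on which its own preferences are most clearly expressed. How SPICA actually weighs group information is specified in the paper itself, not here.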