ValueSim: Generating Backstories to Model Individual Value Systems
This work addresses a gap in LLM alignment: the simulation of individualized human value systems. The contribution is incremental, since it builds on existing techniques such as prompt learning and reinforcement learning.
The paper tackles the problem of aligning Large Language Models with individualized human value systems. It introduces ValueSim, a framework that generates personal backstories to simulate individual values, and reports an improvement of over 10% in top-1 accuracy over retrieval-augmented generation methods on a benchmark derived from the World Values Survey.
As Large Language Models (LLMs) exhibit increasingly human-like capabilities, aligning them with human values has become critically important. Advanced techniques such as prompt learning and reinforcement learning are being deployed to better align LLMs with human values. However, while these approaches address broad ethical considerations and helpfulness, they rarely focus on simulating individualized human value systems. To address this gap, we present ValueSim, a framework that simulates individual values by generating personal backstories that reflect past experiences and demographic information. ValueSim converts structured individual data into narrative backstories and employs a multi-module architecture inspired by the Cognitive-Affective Personality System to simulate individual values based on these narratives. Testing ValueSim on a self-constructed benchmark derived from the World Values Survey shows an improvement in top-1 accuracy of over 10% compared to retrieval-augmented generation methods. Further analysis reveals that performance improves as additional user interaction history becomes available, indicating that the model can refine its persona simulation over time.
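To make the data flow concrete, the following is a minimal sketch of two steps mentioned above: turning a structured record into a narrative backstory, and scoring predictions with top-1 accuracy. It is an illustration under stated assumptions, not the paper's implementation. ValueSim's narratives are generated, whereas this sketch substitutes a fixed text template, and the field names (`age`, `country`, `occupation`), the function names, and the sample record are all hypothetical.

```python
from typing import Mapping, Sequence


def build_backstory(
    profile: Mapping[str, object],
    history: Sequence[tuple[str, str]],
) -> str:
    """Render a structured record as a short first-person narrative.

    Assumption: a fixed template stands in for a generated narrative.
    The profile keys used here are hypothetical, not the paper's schema.
    """
    intro = (
        f"I am {profile['age']} years old, I live in {profile['country']}, "
        f"and I work as a {profile['occupation']}."
    )
    # Each past (question, answer) pair becomes one sentence of experience.
    past = [f'When asked "{q}", I answered "{a}".' for q, a in history]
    return " ".join([intro, *past])


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of questions whose single predicted answer matches the true one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical record, for illustration only.
profile = {"age": 34, "country": "Chile", "occupation": "teacher"}
history = [("How important is family in your life?", "Very important")]
print(build_backstory(profile, history))
```

In this sketch, adding more entries to `history` lengthens the backstory, which mirrors the abstract's observation that more interaction history gives the simulation more to work with.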