The Goofus & Gallant Story Corpus for Practical Value Alignment
This work addresses the risk of AI systems causing harm by violating social values, though its contribution is incremental: it focuses on dataset creation rather than novel alignment methods.
The authors tackle the problem of aligning AI systems with human social norms by introducing a multi-modal dataset of normative and non-normative behaviors, built from curated images and natural language descriptions drawn from children's educational materials.
Values or principles are key elements of human society: they lead people to behave according to an accepted set of social rules that maintains social order. As AI systems become ubiquitous in human society, a major concern is that they could violate these norms or values and cause harm. To prevent intentional or unintentional harm, AI systems are therefore expected to take actions that align with these principles. Training systems to behave this way is difficult and often requires a specialized dataset. This work presents a multi-modal dataset that illustrates normative and non-normative behavior in real-life situations, described through natural language and artistic images. The dataset contains curated sets of images that were designed to teach young children about social principles. We argue that, because the material was created for this purpose, it is an ideal dataset for training socially normative agents.