CL, CV (Nov 19, 2022)

Machine Learning Approaches for Principle Prediction in Naturally Occurring Stories

arXiv:2212.06048v1
h-index: 15
Originality: Synthesis-oriented
AI Analysis

This work addresses value alignment for autonomous systems by moving beyond binary classifications, though the contribution is incremental, since it extends existing datasets and methods.

The paper tackles the problem of predicting normative principles from naturally occurring stories to improve value alignment in autonomous systems. It shows that while individual principles can be classified, the ambiguity of moral principles poses challenges for both humans and systems.

Value alignment is the task of creating autonomous systems whose values align with those of humans. Past work has shown that stories are a potentially rich source of information on human values; however, past work has been limited to considering values in a binary sense. In this work, we explore the use of machine learning models for the task of normative principle prediction on naturally occurring story data. To do this, we extend a dataset that has been previously used to train a binary normative classifier with annotations of moral principles. We then use this dataset to train a variety of machine learning models, evaluate these models and compare their results against humans who were asked to perform the same task. We show that while individual principles can be classified, the ambiguity of what "moral principles" represent poses a challenge for both human participants and autonomous systems which are faced with the same task.
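The workflow described in the abstract (label story text with a principle, train a classifier, then score both the model and human annotators against the same gold labels) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the toy sentences, the principle names "sharing" and "honesty", the hypothetical human labels, and the choice of a bag-of-words Naive Bayes classifier, which stands in for whatever models the paper actually trained.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokenization; a deliberately simple assumption.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(predicted, gold):
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy data; not taken from the paper's dataset.
train_texts = [
    "he shares his toys with his sister",
    "she shares her lunch with a friend",
    "he tells the truth about the broken vase",
    "she tells her mother the truth",
]
train_labels = ["sharing", "sharing", "honesty", "honesty"]

test_texts = ["he shares his snack", "she tells the truth"]
gold_labels = ["sharing", "honesty"]
human_labels = ["sharing", "sharing"]  # hypothetical human annotator

model = NaiveBayes().fit(train_texts, train_labels)
model_labels = [model.predict(t) for t in test_texts]

print("model accuracy:", accuracy(model_labels, gold_labels))
print("human accuracy:", accuracy(human_labels, gold_labels))
```

Scoring the model and the human annotator with the same function against the same gold labels mirrors the comparison the abstract describes. The numbers here come from invented data and say nothing about the paper's results.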

