Morality is Non-Binary: Building a Pluralist Moral Sentence Embedding Space using Contrastive Learning
This work addresses the need for more nuanced moral representations in NLP, which matters for applications such as ethical AI. Its contribution is incremental, however, because it builds on existing contrastive learning methods.
The paper tackles a limitation of existing NLP work: treating morality as binary fails to capture nuanced moral judgments. It builds a pluralist moral sentence embedding space using contrastive learning and shows that this space can capture moral pluralism, but that learning it requires supervised human labels rather than self-supervision alone.
Recent advances in NLP show that language models retain a discernible level of knowledge of deontological ethics and moral norms. However, existing works often treat morality as binary, a matter of simply right or wrong. This simplistic view does not capture the nuances of moral judgment. Pluralist moral philosophers argue that human morality can be deconstructed into a finite number of elements, respecting individual differences in moral judgment. In line with this view, we build a pluralist moral sentence embedding space via a state-of-the-art contrastive learning approach. We systematically investigate the embedding space by studying the emergence of relationships among moral elements, both quantitatively and qualitatively. Our results show that a pluralist approach to morality can be captured in an embedding space. However, moral pluralism is challenging to deduce via self-supervision alone and requires a supervised approach with human labels.
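To make the supervised setting concrete, the following is a minimal sketch of a generic label-supervised contrastive loss over a batch of sentence embeddings. It is not taken from the paper and may differ from the authors' actual objective. The function name `supervised_contrastive_loss`, the `temperature` value, and the integer moral-element labels are all assumptions made for illustration.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's code).

    embeddings: (n, d) array of sentence embeddings, n >= 2.
    labels: (n,) array of hypothetical moral-element ids from human annotation.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Positives: other sentences in the batch that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax over all other sentences; the anchor itself is excluded.
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Mean negative log-probability of the positives, per anchor.
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0  # anchors without any positive are skipped
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos_counts[has_pos]
    return float(per_anchor.mean())


# Toy batch: two tight clusters in a 2-d embedding space.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch, labels that agree with the geometric clusters should yield a lower loss than labels that cut across them. In a supervised setup of this kind, it is the human labels that define which sentences count as positives, whereas a self-supervised variant would have to derive positives without them.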