Explainable Moral Values: a neuro-symbolic approach to value classification
This work addresses the need for interpretable value classification in AI. It offers a domain-specific solution that builds incrementally on existing neuro-symbolic methods.
The paper tackles the problem of classifying moral values in sentences by integrating ontology-based reasoning with machine learning. The reasoner alone achieves explainable classification comparable to more complex models, and it outperforms all the baselines when combined with distributional semantics.
This work explores the integration of ontology-based reasoning and Machine Learning techniques for explainable value classification. Building on an ontological formalization of moral values from Moral Foundations Theory, modeled with the DnS Ontology Design Pattern, we use the \textit{sandra} neuro-symbolic reasoner to infer the values (formalized as descriptions) that are \emph{satisfied by} a given sentence. Sentences, alongside their structured representation, are automatically generated using an open-source Large Language Model. The inferred descriptions are then used to automatically detect the value associated with a sentence. We show that relying only on the reasoner's inferences yields explainable classification with performance comparable to that of more complex approaches. We further show that combining the reasoner's inferences with distributional semantics methods substantially outperforms all the baselines, including complex models based on neural network architectures. Finally, we build a visualization tool to explore the potential of theory-based value classification, publicly available at http://xmv.geomeaning.com/.
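To make the idea of a sentence \emph{satisfying} a description more concrete, the following is a minimal illustrative sketch in Python. It is not the \textit{sandra} API and not the method as implemented in the paper: the role names, the `DESCRIPTIONS` table, and the functions `satisfaction`, `infer`, `classify`, `explain`, and `combine` are all hypothetical, and the overlap-based score is a simplifying assumption. The sketch treats each value as a description made of roles, scores a sentence's structured representation by the fraction of roles it fills, and shows one possible way (plain concatenation) to join those scores with a distributional sentence vector.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical toy descriptions: each moral value is modeled as a set of
# roles that a situation (the structured form of a sentence) may fill.
DESCRIPTIONS: Dict[str, Set[str]] = {
    "care": {"Helper", "Victim", "Assistance"},
    "fairness": {"Agent", "Rule", "EqualTreatment"},
    "loyalty": {"Member", "Group", "Commitment"},
}


def satisfaction(situation: Set[str], description: Set[str]) -> float:
    """Fraction of the description's roles that the situation fills."""
    return len(situation & description) / len(description)


def infer(situation: Set[str]) -> Dict[str, float]:
    """Score the situation against every description."""
    return {value: satisfaction(situation, roles)
            for value, roles in DESCRIPTIONS.items()}


def classify(situation: Set[str]) -> str:
    """Pick the value whose description is best satisfied."""
    scores = infer(situation)
    return max(scores, key=scores.get)


def explain(situation: Set[str], value: str) -> List[str]:
    """List the roles that justify the prediction."""
    return sorted(situation & DESCRIPTIONS[value])


def combine(scores: Dict[str, float], embedding: Sequence[float]) -> List[float]:
    """Concatenate reasoner scores (in sorted value order) with a
    distributional sentence vector, e.g. as input to a downstream classifier."""
    return [scores[value] for value in sorted(scores)] + list(embedding)


if __name__ == "__main__":
    # Assumed structured representation of a sentence such as
    # "The nurse comforted the injured child."
    situation = {"Helper", "Victim", "Group"}
    predicted = classify(situation)
    print(predicted)                      # care
    print(explain(situation, predicted))  # ['Helper', 'Victim']
    print(len(combine(infer(situation), [0.1, 0.2])))  # 5
```

In this toy setting the explanation is simply the list of roles shared by the sentence and the winning description, which illustrates why a reasoner-based prediction can be inspected directly, whereas the concatenated vector trades some of that transparency for the additional signal carried by the embedding.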