AI · Jul 17, 2025

Black Box Deployed -- Functional Criteria for Artificial Moral Agents in the LLM Era

arXiv:2507.13175v2 · 1 citation
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of ensuring ethical behavior in AI systems deployed in society, but its contribution is incremental because it builds on existing philosophical frameworks.

The paper tackles the problem of evaluating artificial moral agents (AMAs) in the era of opaque large language models (LLMs) by proposing a revised set of ten functional criteria, such as moral concordance and trustworthiness, to guide their alignment and societal integration.

The advancement of powerful yet opaque large language models (LLMs) necessitates a fundamental revision of the philosophical criteria used to evaluate artificial moral agents (AMAs). Pre-LLM frameworks often relied on the assumption of transparent architectures, which LLMs defy due to their stochastic outputs and opaque internal states. This paper argues that traditional ethical criteria are pragmatically obsolete for LLMs due to this mismatch. Engaging with core themes in the philosophy of technology, this paper proffers a revised set of ten functional criteria to evaluate LLM-based artificial moral agents: moral concordance, context sensitivity, normative integrity, metaethical awareness, system resilience, trustworthiness, corrigibility, partial transparency, functional autonomy, and moral imagination. These guideposts, applied to what we term "SMA-LLS" (Simulating Moral Agency through Large Language Systems), aim to steer AMAs toward greater alignment and beneficial societal integration in the coming years. We illustrate these criteria using hypothetical scenarios involving an autonomous public bus (APB) to demonstrate their practical applicability in morally salient contexts.
