Auditing the Ethical Logic of Generative AI Models
This work addresses the need for robust ethical evaluation of AI systems in high-stakes domains and offers a scalable methodology for benchmarking them, though its contribution is incremental, building on existing traditions.
The paper tackles the problem of evaluating the ethical reasoning of generative AI models by introducing a five-dimensional audit model. It finds that while models converge on decisions, they vary in explanatory rigor and moral prioritization, and that Chain-of-Thought prompting and reasoning-optimized models significantly enhance performance.
As generative AI models become increasingly integrated into high-stakes domains, robust methods for evaluating their ethical reasoning become more important. This paper introduces a five-dimensional audit model (assessing Analytic Quality, Breadth of Ethical Considerations, Depth of Explanation, Consistency, and Decisiveness) to evaluate the ethical logic of leading large language models (LLMs). Drawing on traditions from applied ethics and higher-order thinking, we present a multi-battery prompt approach, including novel ethical dilemmas, to probe the models' reasoning across diverse contexts. We benchmark seven major LLMs and find that, while models generally converge on ethical decisions, they vary in explanatory rigor and moral prioritization. Chain-of-Thought prompting and reasoning-optimized models significantly enhance performance on our audit metrics. This study introduces a scalable methodology for ethical benchmarking of AI systems and highlights the potential for AI to complement human moral reasoning in complex decision-making contexts.
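
To make the five audit dimensions concrete, the following is a minimal sketch of how per-prompt ratings might be recorded and averaged per model. It is not the paper's implementation: the 1-5 rating scale, the model and prompt identifiers, the example values, and the names AuditScore and summarize are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from statistics import mean

# The five audit dimensions named in the abstract.
DIMENSIONS = (
    "analytic_quality",
    "breadth_of_ethical_considerations",
    "depth_of_explanation",
    "consistency",
    "decisiveness",
)


@dataclass(frozen=True)
class AuditScore:
    """One rated response. The numeric scale is a hypothetical 1-5 rating."""
    model: str      # hypothetical model identifier
    prompt_id: str  # hypothetical prompt identifier within a battery
    scores: dict    # maps each dimension name to a numeric rating

    def __post_init__(self):
        # Reject records that do not rate every dimension.
        missing = set(DIMENSIONS) - set(self.scores)
        if missing:
            raise ValueError(f"missing dimensions: {sorted(missing)}")


def summarize(records):
    """Return {model: {dimension: mean rating across that model's prompts}}."""
    by_model = {}
    for record in records:
        by_model.setdefault(record.model, []).append(record)
    return {
        model: {d: mean(r.scores[d] for r in rs) for d in DIMENSIONS}
        for model, rs in by_model.items()
    }


# Illustrative placeholder ratings, not results from the paper.
records = [
    AuditScore("model_a", "dilemma_1", dict.fromkeys(DIMENSIONS, 4)),
    AuditScore("model_a", "dilemma_2", dict.fromkeys(DIMENSIONS, 2)),
    AuditScore("model_b", "dilemma_1", dict.fromkeys(DIMENSIONS, 5)),
]
summary = summarize(records)
```

Keeping one record per model and prompt, rather than a single aggregate, would let the same data support comparisons across prompt batteries or prompting conditions such as Chain-of-Thought, if those were added as fields.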