Automated Kantian Ethics: A Faithful Implementation
This addresses the need for philosophically grounded ethical AI in high-stakes domains like healthcare and policing, though it is an early step and incremental in building on prior automated ethics work.
The paper tackles the problem of AI lacking ethical reasoning tools by implementing an automated Kantian ethics system that formalizes Kant's categorical imperative in Dyadic Deontic Logic using the Isabelle theorem prover, resulting in a system capable of making nuanced judgments in complex dilemmas with explainable outputs.
As we grant artificial intelligence increasing power and independence in contexts like healthcare, policing, and driving, AI faces moral dilemmas but lacks the tools to solve them. Warnings from regulators, philosophers, and computer scientists about the dangers of unethical artificial intelligence have spurred interest in automated ethics-i.e., the development of machines that can perform ethical reasoning. However, prior work in automated ethics rarely engages with philosophical literature. Philosophers have spent centuries debating moral dilemmas so automated ethics will be most nuanced, consistent, and reliable when it draws on philosophical literature. In this paper, I present an implementation of automated Kantian ethics that is faithful to the Kantian philosophical tradition. I formalize Kant's categorical imperative in Dyadic Deontic Logic, implement this formalization in the Isabelle theorem prover, and develop a testing framework to evaluate how well my implementation coheres with expected properties of Kantian ethic. My system is an early step towards philosophically mature ethical AI agents and it can make nuanced judgements in complex ethical dilemmas because it is grounded in philosophical literature. Because I use an interactive theorem prover, my system's judgements are explainable.