From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics
This paper addresses the challenge of aligning AI systems with human values and is aimed at researchers and practitioners in AI ethics. It appears incremental, since it builds on existing concepts such as declarative programming and decision theory.
The paper tackles the value alignment problem in AI by proposing declarative decision-theoretic ethical programs (DDTEP) to formalize codes of ethics. The aim is to make machine ethical reasoning more transparent and accountable, and the approach is illustrated with proof-of-concept examples in toy dilemmas and gatekeeping domains.
The ethics of algorithms is an emerging topic in disciplines such as social science, law, and philosophy, as well as in artificial intelligence (AI). The value alignment problem expresses the challenge of (machine) learning values that are, in some way, aligned with human requirements or values. In this paper I argue for looking at how humans have formalized and communicated values in professional codes of ethics, and for exploring declarative decision-theoretic ethical programs (DDTEP) to formalize such codes. This renders machine ethical reasoning and decision-making, as well as learning, more transparent and hopefully more accountable. The paper includes proof-of-concept examples of known toy dilemmas and gatekeeping domains such as archives and libraries.
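To make the idea of a declarative, decision-theoretic ethical program more concrete, the following is a minimal sketch and not the paper's own formalism. It assumes a hypothetical archive gatekeeping scenario in which the ethical code is written down as inspectable data (actions, one uncertain fact, and utilities), and the decision is the action with the highest expected utility. All names, probabilities, and utility values are illustrative assumptions.

```python
# Hypothetical toy "code of ethics" for an archive gatekeeper, stated as data.
# Every name and number below is an illustrative assumption, not from the paper.

ACTIONS = ["grant_access", "deny_access"]

# Uncertain fact: probability that the requested record holds sensitive personal data.
P_SENSITIVE = 0.3

# Declarative utility rules: (action, record_is_sensitive) -> utility.
UTILITY = {
    ("grant_access", True): -10.0,  # privacy harm
    ("grant_access", False): 5.0,   # access to information is served
    ("deny_access", True): 1.0,     # privacy is protected
    ("deny_access", False): -2.0,   # access is withheld without need
}


def expected_utility(action, p_sensitive=P_SENSITIVE):
    """Expected utility of an action under the uncertain 'sensitive' fact."""
    return (p_sensitive * UTILITY[(action, True)]
            + (1.0 - p_sensitive) * UTILITY[(action, False)])


def best_action(p_sensitive=P_SENSITIVE):
    """Return the action that maximizes expected utility."""
    return max(ACTIONS, key=lambda a: expected_utility(a, p_sensitive))


if __name__ == "__main__":
    for action in ACTIONS:
        print(action, expected_utility(action))
    print("chosen:", best_action())
```

Because the probabilities and utilities are explicit, a reader can audit why an action was chosen and see how the choice changes when an assumption changes. In this sketch, for instance, raising the assumed probability of sensitive content from 0.3 to 0.6 flips the decision from granting to denying access.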