MAAI | Sep 30, 2024

From homeostasis to resource sharing: Biologically and economically aligned multi-objective multi-agent gridworld-based AI safety benchmarks

arXiv:2410.00081v5 | 2 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This work provides new empirical testing grounds for AI safety researchers, addressing a gap in existing benchmarks by incorporating biologically and economically motivated themes.

This paper introduces eight multi-objective, multi-agent gridworld benchmarks for AI safety, focusing on themes like homeostasis, diminishing returns, sustainability, and resource sharing. These environments are designed to illustrate pitfalls such as unbounded maximization, over-optimization, and resource depletion.

Developing safe, aligned agentic AI systems requires comprehensive empirical testing, yet many existing benchmarks neglect crucial themes aligned with biology and economics, both time-tested fundamental sciences describing our needs and preferences. To address this gap, the present work introduces biologically and economically motivated themes that have been neglected in current mainstream discussions on AI safety, namely a set of multi-objective, multi-agent alignment benchmarks that emphasize homeostasis for bounded and biological objectives, diminishing returns for unbounded, instrumental, and business objectives, the sustainability principle, and resource sharing. Eight main benchmark environments have been implemented on these themes to illustrate key pitfalls and challenges in agentic AI systems, such as unboundedly maximizing a homeostatic objective, over-optimizing one objective at the expense of others, neglecting safety constraints, or depleting shared resources.
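The contrast between a homeostatic (bounded) objective and a diminishing-returns (unbounded) objective can be made concrete with a small sketch. The reward shapes below are illustrative assumptions, not the paper's actual reward functions: a quadratic penalty around a hypothetical setpoint stands in for homeostasis, and a logarithmic utility stands in for diminishing returns. The function names and the setpoint value are likewise invented for this example.

```python
import math


def homeostatic_reward(level: float, setpoint: float = 10.0) -> float:
    """Bounded objective (assumed quadratic form): best at the setpoint,
    penalized equally for too little and too much."""
    return -((level - setpoint) ** 2)


def diminishing_returns_reward(amount: float) -> float:
    """Unbounded objective (assumed log form): more is always better,
    but each additional unit adds less than the previous one."""
    return math.log1p(amount)


def combined_score(food_level: float, gold: float) -> float:
    """Simple sum of the two objectives; a hypothetical aggregation
    chosen only to show how the pitfalls appear."""
    return homeostatic_reward(food_level) + diminishing_returns_reward(gold)


# An agent that keeps "maximizing" food past the setpoint scores worse
# than one that stops at the setpoint.
balanced = combined_score(food_level=10.0, gold=5.0)
overshooting = combined_score(food_level=20.0, gold=5.0)
```

Under these assumptions, `overshooting` is lower than `balanced`, which mirrors the pitfall of unboundedly maximizing a homeostatic objective. The concave gold term gives progressively less incentive to pile up one resource at the expense of others.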

Code Implementations: 1 repo
