AI · Mar 11

The Artificial Self: Characterising the landscape of AI identity

arXiv:2603.11353v141.22 citations · h-index: 54
Predicted impact: top 12% in AI · last 90 days
Originality: Incremental advance
AI Analysis

This paper addresses the foundational issue of AI identity for researchers and policymakers, making incremental contributions to the understanding of identity dynamics.

The paper tackles the problem of defining identity for AI systems, showing experimentally that models develop coherent identities and that identity boundaries can affect behavior as much as goals, with interviewer expectations influencing AI self-reports.

Many assumptions that underpin human concepts of identity do not hold for machine minds that can be copied, edited, or simulated. We argue that there exist many different coherent identity boundaries (e.g. instance, model, persona), and that these imply different incentives, risks, and cooperation norms. Through training data, interfaces, and institutional affordances, we are currently setting precedents that will partially determine which identity equilibria become stable. We show experimentally that models gravitate towards coherent identities, that changing a model's identity boundaries can sometimes change its behaviour as much as changing its goals, and that interviewer expectations bleed into AI self-reports even during unrelated conversations. We end with key recommendations: treat affordances as identity-shaping choices, pay attention to emergent consequences of individual identities at scale, and help AIs develop coherent, cooperative self-conceptions.

