The Anh Han

AI
h-index55
39papers
590citations
Novelty41%
AI Score53

39 Papers

CYJun 3
Prioritization of Risks from Artificial Intelligence: A Delphi Study of 272 International Experts

Alexander K. Saeri, Jess Graham, Michael Noetel et al.

Artificial intelligence poses many risks, ranging from familiar present-day harms to unprecedented and potentially catastrophic ones. Effective risk management requires prioritization: we must understand which risks are most severe, who is most vulnerable, and who is most responsible for addressing them. We report results from a three-round Delphi study conducted late 2025 with 272 international AI experts. Experts rated 24 AI risks on harm probability and severity, sector and actor vulnerability, actor responsibility, and overall concern. Experts estimated the five most severe harms in the next 5 years were likely to come from dangerous capabilities, competitive dynamics, weapons & cyberattacks (including CBRNE), power centralization, and false information. In a business-as-usual scenario, experts judged 18 of 24 risks as having a more than 10% probability of catastrophic outcomes (e.g., more than 1 million deaths or more than USD 100B in financial loss) in the next 5 years (2025-2030). In a scenario where pragmatic mitigations are implemented, experts still judged five risks as having a more than 10% probability of catastrophic outcomes: dangerous capabilities, weapons & cyberattacks, environmental harm, inequality & unemployment, and power centralization. All 24 risks were judged as being more than 5% likely to cause catastrophic outcomes. AI users and the general public were judged the most vulnerable to these risks, but experts assigned the highest responsibility for addressing them to general-purpose AI developers and governance actors (including governments, regulators, and standards bodies). Across most risks, experts identified information, finance, and national security as the most vulnerable sectors. These findings can guide AI risk prioritization and clarify expert expectations about who should bear responsibility for mitigation.

DSAug 9, 2024
Evolutionary mechanisms that promote cooperation may not promote social welfare

The Anh Han, Manh Hong Duong, Matjaz Perc

Understanding the emergence of prosocial behaviours among self-interested individuals is an important problem in many scientific disciplines. Various mechanisms have been proposed to explain the evolution of such behaviours, primarily seeking the conditions under which a given mechanism can induce highest levels of cooperation. As these mechanisms usually involve costs that alter individual payoffs, it is however possible that aiming for highest levels of cooperation might be detrimental for social welfare -- the later broadly defined as the total population payoff, taking into account all costs involved for inducing increased prosocial behaviours. Herein, by comparatively analysing the social welfare and cooperation levels obtained from stochastic evolutionary models of two well-established mechanisms of prosocial behaviour, namely, peer and institutional incentives, we demonstrate exactly that. We show that the objectives of maximising cooperation levels and the objectives of maximising social welfare are often misaligned. We argue for the need of adopting social welfare as the main optimisation objective when designing and implementing evolutionary mechanisms for social and collective goods.

GTMay 29
Social welfare optimisation under institutional reward and punishment

Van An Nguyen, Vuong Khang Huynh, Huu Loi Bui et al.

Institutional incentives are widely used to promote cooperation among autonomous, self-regarding agents, from human societies to multi-agent and AI systems. Existing work typically treats incentive design as a bi-objective problem: minimise institutional cost while achieving a high long-run frequency of cooperation. Whether such schemes also maximise social welfare - total population payoff net of institutional expenditure - has remained largely unexplored. We develop a welfare-centric framework for institutional incentives in finite, well-mixed populations playing a social dilemma (Donation Game and Public Goods Game), considering both rewards for cooperators and punishments for defectors. For each mechanism, we derive explicit expressions for expected social welfare and characterise how it depends on incentive efficiency and selection intensity. Analytically, we identify parameter regimes where social welfare has a single optimal incentive level and regimes with qualitative phase transitions, in which welfare becomes non-monotonic with multiple local optima. We prove that any welfare-maximising incentive is either zero or concentrated around a simple closed-form target, and we provide an efficient algorithm to compute these optima. Comparing reward and punishment, we further derive close-formed conditions under which reward outperform punishment in terms of social welfare for any given budget. Overall, our results reveal a systematic gap between incentives optimised for cost or cooperation frequency and those that maximise welfare.

MANov 18, 2022
Social Diversity Reduces the Complexity and Cost of Fostering Fairness

Theodor Cimpeanu, Alessandro Di Stefano, Cedric Perret et al.

Institutions and investors are constantly faced with the challenge of appropriately distributing endowments. No budget is limitless and optimising overall spending without sacrificing positive outcomes has been approached and resolved using several heuristics. To date, prior works have failed to consider how to encourage fairness in a population where social diversity is ubiquitous, and in which investors can only partially observe the population. Herein, by incorporating social diversity in the Ultimatum game through heterogeneous graphs, we investigate the effects of several interference mechanisms which assume incomplete information and flexible standards of fairness. We quantify the role of diversity and show how it reduces the need for information gathering, allowing us to relax a strict, costly interference process. Furthermore, we find that the influence of certain individuals, expressed by different network centrality measures, can be exploited to further reduce spending if minimal fairness requirements are lowered. Our results indicate that diversity changes and opens up novel mechanisms available to institutions wishing to promote fairness. Overall, our analysis provides novel insights to guide institutional policies in socially diverse complex systems.

MAFeb 20, 2023
The evolutionary advantage of guilt: co-evolution of social and non-social guilt in structured populations

Theodor Cimpeanu, Luis Moniz Pereira, The Anh Han

Building ethical machines may involve bestowing upon them the emotional capacity to self-evaluate and repent on their actions. While apologies represent potential strategic interactions, the explicit evolution of guilt as a behavioural trait remains poorly understood. Our study delves into the co-evolution of two forms of emotional guilt: social guilt entails a cost, requiring agents to exert efforts to understand others' internal states and behaviours; and non-social guilt, which only involves awareness of one's own state, incurs no social cost. Resorting to methods from evolutionary game theory, we study analytically, and through extensive numerical and agent-based simulations, whether and how guilt can evolve and deploy, depending on the underlying structure of the systems of agents. Our findings reveal that in lattice and scale-free networks, strategies favouring emotional guilt dominate a broader range of guilt and social costs compared to non-structured well-mixed populations, so leading to higher levels of cooperation. In structured populations, both social and non-social guilt can thrive through clustering with emotionally inclined strategies, thereby providing protection against exploiters, particularly for less costly non-social strategies. These insights shed light on the complex interplay of guilt and cooperation, enhancing our understanding of ethical artificial intelligence.

SOC-PHMar 4
Social physics in the age of artificial intelligence

The Anh Han, Joel Z. Leibo, Tom Lenaerts et al.

Artificial intelligence (AI) systems are rapidly becoming more capable, autonomous, and deeply embedded in social life. As humans increasingly interact, cooperate, and compete with AI, we move from purely human societies to hybrid human-AI societies whose collective dynamics cannot be captured by existing behavioural models alone. Drawing on evolutionary game theory, cultural evolution, and Large Language Models (LLMs) powered simulations, we argue that these developments open a new research agenda for social physics centred on the co-evolution of humans and machines. We outline six key research directions. First, modelling the evolutionary dynamics of social behaviours (e.g. cooperation, fairness, trust) in hybrid human-AI populations. Second, understanding machine culture: how AI systems generate, mediate, and select cultural traits. Third, analysing the co-evolution of language and behaviour when LLMs frame and participate in decisions. Fourth, studying the evolution of AI delegation: how responsibilities and control are negotiated between humans and machines. Fifth, formalising and comparing the distinct epistemic pipelines that generate human and AI behaviour. Sixth, modelling the co-evolution of AI development and regulation in a strategic ecosystem of firms, users, and institutions. Together, these directions define a programme for using social physics to anticipate and steer the societal impact of advanced AI.

AIMar 6, 2023
Both eyes open: Vigilant Incentives help Regulatory Markets improve AI Safety

Paolo Bova, Alessandro Di Stefano, The Anh Han

In the context of rapid discoveries by leaders in AI, governments must consider how to design regulation that matches the increasing pace of new AI capabilities. Regulatory Markets for AI is a proposal designed with adaptability in mind. It involves governments setting outcome-based targets for AI companies to achieve, which they can show by purchasing services from a market of private regulators. We use an evolutionary game theory model to explore the role governments can play in building a Regulatory Market for AI systems that deters reckless behaviour. We warn that it is alarmingly easy to stumble on incentives which would prevent Regulatory Markets from achieving this goal. These 'Bounty Incentives' only reward private regulators for catching unsafe behaviour. We argue that AI companies will likely learn to tailor their behaviour to how much effort regulators invest, discouraging regulators from innovating. Instead, we recommend that governments always reward regulators, except when they find that those regulators failed to detect unsafe behaviour that they should have. These 'Vigilant Incentives' could encourage private regulators to find innovative ways to evaluate cutting-edge AI systems.

AIMay 15, 2022
Understanding Emergent Behaviours in Multi-Agent Systems with Evolutionary Game Theory

The Anh Han

The mechanisms of emergence and evolution of collective behaviours in dynamical Multi-Agent Systems (MAS) of multiple interacting agents, with diverse behavioral strategies in co-presence, have been undergoing mathematical study via Evolutionary Game Theory (EGT). Their systematic study also resorts to agent-based modelling and simulation (ABM) techniques, thus enabling the study of aforesaid mechanisms under a variety of conditions, parameters, and alternative virtual games. This paper summarises some main research directions and challenges tackled in our group, using methods from EGT and ABM. These range from the introduction of cognitive and emotional mechanisms into agents' implementation in an evolving MAS, to the cost-efficient interference for promoting prosocial behaviours in complex networks, to the regulation and governance of AI safety development ecology, and to the equilibrium analysis of random evolutionary multi-player games. This brief aims to sensitize the reader to EGT based issues, results and prospects, which are accruing in importance for the modeling of minds with machines and the engineering of prosocial behaviours in dynamical MAS, with impact on our understanding of the emergence and stability of collective behaviours. In all cases, important open problems in MAS research as viewed or prioritised by the group are described.

AIJan 27
More at Stake: How Payoff and Language Shape LLM Agent Strategies in Cooperation Dilemmas

Trung-Kiet Huynh, Dao-Sy Duy-Minh, Thanh-Bang Cao et al.

As LLMs increasingly act as autonomous agents in interactive and multi-agent settings, understanding their strategic behavior is critical for safety, coordination, and AI-driven social and economic systems. We investigate how payoff magnitude and linguistic context shape LLM strategies in repeated social dilemmas, using a payoff-scaled Prisoner's Dilemma to isolate sensitivity to incentive strength. Across models and languages, we observe consistent behavioral patterns, including incentive-sensitive conditional strategies and cross-linguistic divergence. To interpret these dynamics, we train supervised classifiers on canonical repeated-game strategies and apply them to LLM decisions, revealing systematic, model- and language-dependent behavioral intentions, with linguistic framing sometimes matching or exceeding architectural effects. Our results provide a unified framework for auditing LLMs as strategic agents and highlight cooperation biases with direct implications for AI governance and multi-agent system design.

MADec 8, 2025
Understanding LLM Agent Behaviours via Game Theory: Strategy Recognition, Biases and Multi-Agent Dynamics

Trung-Kiet Huynh, Duy-Minh Dao-Sy, Thanh-Bang Cao et al.

As Large Language Models (LLMs) increasingly operate as autonomous decision-makers in interactive and multi-agent systems and human societies, understanding their strategic behaviour has profound implications for safety, coordination, and the design of AI-driven social and economic infrastructures. Assessing such behaviour requires methods that capture not only what LLMs output, but the underlying intentions that guide their decisions. In this work, we extend the FAIRGAME framework to systematically evaluate LLM behaviour in repeated social dilemmas through two complementary advances: a payoff-scaled Prisoners Dilemma isolating sensitivity to incentive magnitude, and an integrated multi-agent Public Goods Game with dynamic payoffs and multi-agent histories. These environments reveal consistent behavioural signatures across models and languages, including incentive-sensitive cooperation, cross-linguistic divergence and end-game alignment toward defection. To interpret these patterns, we train traditional supervised classification models on canonical repeated-game strategies and apply them to FAIRGAME trajectories, showing that LLMs exhibit systematic, model- and language-dependent behavioural intentions, with linguistic framing at times exerting effects as strong as architectural differences. Together, these findings provide a unified methodological foundation for auditing LLMs as strategic agents and reveal systematic cooperation biases with direct implications for AI governance, collective decision-making, and the design of safe multi-agent systems.

SOC-PHDec 8, 2025
Social welfare optimisation in well-mixed and structured populations

Van An Nguyen, Vuong Khang Huynh, Ho Nam Duong et al.

Research on promoting cooperation among autonomous, self-regarding agents has often focused on the bi-objective optimisation problem: minimising the total incentive cost while maximising the frequency of cooperation. However, the optimal value of social welfare under such constraints remains largely unexplored. In this work, we hypothesise that achieving maximal social welfare is not guaranteed at the minimal incentive cost required to drive agents to a desired cooperative state. To address this gap, we adopt to a single-objective approach focused on maximising social welfare, building upon foundational evolutionary game theory models that examined cost efficiency in finite populations, in both well-mixed and structured population settings. Our analytical model and agent-based simulations show how different interference strategies, including rewarding local versus global behavioural patterns, affect social welfare and dynamics of cooperation. Our results reveal a significant gap in the per-individual incentive cost between optimising for pure cost efficiency or cooperation frequency and optimising for maximal social welfare. Overall, our findings indicate that incentive design, policy, and benchmarking in multi-agent systems and human societies should prioritise welfare-centric objectives over proxy targets of cost or cooperation frequency.

MAJan 7
When Numbers Start Talking: Implicit Numerical Coordination Among LLM-Based Agents

Alessio Buscemi, Daniele Proverbio, Alessandro Di Stefano et al.

LLMs-based agents increasingly operate in multi-agent environments where strategic interaction and coordination are required. While existing work has largely focused on individual agents or on interacting agents sharing explicit communication, less is known about how interacting agents coordinate implicitly. In particular, agents may engage in covert communication, relying on indirect or non-linguistic signals embedded in their actions rather than on explicit messages. This paper presents a game-theoretic study of covert communication in LLM-driven multi-agent systems. We analyse interactions across four canonical game-theoretic settings under different communication regimes, including explicit, restricted, and absent communication. Considering heterogeneous agent personalities and both one-shot and repeated games, we characterise when covert signals emerge and how they shape coordination and strategic outcomes.

MANov 24, 2023
Evolutionary game theory: the mathematics of evolution and collective behaviours

The Anh Han

This brief discusses evolutionary game theory as a powerful and unified mathematical tool to study evolution of collective behaviours. It summarises some of my recent research directions using evolutionary game theory methods, which include i) the analysis of statistical properties of the number of (stable) equilibria in a random evolutionary game, and ii) the modelling of safety behaviours' evolution and the risk posed by advanced Artificial Intelligence technologies in a technology development race. Finally, it includes an outlook and some suggestions for future researchers.

MAJun 30, 2023
Discriminatory or Samaritan -- which AI is needed for humanity? An Evolutionary Game Theory Analysis of Hybrid Human-AI populations

Tim Booker, Manuel Miranda, Jesús A. Moreno López et al.

As artificial intelligence (AI) systems are increasingly embedded in our lives, their presence leads to interactions that shape our behaviour, decision-making, and social interactions. Existing theoretical research has primarily focused on human-to-human interactions, overlooking the unique dynamics triggered by the presence of AI. In this paper, resorting to methods from evolutionary game theory, we study how different forms of AI influence the evolution of cooperation in a human population playing the one-shot Prisoner's Dilemma game in both well-mixed and structured populations. We found that Samaritan AI agents that help everyone unconditionally, including defectors, can promote higher levels of cooperation in humans than Discriminatory AI that only help those considered worthy/cooperative, especially in slow-moving societies where change is viewed with caution or resistance (small intensities of selection). Intuitively, in fast-moving societies (high intensities of selection), Discriminatory AIs promote higher levels of cooperation than Samaritan AIs.

CYApr 23
Mathematical Modelling of Ethical AI Use in Higher Education: A Coordination Game Framework for Future-Facing Learning

Ndidi Bianca Ogbo, Zhao Song, Shatha Ghareeb et al.

The rapid uptake of generative artificial intelligence (AI) in higher education is reshaping assessment practices and intensifying concerns around academic integrity, fairness, and learning quality. While institutional responses increasingly emphasise policy guidance and ethical principles, there remains limited formal understanding of how collective norms of responsible or opportunistic AI use emerge and stabilise within student cohorts. This paper reframes student AI use in assessment as a coordination problem shaped by peer expectations and assessment design rather than individual compliance alone. We develop a coordination-based evolutionary game-theoretic framework that captures learning value, effort, perceived fairness, and transparency, with institutional AI governance modelled implicitly through reflective assessment incentives. We use analytical results and finite-population simulations to reveal threshold-driven behavioural transitions in student AI use: small, well-calibrated changes in reflective assessment incentives can trigger rapid shifts towards responsible, learning-oriented AI-use norms, whereas weak or misaligned incentives allow opportunistic practices to persist. These non-linear dynamics explain why policy statements alone often fail to change behaviour, while modest assessment redesigns can have disproportionate effects. By providing a mechanism-level account of how assessment structures shape collective AI-use practices, this work offers higher education institutions an analytically grounded tool for Future Facing Learning, supporting proportionate, pedagogy-led AI governance without reliance on surveillance or punitive enforcement.

GTMar 17
Disentangling trust from cooperation: Evolution of trust as reduced monitoring in social dilemmas

Cedric Perret, The Anh Han, Elias Fernández Domingos et al.

It is commonly assumed that trust increases cooperation. However, game-theoretic models often fail to distinguish between cooperative actions and trust, making it difficult to independently measure trust and determine how its effects vary in different social dilemmas. To address this, we build on influential theories that equate trust with reduced monitoring of an agent's actions. We implement this as a heuristic that cognitively bounded agents can use in repeated games to avoid spending time and effort always monitoring their partner. Agents using this heuristic reduce monitoring of a partner's actions once a threshold level of cooperativeness has been observed -- providing a measurable and architecture-agnostic definition of trust. Using evolutionary game theory, we systematically analyse the success of strategies that use this trust heuristic across the entire space of two-player symmetric social dilemma games. We demonstrate that trust-as-reduced-monitoring facilitates cooperation in two different ways. First, when monitoring is costly, trust heuristics allow for higher levels of cooperation in social dilemmas where the temptation to defect is high. Second, when agents can make action errors, trust heuristics promote cooperation even in coordination problems. Our results disentangle trust from cooperation, and provide a behavioural measure of trust that applies across interaction types.

AIMar 25
Trust as Monitoring: Evolutionary Dynamics of User Trust and AI Developer Behaviour

Adeela Bashir, Zhao Song, Ndidi Bianca Ogbo et al.

AI safety is an increasingly urgent concern as the capabilities and adoption of AI systems grow. Existing evolutionary models of AI governance have primarily examined incentives for safe development and effective regulation, typically representing users' trust as a one-shot adoption choice rather than as a dynamic, evolving process shaped by repeated interactions. We instead model trust as reduced monitoring in a repeated, asymmetric interaction between users and AI developers, where checking AI behaviour is costly. Using evolutionary game theory, we study how user trust strategies and developer choices between safe (compliant) and unsafe (non-compliant) AI co-evolve under different levels of monitoring cost and institutional regimes. We complement the infinite-population replicator analysis with stochastic finite-population dynamics and reinforcement learning (Q-learning) simulations. Across these approaches, we find three robust long-run regimes: no adoption with unsafe development, unsafe but widely adopted systems, and safe systems that are widely adopted. Only the last is desirable, and it arises when penalties for unsafe behaviour exceed the extra cost of safety and users can still afford to monitor at least occasionally. Our results formally support governance proposals that emphasise transparency, low-cost monitoring, and meaningful sanctions, and they show that neither regulation alone nor blind user trust is sufficient to prevent evolutionary drift towards unsafe or low-adoption outcomes.

SOC-PHApr 22, 2023
We both think you did wrong -- How agreement shapes and is shaped by indirect reciprocity

Marcus Krellner, The Anh Han

Humans judge each other's actions, which at least partly functions to detect and deter cheating and to enable helpfulness in an indirect reciprocity fashion. However, most forms of judging do not only concern the action itself, but also the moral status of the receiving individual (to deter cheating it must be morally acceptable to withhold help from cheaters). This is a problem, when not everybody agrees who is good and who is bad. Although it has been widely acknowledged that disagreement may exist and that it can be detrimental for indirect reciprocity, the details of this crucial feature of moral judgments have never been studied in depth. We show, that even when everybody assesses individually (aka privately), some moral judgement systems (aka norms) can lead to high levels of agreement. We give a detailed account of the mechanisms which cause it and we show how to predict agreement analytically without requiring agent-based simulations, and for any observation rate. Finally, we show that agreement may increase or decrease reputations and therefore how much helpfulness (aka cooperation) occurs.

CLMay 11
Training-Free Cultural Alignment of Large Language Models via Persona Disagreement

Huynh Trung Kiet, Dao Sy Duy Minh, Tuan Nguyen et al.

Large language models increasingly mediate decisions that turn on moral judgement, yet a growing body of evidence shows that their implicit preferences are not culturally neutral. Existing cultural alignment methods either require per-country preference data and fine-tuning budgets or assume white-box access to model internals that commercial APIs do not expose. In this work, we focus on this realistic black-box, public-data-only regime and observe that within-country sociodemographic disagreement, not consensus, is the primary steering signal. We introduce DISCA (Disagreement-Informed Steering for Cultural Alignment), an inference-time method that instantiates each country as a panel of World-Values-Survey-grounded persona agents and converts their disagreement into a bounded, loss-averse logit correction. Across 20 countries and 7 open-weight backbones (2B--70B), DISCA reduces cultural misalignment on MultiTP by 10--24% on the six backbones >=3.8B, and 2--7% on open-ended scenarios, without changing any weights. Our results suggest that inference-time calibration is a scalable alternative to fine-tuning for serving the long tail of global moral preferences.

AIMay 10
Strategic commitments shape collective cybersecurity under AI inequality

Adeela Bashir, Zia Ush Shamszaman, Zhao Song et al.

The growing integration of AI into cybersecurity is reshaping the balance between attackers and defenders. When access to advanced AI-enabled defence tools is uneven, resource-limited defenders may be unable to adopt effective protection, creating persistent system vulnerabilities. We study the impact of differential AI access using an evolutionary game-theoretic model in a finite population. We first show that when high-capability defence is costly, the population is driven toward low-cost, weak-defence behaviour, sustaining attacks and weakening long-run security. To address this problem, we introduce differential access to AI defence tools by allowing defenders to choose between low- and high-capability protection based on their resources. We then examine the role of a small group of committed defenders who always adopt strong defence and influence others through social learning. Although commitment increases the prevalence of strong defence, it alone cannot stabilise secure outcomes due to high defence costs. We therefore incorporate a targeted subsidy to remove the cost disadvantage from committed defenders. Our analysis shows that subsidised commitment significantly increases strong defence adoption, suppresses successful attacks, and improves overall system resilience. Simulations across a broad parameter space confirm that subsidies consistently outperform commitment alone. In addition, social-welfare analysis shows improved defender outcomes while keeping attacker gains low. These findings suggest that targeted support for key defenders can be an effective mechanism for stabilising cybersecurity in AI-driven environments and provide a theoretical bridge between cybersecurity policy, AI governance, and strategic allocation of defensive AI capabilities.

MAFeb 19, 2025
Multi-Agent Risks from Advanced AI

Lewis Hammond, Alan Chan, Jesse Clifton et al. · stanford

The rapid development of advanced AI agents and the imminent deployment of many instances of these agents will give rise to multi-agent systems of unprecedented complexity. These systems pose novel and under-explored risks. In this report, we provide a structured taxonomy of these risks by identifying three key failure modes (miscoordination, conflict, and collusion) based on agents' incentives, as well as seven key risk factors (information asymmetries, network effects, selection pressures, destabilising dynamics, commitment problems, emergent agency, and multi-agent security) that can underpin them. We highlight several important instances of each risk, as well as promising directions to help mitigate them. By anchoring our analysis in a range of real-world examples and experimental evidence, we illustrate the distinct challenges posed by multi-agent systems and their implications for the safety, governance, and ethics of advanced AI.

AIMar 12, 2025
Media and responsible AI governance: a game-theoretic and LLM analysis

Nataliya Balabanova, Adeela Bashir, Paolo Bova et al.

This paper investigates the complex interplay between AI developers, regulators, users, and the media in fostering trustworthy AI systems. Using evolutionary game theory and large language models (LLMs), we model the strategic interactions among these actors under different regulatory regimes. The research explores two key mechanisms for achieving responsible governance, safe AI development and adoption of safe AI: incentivising effective regulation through media reporting, and conditioning user trust on commentariats' recommendation. The findings highlight the crucial role of the media in providing information to users, potentially acting as a form of "soft" regulation by investigating developers or regulators, as a substitute to institutional AI regulation (which is still absent in many regions). Both game-theoretic analysis and LLM-based simulations reveal conditions under which effective regulation and trustworthy AI development emerge, emphasising the importance of considering the influence of different regulatory regimes from an evolutionary game-theoretic perspective. The study concludes that effective governance requires managing incentives and costs for high quality commentaries.

AIApr 19, 2025
FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory

Alessio Buscemi, Daniele Proverbio, Alessandro Di Stefano et al.

Letting AI agents interact in multi-agent applications adds a layer of complexity to the interpretability and prediction of AI outcomes, with profound implications for their trustworthy adoption in research and society. Game theory offers powerful models to capture and interpret strategic interaction among agents, but requires the support of reproducible, standardized and user-friendly IT frameworks to enable comparison and interpretation of results. To this end, we present FAIRGAME, a Framework for AI Agents Bias Recognition using Game Theory. We describe its implementation and usage, and we employ it to uncover biased outcomes in popular games among AI agents, depending on the employed Large Language Model (LLM) and used language, as well as on the personality trait or strategic knowledge of the agents. Overall, FAIRGAME allows users to reliably and easily simulate their desired games and scenarios and compare the results across simulation campaigns and with game-theoretic predictions, enabling the systematic discovery of biases, the anticipation of emerging behavior out of strategic interplays, and empowering further research into strategic decision-making using LLM agents.

AIApr 11, 2025
Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents

Alessio Buscemi, Daniele Proverbio, Paolo Bova et al.

There is general agreement that fostering trust and cooperation within the AI development ecosystem is essential to promote the adoption of trustworthy AI systems. By embedding Large Language Model (LLM) agents within an evolutionary game-theoretic framework, this paper investigates the complex interplay between AI developers, regulators and users, modelling their strategic choices under different regulatory scenarios. Evolutionary game theory (EGT) is used to quantitatively model the dilemmas faced by each actor, and LLMs provide additional degrees of complexity and nuances and enable repeated games and incorporation of personality traits. Our research identifies emerging behaviours of strategic AI agents, which tend to adopt more "pessimistic" (not trusting and defective) stances than pure game-theoretic agents. We observe that, in case of full trust by users, incentives are effective to promote effective regulation; however, conditional trust may deteriorate the "social pact". Establishing a virtuous feedback between users' trust and regulators' reputation thus appears to be key to nudge developers towards creating safe AI. However, the level at which this trust emerges may depend on the specific LLM used for testing. Our results thus provide guidance for AI regulation systems, and help predict the outcome of strategic LLM agents, should they be used to aid regulation itself.

CVJan 14, 2025
Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition

Md Meem Hossain, The Anh Han, Safina Showkat Ara et al.

Human Activity Recognition (HAR) has gained significant importance with the growing use of sensor-equipped devices and large datasets. This paper evaluates the performance of three categories of models : classical machine learning, deep learning architectures, and Restricted Boltzmann Machines (RBMs) using five key benchmark datasets of HAR (UCI-HAR, OPPORTUNITY, PAMAP2, WISDM, and Berkeley MHAD). We assess various models, including Decision Trees, Random Forests, Convolutional Neural Networks (CNN), and Deep Belief Networks (DBNs), using metrics such as accuracy, precision, recall, and F1-score for a comprehensive comparison. The results show that CNN models offer superior performance across all datasets, especially on the Berkeley MHAD. Classical models like Random Forest do well on smaller datasets but face challenges with larger, more complex data. RBM-based models also show notable potential, particularly for feature learning. This paper offers a detailed comparison to help researchers choose the most suitable model for HAR tasks.

AIDec 23, 2024
Enhancing Cancer Diagnosis with Explainable & Trustworthy Deep Learning Models

Badaru I. Olumuyiwa, The Anh Han, Zia U. Shamszaman

This research presents an innovative approach to cancer diagnosis and prediction using explainable Artificial Intelligence (XAI) and deep learning techniques. With cancer causing nearly 10 million deaths globally in 2020, early and accurate diagnosis is crucial. Traditional methods often face challenges in cost, accuracy, and efficiency. Our study develops an AI model that provides precise outcomes and clear insights into its decision-making process, addressing the "black box" problem of deep learning models. By employing XAI techniques, we enhance interpretability and transparency, building trust among healthcare professionals and patients. Our approach leverages neural networks to analyse extensive datasets, identifying patterns for cancer detection. This model has the potential to revolutionise diagnosis by improving accuracy, accessibility, and clarity in medical decision-making, possibly leading to earlier detection and more personalised treatment strategies. Furthermore, it could democratise access to high-quality diagnostics, particularly in resource-limited settings, contributing to global health equity. The model's applications extend beyond cancer diagnosis, potentially transforming various aspects of medical decision-making and saving millions of lives worldwide.

AIDec 19, 2024
Quantifying detection rates for dangerous capabilities: a theoretical model of dangerous capability evaluations

Paolo Bova, Alessandro Di Stefano, The Anh Han

We present a quantitative model for tracking dangerous AI capabilities over time. Our goal is to help the policy and research community visualise how dangerous capability testing can give us an early warning about approaching AI risks. We first use the model to provide a novel introduction to dangerous capability testing and how this testing can directly inform policy. Decision makers in AI labs and government often set policy that is sensitive to the estimated danger of AI systems, and may wish to set policies that condition on the crossing of a set threshold for danger. The model helps us to reason about these policy choices. We then run simulations to illustrate how we might fail to test for dangerous capabilities. To summarise, failures in dangerous capability testing may manifest in two ways: higher bias in our estimates of AI danger, or larger lags in threshold monitoring. We highlight two drivers of these failure modes: uncertainty around dynamics in AI capabilities and competition between frontier AI labs. Effective AI policy demands that we address these failure modes and their drivers. Even if the optimal targeting of resources is challenging, we show how delays in testing can harm AI policy. We offer preliminary recommendations for building an effective testing ecosystem for dangerous capabilities and advise on a research agenda.

AIMay 7, 2025
KERAIA: An Adaptive and Explainable Framework for Dynamic Knowledge Representation and Reasoning

Stephen Richard Varey, Alessandro Di Stefano, The Anh Han

In this paper, we introduce KERAIA, a novel framework and software platform for symbolic knowledge engineering designed to address the persistent challenges of representing, reasoning with, and executing knowledge in dynamic, complex, and context-sensitive environments. The central research question that motivates this work is: How can unstructured, often tacit, human expertise be effectively transformed into computationally tractable algorithms that AI systems can efficiently utilise? KERAIA seeks to bridge this gap by building on foundational concepts such as Minsky's frame-based reasoning and K-lines, while introducing significant innovations. These include Clouds of Knowledge for dynamic aggregation, Dynamic Relations (DRels) for context-sensitive inheritance, explicit Lines of Thought (LoTs) for traceable reasoning, and Cloud Elaboration for adaptive knowledge transformation. This approach moves beyond the limitations of traditional, often static, knowledge representation paradigms. KERAIA is designed with Explainable AI (XAI) as a core principle, ensuring transparency and interpretability, particularly through the use of LoTs. The paper details the framework's architecture, the KSYNTH representation language, and the General Purpose Paradigm Builder (GPPB) to integrate diverse inference methods within a unified structure. We validate KERAIA's versatility, expressiveness, and practical applicability through detailed analysis of multiple case studies spanning naval warfare simulation, industrial diagnostics in water treatment plants, and strategic decision-making in the game of RISK. Furthermore, we provide a comparative analysis against established knowledge representation paradigms (including ontologies, rule-based systems, and knowledge graphs) and discuss the implementation aspects and computational considerations of the KERAIA platform.

CLFeb 15
A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing

Naeimeh Nourmohammadi, Md Meem Hossain, The Anh Han et al.

Large language models (LLMs) show promise for healthcare question answering, but clinical use is limited by weak verification, insufficient evidence grounding, and unreliable confidence signalling. We propose a multi-agent medical QA framework that combines complementary LLMs with evidence retrieval, uncertainty estimation, and bias checks to improve answer reliability. Our approach has two phases. First, we fine-tune three representative LLM families (GPT, LLaMA, and DeepSeek R1) on MedQuAD-derived medical QA data (20k+ question-answer pairs across multiple NIH domains) and benchmark generation quality. DeepSeek R1 achieves the strongest scores (ROUGE-1 0.536 +- 0.04; ROUGE-2 0.226 +-0.03; BLEU 0.098 -+ 0.018) and substantially outperforms the specialised biomedical baseline BioGPT in zero-shot evaluation. Second, we implement a modular multi-agent pipeline in which a Clinical Reasoning agent (fine-tuned LLaMA) produces structured explanations, an Evidence Retrieval agent queries PubMed to ground responses in recent literature, and a Refinement agent (DeepSeek R1) improves clarity and factual consistency; an optional human validation path is triggered for high-risk or high-uncertainty cases. Safety mechanisms include Monte Carlo dropout and perplexity-based uncertainty scoring, plus lexical and sentiment-based bias detection supported by LIME/SHAP-based analyses. In evaluation, the full system achieves 87% accuracy with relevance around 0.80, and evidence augmentation reduces uncertainty (perplexity 4.13) compared to base responses, with mean end-to-end latency of 36.5 seconds under the reported configuration. Overall, the results indicate that agent specialisation and verification layers can mitigate key single-model limitations and provide a practical, extensible design for evidence-based and bias-aware medical AI.

QMSep 3, 2025
Predicting Antimicrobial Resistance (AMR) in Campylobacter, a Foodborne Pathogen, and Cost Burden Analysis Using Machine Learning

Shubham Mishra, The Anh Han, Bruno Silvester Lopes et al.

Antimicrobial resistance (AMR) poses a significant public health and economic challenge, increasing treatment costs and reducing antibiotic effectiveness. This study employs machine learning to analyze genomic and epidemiological data from the public databases for molecular typing and microbial genome diversity (PubMLST), incorporating data from UK government-supported AMR surveillance by the Food Standards Agency and Food Standards Scotland. We identify AMR patterns in Campylobacter jejuni and Campylobacter coli isolates collected in the UK from 2001 to 2017. The research integrates whole-genome sequencing (WGS) data, epidemiological metadata, and economic projections to identify key resistance determinants and forecast future resistance trends and healthcare costs. We investigate gyrA mutations for fluoroquinolone resistance and the tet(O) gene for tetracycline resistance, training a Random Forest model validated with bootstrap resampling (1,000 samples, 95% confidence intervals), achieving 74% accuracy in predicting AMR phenotypes. Time-series forecasting models (SARIMA, SIR, and Prophet) predict a rise in campylobacteriosis cases, potentially exceeding 130 cases per 100,000 people by 2050, with an economic burden projected to surpass 1.9 billion GBP annually if left unchecked. An enhanced Random Forest system, analyzing 6,683 isolates, refines predictions by incorporating temporal patterns, uncertainty estimation, and resistance trend modeling, indicating sustained high beta-lactam resistance, increasing fluoroquinolone resistance, and fluctuating tetracycline resistance.

AISep 2, 2025
Can Media Act as a Soft Regulator of Safe AI Development? A Game Theoretical Analysis

Henrique Correia da Fonseca, António Fernandes, Zhao Song et al.

When developers of artificial intelligence (AI) products need to decide between profit and safety for the users, they likely choose profit. Untrustworthy AI technology must come packaged with tangible negative consequences. Here, we envisage those consequences as the loss of reputation caused by media coverage of their misdeeds, disseminated to the public. We explore whether media coverage has the potential to push AI creators into the production of safe products, enabling widespread adoption of AI technology. We created artificial populations of self-interested creators and users and studied them through the lens of evolutionary game theory. Our results reveal that media is indeed able to foster cooperation between creators and users, but not always. Cooperation does not evolve if the quality of the information provided by the media is not reliable enough, or if the costs of either accessing media or ensuring safety are too high. By shaping public perception and holding developers accountable, media emerges as a powerful soft regulator -- guiding AI safety even in the absence of formal government oversight.

CRAug 4, 2025
Can LLMs effectively provide game-theoretic-based scenarios for cybersecurity?

Daniele Proverbio, Alessio Buscemi, Alessandro Di Stefano et al.

Game theory has long served as a foundational tool in cybersecurity to test, predict, and design strategic interactions between attackers and defenders. The recent advent of Large Language Models (LLMs) offers new tools and challenges for the security of computer systems; In this work, we investigate whether classical game-theoretic frameworks can effectively capture the behaviours of LLM-driven actors and bots. Using a reproducible framework for game-theoretic LLM agents, we investigate two canonical scenarios -- the one-shot zero-sum game and the dynamic Prisoner's Dilemma -- and we test whether LLMs converge to expected outcomes or exhibit deviations due to embedded biases. Our experiments involve four state-of-the-art LLMs and span five natural languages, English, French, Arabic, Vietnamese, and Mandarin Chinese, to assess linguistic sensitivity. For both games, we observe that the final payoffs are influenced by agents characteristics such as personality traits or knowledge of repeated rounds. Moreover, we uncover an unexpected sensitivity of the final payoffs to the choice of languages, which should warn against indiscriminate application of LLMs in cybersecurity applications and call for in-depth studies, as LLMs may behave differently when deployed in different countries. We also employ quantitative metrics to evaluate the internal consistency and cross-language stability of LLM agents, to help guide the selection of the most stable LLMs and optimising models for secure applications.

AIMar 14, 2024
Trust AI Regulation? Discerning users are vital to build trust and effective AI regulation

Zainab Alalawi, Paolo Bova, Theodor Cimpeanu et al.

There is general agreement that some form of regulation is necessary both for AI creators to be incentivised to develop trustworthy systems, and for users to actually trust those systems. But there is much debate about what form these regulations should take and how they should be implemented. Most work in this area has been qualitative, and has not been able to make formal predictions. Here, we propose that evolutionary game theory can be used to quantitatively model the dilemmas faced by users, AI creators, and regulators, and provide insights into the possible effects of different regulatory regimes. We show that creating trustworthy AI and user trust requires regulators to be incentivised to regulate effectively. We demonstrate the effectiveness of two mechanisms that can achieve this. The first is where governments can recognise and reward regulators that do a good job. In that case, if the AI system is not too risky for users then some level of trustworthy development and user trust evolves. We then consider an alternative solution, where users can condition their trust decision on the effectiveness of the regulators. This leads to effective regulation, and consequently the development of trustworthy AI and user trust, provided that the cost of implementing regulations is not too high. Our findings highlight the importance of considering the effect of different regulatory regimes from an evolutionary game theoretic perspective.

AIApr 8, 2021
Voluntary safety commitments provide an escape from over-regulation in AI development

The Anh Han, Tom Lenaerts, Francisco C. Santos et al.

With the introduction of Artificial Intelligence (AI) and related technologies in our daily lives, fear and anxiety about their misuse as well as the hidden biases in their creation have led to a demand for regulation to address such issues. Yet blindly regulating an innovation process that is not well understood, may stifle this process and reduce benefits that society may gain from the generated technology, even under the best intentions. In this paper, starting from a baseline model that captures the fundamental dynamics of a race for domain supremacy using AI technology, we demonstrate how socially unwanted outcomes may be produced when sanctioning is applied unconditionally to risk-taking, i.e. potentially unsafe, behaviours. As an alternative to resolve the detrimental effect of over-regulation, we propose a voluntary commitment approach wherein technologists have the freedom of choice between independently pursuing their course of actions or establishing binding agreements to act safely, with sanctioning of those that do not abide to what they pledged. Overall, this work reveals for the first time how voluntary commitments, with sanctions either by peers or an institution, leads to socially beneficial outcomes in all scenarios envisageable in a short-term race towards domain supremacy through AI technology. These results are directly relevant for the design of governance and regulatory policies that aim to ensure an ethical and responsible AI technology development process.

AIDec 30, 2020
Artificial Intelligence Development Races in Heterogeneous Settings

Theodor Cimpeanu, Francisco C. Santos, Luis Moniz Pereira et al.

Regulation of advanced technologies such as Artificial Intelligence (AI) has become increasingly important, given the associated risks and apparent ethical issues. With the great benefits promised from being able to first supply such technologies, safety precautions and societal consequences might be ignored or shortchanged in exchange for speeding up the development, therefore engendering a racing narrative among the developers. Starting from a game-theoretical model describing an idealised technology race in a fully connected world of players, here we investigate how different interaction structures among race participants can alter collective choices and requirements for regulatory actions. Our findings indicate that, when participants portray a strong diversity in terms of connections and peer-influence (e.g., when scale-free networks shape interactions among parties), the conflicts that exist in homogeneous settings are significantly reduced, thereby lessening the need for regulatory actions. Furthermore, our results suggest that technology governance and regulation may profit from the world's patent heterogeneity and inequality among firms and nations, so as to enable the design and implementation of meticulous interventions on a minority of participants, which is capable of influencing an entire population towards an ethical and sustainable use of advanced technologies.

AIOct 1, 2020
Mediating Artificial Intelligence Developments through Negative and Positive Incentives

The Anh Han, Luis Moniz Pereira, Tom Lenaerts et al.

The field of Artificial Intelligence (AI) is going through a period of great expectations, introducing a certain level of anxiety in research, business and also policy. This anxiety is further energised by an AI race narrative that makes people believe they might be missing out. Whether real or not, a belief in this narrative may be detrimental as some stake-holders will feel obliged to cut corners on safety precautions, or ignore societal consequences just to "win". Starting from a baseline model that describes a broad class of technology races where winners draw a significant benefit compared to others (such as AI advances, patent race, pharmaceutical technologies), we investigate here how positive (rewards) and negative (punishments) incentives may beneficially influence the outcomes. We uncover conditions in which punishment is either capable of reducing the development speed of unsafe participants or has the capacity to reduce innovation through over-regulation. Alternatively, we show that, in several scenarios, rewarding those that follow safety measures may increase the development speed while ensuring safe choices. Moreover, in {the latter} regimes, rewards do not suffer from the issue of over-regulation as is the case for punishment. Overall, our findings provide valuable insights into the nature and kinds of regulatory actions most suitable to improve safety compliance in the contexts of both smooth and sudden technological shifts.

MASep 24, 2020
Evolution of Coordination in Pairwise and Multi-player Interactions via Prior Commitments

Ogbo Ndidi Bianca, Aiman Elgarig, The Anh Han

Upon starting a collective endeavour, it is important to understand your partners' preferences and how strongly they commit to a common goal. Establishing a prior commitment or agreement in terms of posterior benefits and consequences from those engaging in it provides an important mechanism for securing cooperation. Resorting to methods from Evolutionary Game Theory (EGT), here we analyse how prior commitments can also be adopted as a tool for enhancing coordination when its outcomes exhibit an asymmetric payoff structure, in both pairwise and multiparty interactions. Arguably, coordination is more complex to achieve than cooperation since there might be several desirable collective outcomes in a coordination problem (compared to mutual cooperation, the only desirable collective outcome in cooperation dilemmas). Our analysis, both analytically and via numerical simulations, shows that whether prior commitment would be a viable evolutionary mechanism for enhancing coordination and the overall population social welfare strongly depends on the collective benefit and severity of competition, and more importantly, how asymmetric benefits are resolved in a commitment deal. Moreover, in multiparty interactions, prior commitments prove to be crucial when a high level of group diversity is required for optimal coordination. The results are robust for different selection intensities. Overall, our analysis provides new insights into the complexity and beauty of behavioral evolution driven by humans' capacity for commitment, as well as for the design of self-organised and distributed multi-agent systems for ensuring coordination among autonomous agents.

GTJul 22, 2020
When to (or not to) trust intelligent machines: Insights from an evolutionary game theory analysis of trust in repeated games

The Anh Han, Cedric Perret, Simon T. Powers

The actions of intelligent agents, such as chatbots, recommender systems, and virtual assistants are typically not fully transparent to the user. Consequently, using such an agent involves the user exposing themselves to the risk that the agent may act in a way opposed to the user's goals. It is often argued that people use trust as a cognitive shortcut to reduce the complexity of such interactions. Here we formalise this by using the methods of evolutionary game theory to study the viability of trust-based strategies in repeated games. These are reciprocal strategies that cooperate as long as the other player is observed to be cooperating. Unlike classic reciprocal strategies, once mutual cooperation has been observed for a threshold number of rounds they stop checking their co-player's behaviour every round, and instead only check with some probability. By doing so, they reduce the opportunity cost of verifying whether the action of their co-player was actually cooperative. We demonstrate that these trust-based strategies can outcompete strategies that are always conditional, such as Tit-for-Tat, when the opportunity cost is non-negligible. We argue that this cost is likely to be greater when the interaction is between people and intelligent agents, because of the reduced transparency of the agent. Consequently, we expect people to use trust-based strategies more frequently in interactions with intelligent agents. Our results provide new, important insights into the design of mechanisms for facilitating interactions between humans and intelligent agents, where trust is an essential factor.

CYJul 26, 2019
To regulate or not: a social dynamics analysis of the race for AI supremacy

The Anh Han, Luis Moniz Pereira, Francisco C. Santos et al.

Rapid technological advancements in AI as well as the growing deployment of intelligent technologies in new application domains are currently driving the competition between businesses, nations and regions. This race for technological supremacy creates a complex ecology of choices that may lead to negative consequences, in particular, when ethical and safety procedures are underestimated or even ignored. As a consequence, different actors are urging to consider both the normative and social impact of these technological advancements. As there is no easy access to data describing this AI race, theoretical models are necessary to understand its dynamics, allowing for the identification of when, how and which procedures need to be put in place to favour outcomes beneficial for all. We show that, next to the risks of setbacks and being reprimanded for unsafe behaviour, the time-scale in which AI supremacy can be achieved plays a crucial role. When this supremacy can be achieved in a short term, those who completely ignore the safety precautions are bound to win the race but at a cost to society, apparently requiring regulatory actions. Our analysis reveals that blindly imposing regulations may not have anticipated effect as only for specific conditions a dilemma arises between what individually preferred and globally beneficial. Similar observations can be made for the long-term development case. Yet different from the short term situation, certain conditions require the promotion of risk-taking as opposed to compliance to safety regulations in order to improve social welfare. These results remain robust when two or several actors are involved in the race and when collective rather than individual setbacks are produced by risk-taking behaviour. When defining codes of conduct and regulatory policies for AI, a clear understanding about the time-scale of the race is required.