Modelling Opinion Dynamics at Scale with Deep MARL
For researchers studying opinion dynamics and misinformation, this work provides a scalable MARL-based model that shows how conformity affects collective accuracy across networks of different sizes.
This paper introduces a GPU-accelerated multi-agent reinforcement learning framework for opinion dynamics that scales to populations of up to 1000 agents. It finds that high conformity in large social networks reduces collective accuracy and promotes dishonesty, while small networks are less affected, suggesting a mismatch between evolved human heuristics and modern social media.
Models of opinion dynamics typically rely on hand-crafted local interaction rules to study emergent macroscopic phenomena such as consensus and polarisation. In contrast, multi-agent reinforcement learning (MARL) enables agents to learn such behaviours directly by optimising simple rewards. To explore the potential of MARL for opinion dynamics, we introduce a GPU-accelerated consensus and truth-finding game that scales to populations of up to 1000 agents, comparable to many real-world social sub-networks. To prevent agents from learning unrealistic conventions, we extend other-play to general-sum social interactions. We then validate our model on a subset of the Bluesky network by recovering agent importance structures from graph topology alone via a learned attention layer, finding that highly conforming populations most closely match human data. In large social media networks, such high levels of conformity significantly reduce collective accuracy and promote dishonest agents that lie to fit in. By contrast, small, dynamic hunter-gatherer networks are less affected; there, conformity can even improve collective agreement. This suggests that a mismatch between evolved human conformity heuristics and modern social media environments may contribute to misinformation.
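The abstract does not specify the reward used in the consensus and truth-finding game, so the following is only a toy sketch of the kind of trade-off it describes, not the paper's environment. It assumes a binary ground truth, a hypothetical `conformity` weight in [0, 1], and a per-agent reward that mixes being correct with agreeing with one's neighbours; the function name `toy_reward` and all parameters are illustrative.

```python
import numpy as np

def toy_reward(opinions, truth, adjacency, conformity):
    """Hypothetical per-agent reward: (1 - c) * accuracy + c * neighbour agreement."""
    opinions = np.asarray(opinions)
    adjacency = np.asarray(adjacency, dtype=float)
    # 1.0 where the expressed opinion matches the ground truth, else 0.0
    accuracy = (opinions == truth).astype(float)
    # agree[i, j] = 1.0 if agents i and j express the same opinion
    agree = (opinions[:, None] == opinions[None, :]).astype(float)
    degree = adjacency.sum(axis=1)
    # Fraction of each agent's neighbours that share its opinion (0 if isolated)
    agreement = np.divide((adjacency * agree).sum(axis=1), degree,
                          out=np.zeros_like(degree), where=degree > 0)
    return (1.0 - conformity) * accuracy + conformity * agreement

# Four fully connected agents (no self-loops); the truth is 1 and the majority says 0.
adj = np.ones((4, 4)) - np.eye(4)
honest = toy_reward([1, 0, 0, 0], truth=1, adjacency=adj, conformity=0.9)[0]
lying = toy_reward([0, 0, 0, 0], truth=1, adjacency=adj, conformity=0.9)[0]
print(honest, lying)
```

Under this assumed reward, a lone agent who knows the truth earns more by echoing a mistaken majority when the conformity weight is high, and more by stating the truth when it is low. This is one simple way that "lying to fit in" could become the reward-maximising choice.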