Nicanor Quijano

AI
h-index60
7papers
33citations
Novelty48%
AI Score42

7 Papers

AIDec 3, 2025
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia

Chandler Smith, Marwa Abdulhai, Manfred Diaz et al.

Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction and are increasingly being deployed in situations where they might engage with both human and artificial agents. These interactions represent a critical frontier for LLM-based agents, yet existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. In this paper, we introduce a method for evaluating the ability of LLM-based agents to cooperate in zero-shot, mixed-motive environments using Concordia, a natural language multi-agent simulation environment. Our method measures general cooperative intelligence by testing an agent's ability to identify and exploit opportunities for mutual gain across diverse partners and contexts. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains across a suite of diverse scenarios ranging from negotiation to collective action problems. Our findings reveal significant gaps between current agent capabilities and the robust generalization required for reliable cooperation, particularly in scenarios demanding persuasion and norm enforcement.

MASep 20, 2024
Cooperative Resilience in Artificial Intelligence Multiagent Systems

Manuela Chacon-Chamorro, Luis Felipe Giraldo, Nicanor Quijano et al.

Resilience refers to the ability of systems to withstand, adapt to, and recover from disruptive events. While studies on resilience have attracted significant attention across various research domains, the precise definition of this concept within the field of cooperative artificial intelligence remains unclear. This paper addresses this gap by proposing a clear definition of `cooperative resilience' and outlining a methodology for its quantitative measurement. The methodology is validated in an environment with RL-based and LLM-augmented autonomous agents, subjected to environmental changes and the introduction of agents with unsustainable behaviors. These events are parameterized to create various scenarios for measuring cooperative resilience. The results highlight the crucial role of resilience metrics in analyzing how the collective system prepares for, resists, recovers from, sustains well-being, and transforms in the face of disruptions. These findings provide foundational insights into the definition, measurement, and preliminary analysis of cooperative resilience, offering significant implications for the broader field of AI. Moreover, the methodology and metrics developed here can be adapted to a wide range of AI applications, enhancing the reliability and effectiveness of AI in dynamic and unpredictable environments.

MAJan 29
Learning Reward Functions for Cooperative Resilience in Multi-Agent Systems

Manuela Chacon-Chamorro, Luis Felipe Giraldo, Nicanor Quijano

Multi-agent systems often operate in dynamic and uncertain environments, where agents must not only pursue individual goals but also safeguard collective functionality. This challenge is especially acute in mixed-motive multi-agent systems. This work focuses on cooperative resilience, the ability of agents to anticipate, resist, recover, and transform in the face of disruptions, a critical yet underexplored property in Multi-Agent Reinforcement Learning. We study how reward function design influences resilience in mixed-motive settings and introduce a novel framework that learns reward functions from ranked trajectories, guided by a cooperative resilience metric. Agents are trained in a suite of social dilemma environments using three reward strategies: i) traditional individual reward; ii) resilience-inferred reward; and iii) hybrid that balance both. We explore three reward parameterizations-linear models, hand-crafted features, and neural networks, and employ two preference-based learning algorithms to infer rewards from behavioral rankings. Our results demonstrate that hybrid strategy significantly improve robustness under disruptions without degrading task performance and reduce catastrophic outcomes like resource overuse. These findings underscore the importance of reward design in fostering resilient cooperation, and represent a step toward developing robust multi-agent systems capable of sustaining cooperation in uncertain environments.

AIMar 18, 2024
Can LLM-Augmented autonomous agents cooperate?, An evaluation of their cooperative capabilities through Melting Pot

Manuel Mosquera, Juan Sebastian Pinzon, Manuel Rios et al.

As the field of AI continues to evolve, a significant dimension of this progression is the development of Large Language Models and their potential to enhance multi-agent artificial intelligence systems. This paper explores the cooperative capabilities of Large Language Model-augmented Autonomous Agents (LAAs) using the well-known Meltin Pot environments along with reference models such as GPT4 and GPT3.5. Preliminary results suggest that while these agents demonstrate a propensity for cooperation, they still struggle with effective collaboration in given environments, emphasizing the need for more robust architectures. The study's contributions include an abstraction layer to adapt Melting Pot game scenarios for LLMs, the implementation of a reusable architecture for LLM-mediated agent development - which includes short and long-term memories and different cognitive modules, and the evaluation of cooperation capabilities using a set of metrics tied to the Melting Pot's "Commons Harvest" game. The paper closes, by discussing the limitations of the current architectural framework and the potential of a new set of modules that fosters better cooperation among LAAs.

AIDec 14, 2025
World Models Unlock Optimal Foraging Strategies in Reinforcement Learning Agents

Yesid Fonseca, Manuel S. Ríos, Nicanor Quijano et al.

Patch foraging involves the deliberate and planned process of determining the optimal time to depart from a resource-rich region and investigate potentially more beneficial alternatives. The Marginal Value Theorem (MVT) is frequently used to characterize this process, offering an optimality model for such foraging behaviors. Although this model has been widely used to make predictions in behavioral ecology, discovering the computational mechanisms that facilitate the emergence of optimal patch-foraging decisions in biological foragers remains under investigation. Here, we show that artificial foragers equipped with learned world models naturally converge to MVT-aligned strategies. Using a model-based reinforcement learning agent that acquires a parsimonious predictive representation of its environment, we demonstrate that anticipatory capabilities, rather than reward maximization alone, drive efficient patch-leaving behavior. Compared with standard model-free RL agents, these model-based agents exhibit decision patterns similar to many of their biological counterparts, suggesting that predictive world models can serve as a foundation for more explainable and biologically grounded decision-making in AI systems. Overall, our findings highlight the value of ecological optimality principles for advancing interpretable and adaptive AI.

LGMay 19, 2023
Understanding the World to Solve Social Dilemmas Using Multi-Agent Reinforcement Learning

Manuel Rios, Nicanor Quijano, Luis Felipe Giraldo

Social dilemmas are situations where groups of individuals can benefit from mutual cooperation but conflicting interests impede them from doing so. This type of situations resembles many of humanity's most critical challenges, and discovering mechanisms that facilitate the emergence of cooperative behaviors is still an open problem. In this paper, we study the behavior of self-interested rational agents that learn world models in a multi-agent reinforcement learning (RL) setting and that coexist in environments where social dilemmas can arise. Our simulation results show that groups of agents endowed with world models outperform all the other tested ones when dealing with scenarios where social dilemmas can arise. We exploit the world model architecture to qualitatively assess the learnt dynamics and confirm that each agent's world model is capable to encode information of the behavior of the changing environment and the other agent's actions. This is the first work that shows that world models facilitate the emergence of complex coordinated behaviors that enable interacting agents to ``understand'' both environmental and social dynamics.

AIMay 16, 2020
Learning Transferable Concepts in Deep Reinforcement Learning

Diego Gomez, Nicanor Quijano, Luis Felipe Giraldo

While humans and animals learn incrementally during their lifetimes and exploit their experience to solve new tasks, standard deep reinforcement learning methods specialize to solve only one task at a time. As a result, the information they acquire is hardly reusable in new situations. Here, we introduce a new perspective on the problem of leveraging prior knowledge to solve future tasks. We show that learning discrete representations of sensory inputs can provide a high-level abstraction that is common across multiple tasks, thus facilitating the transference of information. In particular, we show that it is possible to learn such representations by self-supervision, following an information theoretic approach. Our method is able to learn concepts in locomotive and optimal control tasks that increase the sample efficiency in both known and unknown tasks, opening a new path to endow artificial agents with generalization abilities.