AIFeb 12, 2023Code
MarioGPT: Open-Ended Text2Level Generation through Large Language ModelsShyam Sudhakaran, Miguel González-Duque, Claire Glanois et al.
Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way. However, while generating content with PCG methods is often straightforward, generating meaningful content that reflects specific intentions and constraints remains challenging. Furthermore, many PCG algorithms lack the ability to generate content in an open-ended manner. Recently, Large Language Models (LLMs) have shown to be incredibly effective in many diverse domains. These trained LLMs can be fine-tuned, re-using information and accelerating training for new tasks. Here, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels, in our case Super Mario Bros levels. MarioGPT can not only generate diverse levels, but can be text-prompted for controllable level generation, addressing one of the key challenges of current PCG techniques. As far as we know, MarioGPT is the first text-to-level model and combined with novelty search it enables the generation of diverse levels with varying play-style dynamics (i.e. player paths) and the open-ended discovery of an increasingly diverse range of content. Code available at https://github.com/shyamsn97/mario-gpt.
ROMar 14, 2022Code
Physical Neural Cellular Automata for 2D Shape ClassificationKathryn Walker, Rasmus Berg Palm, Rodrigo Moreno Garcia et al.
Materials with the ability to self-classify their own shape have the potential to advance a wide range of engineering applications and industries. Biological systems possess the ability not only to self-reconfigure but also to self-classify themselves to determine a general shape and function. Previous work into modular robotics systems has only enabled self-recognition and self-reconfiguration into a specific target shape, missing the inherent robustness present in nature to self-classify. In this paper we therefore take advantage of recent advances in deep learning and neural cellular automata, and present a simple modular 2D robotic system that can infer its own class of shape through the local communication of its components. Furthermore, we show that our system can be successfully transferred to hardware which thus opens opportunities for future self-classifying machines. Code available at https://github.com/kattwalker/projectcube. Video available at https://youtu.be/0TCOkE4keyc.
LGJan 31, 2023Code
Skill Decision TransformerShyam Sudhakaran, Sebastian Risi
Recent work has shown that Large Language Models (LLMs) can be incredibly effective for offline reinforcement learning (RL) by representing the traditional RL problem as a sequence modelling problem (Chen et al., 2021; Janner et al., 2021). However many of these methods only optimize for high returns, and may not extract much information from a diverse dataset of trajectories. Generalized Decision Transformers (GDTs) (Furuta et al., 2021) have shown that utilizing future trajectory information, in the form of information statistics, can help extract more information from offline trajectory data. Building upon this, we propose Skill Decision Transformer (Skill DT). Skill DT draws inspiration from hindsight relabelling (Andrychowicz et al., 2017) and skill discovery methods to discover a diverse set of primitive behaviors, or skills. We show that Skill DT can not only perform offline state-marginal matching (SMM), but can discovery descriptive behaviors that can be easily sampled. Furthermore, we show that through purely reward-free optimization, Skill DT is still competitive with supervised offline RL approaches on the D4RL benchmark. The code and videos can be found on our project page: https://github.com/shyamsn97/skill-dt
CVJul 7, 2023Code
Fooling Contrastive Language-Image Pre-trained Models with CLIPMasterPrintsMatthias Freiberger, Peter Kun, Christian Igel et al.
Models leveraging both visual and textual data such as Contrastive Language-Image Pre-training (CLIP), are the backbone of many recent advances in artificial intelligence. In this work, we show that despite their versatility, such models are vulnerable to what we refer to as fooling master images. Fooling master images are capable of maximizing the confidence score of a CLIP model for a significant number of widely varying prompts, while being either unrecognizable or unrelated to the attacked prompts for humans. The existence of such images is problematic as it could be used by bad actors to maliciously interfere with CLIP-trained image retrieval models in production with comparably small effort as a single image can attack many different prompts. We demonstrate how fooling master images for CLIP (CLIPMasterPrints) can be mined using stochastic gradient descent, projected gradient descent, or blackbox optimization. Contrary to many common adversarial attacks, the blackbox optimization approach allows us to mine CLIPMasterPrints even when the weights of the model are not accessible. We investigate the properties of the mined images, and find that images trained on a small number of image captions generalize to a much larger number of semantically related captions. We evaluate possible mitigation strategies, where we increase the robustness of the model and introduce an approach to automatically detect CLIPMasterPrints to sanitize the input of vulnerable models. Finally, we find that vulnerability to CLIPMasterPrints is related to a modality gap in contrastive pre-trained multi-modal networks. Code available at https://github.com/matfrei/CLIPMasterPrints.
NEApr 25, 2022
HyperNCA: Growing Developmental Networks with Neural Cellular AutomataElias Najarro, Shyam Sudhakaran, Claire Glanois et al.
In contrast to deep reinforcement learning agents, biological neural networks are grown through a self-organized developmental process. Here we propose a new hypernetwork approach to grow artificial neural networks based on neural cellular automata (NCA). Inspired by self-organising systems and information-theoretic approaches to developmental biology, we show that our HyperNCA method can grow neural networks capable of solving common reinforcement learning tasks. Finally, we explore how the same approach can be used to build developmental metamorphosis networks capable of transforming their weights to solve variations of the initial RL task.
NEJul 17, 2023
Towards Self-Assembling Artificial Neural Networks through Neural Developmental ProgramsElias Najarro, Shyam Sudhakaran, Sebastian Risi
Biological nervous systems are created in a fundamentally different way than current artificial neural networks. Despite its impressive results in a variety of different domains, deep learning often requires considerable engineering effort to design high-performing neural architectures. By contrast, biological nervous systems are grown through a dynamic self-organizing process. In this paper, we take initial steps toward neural networks that grow through a developmental process that mirrors key properties of embryonic development in biological organisms. The growth process is guided by another neural network, which we call a Neural Developmental Program (NDP) and which operates through local communication alone. We investigate the role of neural growth on different machine learning benchmarks and different optimization methods (evolutionary training, online RL, offline RL, and supervised learning). Additionally, we highlight future research directions and opportunities enabled by having self-organization driving the growth of neural networks.
AIJul 7, 2024
Collective Innovation in Groups of Large Language ModelsEleni Nisioti, Sebastian Risi, Ida Momennejad et al.
Human culture relies on collective innovation: our ability to continuously explore how existing elements in our environment can be combined to create new ones. Language is hypothesized to play a key role in human culture, driving individual cognitive capacities and shaping communication. Yet the majority of models of collective innovation assign no cognitive capacities or language abilities to agents. Here, we contribute a computational study of collective innovation where agents are Large Language Models (LLMs) that play Little Alchemy 2, a creative video game originally developed for humans that, as we argue, captures useful aspects of innovation landscapes not present in previous test-beds. We, first, study an LLM in isolation and discover that it exhibits both useful skills and crucial limitations. We, then, study groups of LLMs that share information related to their behaviour and focus on the effect of social connectivity on collective performance. In agreement with previous human and computational studies, we observe that groups with dynamic connectivity out-compete fully-connected groups. Our work reveals opportunities and challenges for future studies of collective innovation that are becoming increasingly relevant as Generative Artificial Intelligence algorithms and humans innovate alongside each other.
NEJun 14, 2022
Severe Damage Recovery in Evolving Soft Robots through Differentiable ProgrammingKazuya Horibe, Kathryn Walker, Rasmus Berg Palm et al.
Biological systems are very robust to morphological damage, but artificial systems (robots) are currently not. In this paper we present a system based on neural cellular automata, in which locomoting robots are evolved and then given the ability to regenerate their morphology from damage through gradient-based training. Our approach thus combines the benefits of evolution to discover a wide range of different robot morphologies, with the efficiency of supervised training for robustness through differentiable update rules. The resulting neural cellular automata are able to grow virtual robots capable of regaining more than 80\% of their functionality, even after severe types of morphological damage.
ROMar 22, 2022
A Unified Substrate for Body-Brain Co-evolutionSidney Pontes-Filho, Kathryn Walker, Elias Najarro et al.
The discovery of complex multicellular organism development took millions of years of evolution. The genome of such a multicellular organism guides the development of its body from a single cell, including its control system. Our goal is to imitate this natural process using a single neural cellular automaton (NCA) as a genome for modular robotic agents. In the introduced approach, called Neural Cellular Robot Substrate (NCRS), a single NCA guides the growth of a robot and the cellular activity which controls the robot during deployment. We also introduce three benchmark environments, which test the ability of the approach to grow different robot morphologies. In this paper, NCRSs are trained with covariance matrix adaptation evolution strategy (CMA-ES), and covariance matrix adaptation MAP-Elites (CMA-ME) for quality diversity, which we show leads to more diverse robot morphologies with higher fitness scores. While the NCRS can solve the easier tasks from our benchmark environments, the success rate reduces when the difficulty of the task increases. We discuss directions for future work that may facilitate the use of the NCRS approach for more complex domains.
CVJul 5, 2024Code
LayerShuffle: Enhancing Robustness in Vision Transformers by Randomizing Layer Execution OrderMatthias Freiberger, Peter Kun, Anders Sundnes Løvlie et al.
Due to their architecture and how they are trained, artificial neural networks are typically not robust toward pruning or shuffling layers at test time. However, such properties would be desirable for different applications, such as distributed neural network architectures where the order of execution cannot be guaranteed or parts of the network can fail during inference. In this work, we address these issues through a number of training approaches for vision transformers whose most important component is randomizing the execution order of attention modules at training time. With our proposed approaches, vision transformers are capable to adapt to arbitrary layer execution orders at test time assuming one tolerates a reduction (about 20\%) in accuracy at the same model size. We analyse the feature representations of our trained models as well as how each layer contributes to the models prediction based on its position during inference. Our analysis shows that layers learn to contribute differently based on their position in the network. Finally, we layer-prune our models at test time and find that their performance declines gracefully. Code available at https://github.com/matfrei/layershuffle.
NEApr 25, 2022
Goal-Guided Neural Cellular Automata: Learning to Control Self-Organising SystemsShyam Sudhakaran, Elias Najarro, Sebastian Risi
Inspired by cellular growth and self-organization, Neural Cellular Automata (NCAs) have been capable of "growing" artificial cells into images, 3D structures, and even functional machines. NCAs are flexible and robust computational systems but -- similarly to many other self-organizing systems -- inherently uncontrollable during and after their growth process. We present an approach to control these type of systems called Goal-Guided Neural Cellular Automata (GoalNCA), which leverages goal encodings to control cell behavior dynamically at every step of cellular growth. This approach enables the NCA to continually change behavior, and in some cases, generalize its behavior to unseen scenarios. We also demonstrate the robustness of the NCA with its ability to preserve task performance, even when only a portion of cells receive goal information.
LGMay 31, 2022
Mario Plays on a Manifold: Generating Functional Content in Latent Space through Differential GeometryMiguel González-Duque, Rasmus Berg Palm, Søren Hauberg et al.
Deep generative models can automatically create content of diverse types. However, there are no guarantees that such content will satisfy the criteria necessary to present it to end-users and be functional, e.g. the generated levels could be unsolvable or incoherent. In this paper we study this problem from a geometric perspective, and provide a method for reliable interpolation and random walks in the latent spaces of Categorical VAEs based on Riemannian geometry. We test our method with "Super Mario Bros" and "The Legend of Zelda" levels, and against simpler baselines inspired by current practice. Results show that the geometry we propose is better able to interpolate and sample, reliably staying closer to parts of the latent space that decode to playable content.
57.3NEApr 23Code
Adversarial Coevolutionary Illumination with Generational Adversarial MAP-ElitesTimothée Anne, Noah Syrkis, Meriem Elhosni et al.
Quality-Diversity (QD) algorithms seek to discover diverse, high-performing solutions across a behavior space, in contrast to conventional optimization methods that target a single optimum. Adversarial problems present unique challenges for QD approaches, as the competing nature of opposing sides creates interdependencies that complicate the evolution process. Existing QD methods applied to such scenarios typically fix one side, constraining the open-endedness. We present Generational Adversarial MAP-Elites (GAME), a coevolutionary QD algorithm that evolves both sides by alternating which side is evolved at each generation. By integrating a vision embedding model (VEM), our approach eliminates the need for domain-specific behavior descriptors and instead operates on video. We validate GAME across three distinct adversarial domains: a multi-agent battle game, a soft-robot wrestling environment, and a deck building game. We validate that all its components are necessary, that the VEM is effective in two different domains, and that GAME finds better solutions than one-sided QD baselines. Our experiments reveal several evolutionary phenomena, including arms race-like dynamics, enhanced novelty through generational extinction, and the preservation of neutral mutations as crucial stepping stones toward the highest performance. While GAME successfully illuminates all three adversarial problems, its capacity for truly open-ended discovery remains constrained by the nature of the search spaces used in this paper. These findings show GAME's broad applicability and highlight opportunities for future research into open-ended adversarial coevolution. Code and videos are available at: https://github.com/Timothee-ANNE/GAME
LGMay 12, 2022
Minimal Neural Network Models for Permutation Invariant AgentsJoachim Winther Pedersen, Sebastian Risi
Organisms in nature have evolved to exhibit flexibility in face of changes to the environment and/or to themselves. Artificial neural networks (ANNs) have proven useful for controlling of artificial agents acting in environments. However, most ANN models used for reinforcement learning-type tasks have a rigid structure that does not allow for varying input sizes. Further, they fail catastrophically if inputs are presented in an ordering unseen during optimization. We find that these two ANN inflexibilities can be mitigated and their solutions are simple and highly related. For permutation invariance, no optimized parameters can be tied to a specific index of the input elements. For size invariance, inputs must be projected onto a common space that does not grow with the number of projections. Based on these restrictions, we construct a conceptually simple model that exhibit flexibility most ANNs lack. We demonstrate the model's properties on multiple control problems, and show that it can cope with even very rapid permutations of input indices, as well as changes in input size. Ablation studies show that is possible to achieve these properties with simple feedforward structures, but that it is much easier to optimize recurrent structures.
59.1NEMay 15Code
Tournament Informed Adversarial Quality DiversityTimothée Anne, Noah Syrkis, Meriem Elhosni et al.
Quality diversity (QD) is a branch of evolutionary computation that seeks high-quality and behaviorally diverse solutions to a problem. While adversarial problems are common, classical QD cannot be easily applied to them, as both the fitness and the behavior depend on the opposing solutions. Recently, Generational Adversarial MAP-Elites (GAME) has been proposed to coevolve both sides of an adversarial problem by alternating the execution of a multi-task QD algorithm against previous elites, called tasks. The original algorithm selects new tasks based on a behavioral criterion, which may lead to undesired dynamics due to inter-side dependencies. In addition, comparing sets of solutions cannot be done directly using classical QD measures due to side dependencies. In this paper, we (1) use an inter-variants tournament to compare the sets of solutions, ensuring a fair comparison, with 6 measures of quality and diversity, and (2) propose two tournament-informed task selection methods to promote higher quality and diversity at each generation. We evaluate the variants across three adversarial problems: Pong, a Cat-and-mouse game, and a Pursuers-and-evaders game. We show that the tournament-informed task selection method leads to higher adversarial quality and diversity. We hope that this work will help further advance adversarial quality diversity. Code, videos, and supplementary material are available at https://github.com/Timothee-ANNE/GAME_tournament_informed.
96.5LGMay 21
Vector Policy Optimization: Training for Diversity Improves Test-Time SearchRyan Bahlous-Boldi, Isha Puri, Idan Shenfeld et al.
Language models must now generalize out of the box to novel environments and work inside inference-scaling search procedures, such as AlphaEvolve, that select rollouts with a variety of task-specific reward functions. Unfortunately, the standard paradigm of LLM post-training optimizes a pre-specified scalar reward, often leading current LLMs to produce low-entropy response distributions and thus to struggle at displaying the diversity that inference-time search will require. We propose Vector Policy Optimization (VPO), an RL algorithm that explicitly trains policies to anticipate diverse downstream reward functions and to produce diverse solutions. VPO exploits that rewards are often vector-valued in practice, like per-test-case correctness in code generation or, say, multiple different user personas or reward models. VPO is essentially a drop-in replacement for the GRPO advantage estimator, but it trains the LLM to output a set of solutions where individual solutions specialize to different trade-offs in the vector reward space. Across four tasks, VPO matches or beats the strongest scalar RL baselines on test-time search (e.g. pass@k and best@k), with the gap widening as the search budget grows. For evolutionary search, VPO models unlock problems that GRPO models cannot solve at all. As test-time search becomes more standardized, optimizing for diversity may need to become the default post-training objective.
54.4AIMay 14
Learning Developmental Scaffoldings to Guide Self-OrganisationMilton L. Montero, Elias Najarro, Jakob Schauser et al.
From subcellular structures to entire organisms, many natural systems generate complex organisation through self-organisation: local interactions that collectively give rise to global structure without any blueprint of the outcome. Yet a significant portion of the information driving such processes is not produced by self-organisation itself, instead, it is often offloaded to initial conditions of the system. Biological development is a prime example, where maternal pre-patterns encode positional and symmetry-breaking information that scaffolds the self-organising process. From maternal morphogen gradients in early embryogenesis to tissue-level morphogenetic pre-patterns guiding organ formation, this transfer of information to initial conditions, analogous to a memory-compute trade-off in computational systems, is a fundamental part of developmental processes. In this work, we study this offloading phenomenon by introducing a model that jointly learns both the self-organisation rules and the pre-patterns, allowing their interplay to be varied and measured under controlled conditions: a Neural Cellular Automaton (NCA) paired with a learned coordinate-based pattern generator (SIREN), both trained simultaneously to generate a set of patterns. We provide information-theoretic analyses of how information is distributed between pre-patterns and the self-organising process, and show that jointly learning both components yields improvements in robustness, encoding capacity, and symmetry breaking over purely self-organising alternatives. Our analysis further suggests that effective pre-patterns do not simply approximate their targets; rather, they bias the developmental dynamics in ways that facilitate convergence, pointing to a non-trivial relationship between the structure of initial conditions and the dynamics of self-organisation.
95.0AIApr 1Code
In Search of the Ingredients of Open-Endedness: Replicating Picbreeder with Large Vision-Language ModelsSam Earle, Kay Arulkumaran, Andrew Dai et al.
We are in the midst of large-scale industrial and academic efforts to automate the processes of scientific, technological and creative production through AI-driven assistants. Historically, a fundamental property of these processes in their human form has been their open-endedness: their capacity for generating a seemingly endless supply of novel and meaningful new forms. Do artificial agents have any capacity for such fruitful unguided discovery? To answer this question, we turn to Picbreeder, the canonical exemplar of human-driven open-ended search, in which users collaboratively generated a diverse library of images through interactive evolution of small neural networks. We replicate Picbreeder, replacing human users with frontier Vision Language Models (VLMs). We observe clear qualitative differences between the output of our system and the historical human baseline, and attempt to characterize them using metrics of phylogenetic complexity and visual and semantic salience and novelty. In an effort to identify some of the causal factors contributing these differences, we study the addition of exploratory noise to the agents' selection process, of behavioral diversity between agents, and of narrative momentum in the form of memory of past actions. We make our code available at https://github.com/smearle/picbreeder-vlm.
NEDec 18, 2025
Hypernetworks That Evolve ThemselvesJoachim Winther Pedersen, Erwan Plantec, Eleni Nisioti et al.
How can neural networks evolve themselves without relying on external optimizers? We propose Self-Referential Graph HyperNetworks, systems where the very machinery of variation and inheritance is embedded within the network. By uniting hypernetworks, stochastic parameter generation, and graph-based representations, Self-Referential GHNs mutate and evaluate themselves while adapting mutation rates as selectable traits. Through new reinforcement learning benchmarks with environmental shifts (CartPoleSwitch, LunarLander-Switch), Self-Referential GHNs show swift, reliable adaptation and emergent population dynamics. In the locomotion benchmark Ant-v5, they evolve coherent gaits, showing promising fine-tuning capabilities by autonomously decreasing variation in the population to concentrate around promising solutions. Our findings support the idea that evolvability itself can emerge from neural self-reference. Self-Referential GHNs reflect a step toward synthetic systems that more closely mirror biological evolution, offering tools for autonomous, open-ended learning agents.
LGMar 15, 2021Code
Growing 3D Artefacts and Functional Machines with Neural Cellular AutomataShyam Sudhakaran, Djordje Grbic, Siyan Li et al.
Neural Cellular Automata (NCAs) have been proven effective in simulating morphogenetic processes, the continuous construction of complex structures from very few starting cells. Recent developments in NCAs lie in the 2D domain, namely reconstructing target images from a single pixel or infinitely growing 2D textures. In this work, we propose an extension of NCAs to 3D, utilizing 3D convolutions in the proposed neural network architecture. Minecraft is selected as the environment for our automaton since it allows the generation of both static structures and moving machines. We show that despite their simplicity, NCAs are capable of growing complex entities such as castles, apartment blocks, and trees, some of which are composed of over 3,000 blocks. Additionally, when trained for regeneration, the system is able to regrow parts of simple functional machines, significantly expanding the capabilities of simulated morphogenetic systems. The code for the experiment in this paper can be found at: https://github.com/real-itu/3d-artefacts-nca.
NEFeb 4, 2021Code
Regenerating Soft Robots through Neural Cellular AutomataKazuya Horibe, Kathryn Walker, Sebastian Risi
Morphological regeneration is an important feature that highlights the environmental adaptive capacity of biological systems. Lack of this regenerative capacity significantly limits the resilience of machines and the environments they can operate in. To aid in addressing this gap, we develop an approach for simulated soft robots to regrow parts of their morphology when being damaged. Although numerical simulations using soft robots have played an important role in their design, evolving soft robots with regenerative capabilities have so far received comparable little attention. Here we propose a model for soft robots that regenerate through a neural cellular automata. Importantly, this approach only relies on local cell information to regrow damaged components, opening interesting possibilities for physical regenerable soft robots in the future. Our approach allows simulated soft robots that are damaged to partially regenerate their original morphology through local cell interactions alone and regain some of their ability to locomote. These results take a step towards equipping artificial systems with regenerative capacities and could potentially allow for more robust operations in a variety of situations and environments. The code for the experiments in this paper is available at: \url{github.com/KazuyaHoribe/RegeneratingSoftRobots}.
AIDec 8, 2020Code
EvoCraft: A New Challenge for Open-EndednessDjordje Grbic, Rasmus Berg Palm, Elias Najarro et al.
This paper introduces EvoCraft, a framework for Minecraft designed to study open-ended algorithms. We introduce an API that provides an open-source Python interface for communicating with Minecraft to place and track blocks. In contrast to previous work in Minecraft that focused on learning to play the game, the grand challenge we pose here is to automatically search for increasingly complex artifacts in an open-ended fashion. Compared to other environments used to study open-endedness, Minecraft allows the construction of almost any kind of structure, including actuated machines with circuits and mechanical components. We present initial baseline results in evolving simple Minecraft creations through both interactive and automated evolution. While evolution succeeds when tasked to grow a structure towards a specific target, it is unable to find a solution when rewarded for creating a simple machine that moves. Thus, EvoCraft offers a challenging new environment for automated search methods (such as evolution) to find complex artifacts that we hope will spur the development of more open-ended algorithms. A Python implementation of the EvoCraft framework is available at: https://github.com/real-itu/Evocraft-py.
NEJul 6, 2020Code
Meta-Learning through Hebbian Plasticity in Random NetworksElias Najarro, Sebastian Risi
Lifelong learning and adaptability are two defining aspects of biological agents. Modern reinforcement learning (RL) approaches have shown significant progress in solving complex tasks, however once training is concluded, the found solutions are typically static and incapable of adapting to new information or perturbations. While it is still not completely understood how biological brains learn and adapt so efficiently from experience, it is believed that synaptic plasticity plays a prominent role in this process. Inspired by this biological mechanism, we propose a search method that, instead of optimizing the weight parameters of neural networks directly, only searches for synapse-specific Hebbian learning rules that allow the network to continuously self-organize its weights during the lifetime of the agent. We demonstrate our approach on several reinforcement learning tasks with different sensory modalities and more than 450K trainable plasticity parameters. We find that starting from completely random weights, the discovered Hebbian rules enable an agent to navigate a dynamical 2D-pixel environment; likewise they allow a simulated 3D quadrupedal robot to learn how to walk while adapting to morphological damage not seen during training and in the absence of any explicit reward or error signal in less than 100 timesteps. Code is available at https://github.com/enajx/HebbianMetaLearning.
AIJul 12, 2017Code
Learning Macromanagement in StarCraft from Replays using Deep LearningNiels Justesen, Sebastian Risi
The real-time strategy game StarCraft has proven to be a challenging environment for artificial intelligence techniques, and as a result, current state-of-the-art solutions consist of numerous hand-crafted modules. In this paper, we show how macromanagement decisions in StarCraft can be learned directly from game replays using deep learning. Neural networks are trained on 789,571 state-action pairs extracted from 2,005 replays of highly skilled players, achieving top-1 and top-3 error rates of 54.6% and 22.9% in predicting the next build action. By integrating the trained network into UAlbertaBot, an open source StarCraft bot, the system can significantly outperform the game's built-in Terran bot, and play competitively against UAlbertaBot with a fixed rush strategy. To our knowledge, this is the first time macromanagement tasks are learned directly from replays in StarCraft. While the best hand-crafted strategies are still the state-of-the-art, the deep network approach is able to express a wide range of different strategies and thus improving the network's performance further with deep reinforcement learning is an immediately promising avenue for future research. Ultimately this approach could lead to strong StarCraft bots that are less reliant on hard-coded strategies.
AIFeb 18
Simple Baselines are Competitive with Code EvolutionYonatan Gideoni, Sebastian Risi, Yarin Gal
Code evolution is a family of techniques that rely on large language models to search through possible computer programs by evolving or mutating existing code. Many proposed code evolution pipelines show impressive performance but are often not compared to simpler baselines. We test how well two simple baselines do over three domains: finding better mathematical bounds, designing agentic scaffolds, and machine learning competitions. We find that simple baselines match or exceed much more sophisticated methods in all three. By analyzing these results we find various shortcomings in how code evolution is both developed and used. For the mathematical bounds, a problem's search space and domain knowledge in the prompt are chiefly what dictate a search's performance ceiling and efficiency, with the code evolution pipeline being secondary. Thus, the primary challenge in finding improved bounds is designing good search spaces, which is done by domain experts, and not the search itself. When designing agentic scaffolds we find that high variance in the scaffolds coupled with small datasets leads to suboptimal scaffolds being selected, resulting in hand-designed majority vote scaffolds performing best. We propose better evaluation methods that reduce evaluation stochasticity while keeping the code evolution economically feasible. We finish with a discussion of avenues and best practices to enable more rigorous code evolution in future work.
HCMar 28, 2024
Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art ExplorationLouie Søs Meyer, Johanne Engel Aaen, Anitamalina Regitse Tranberg et al.
This Research through Design paper explores how object detection may be applied to a large digital art museum collection to facilitate new ways of encountering and experiencing art. We present the design and evaluation of an interactive application called SMKExplore, which allows users to explore a museum's digital collection of paintings by browsing through objects detected in the images, as a novel form of open-ended exploration. We provide three contributions. First, we show how an object detection pipeline can be integrated into a design process for visual exploration. Second, we present the design and development of an app that enables exploration of an art museum's collection. Third, we offer reflections on future possibilities for museums and HCI researchers to incorporate object detection techniques into the digitalization of museums.
27.9LGApr 30
When Does Structure Matter in Continual Learning? Dimensionality Controls When Modularity Shapes Representational GeometryKathrin Korte, Joachim Winter Pedersen, Eleni Nisioti et al.
To preserve previously learned representations, continual learning systems must strike a balance between plasticity, the ability to acquire new knowledge, and stability. This stability-plasticity dilemma affects how representations can be reused across tasks: shared structure enables transfer when tasks are similar but may also induce interference when new learning disrupts existing representations. However, it remains unclear when and why structural separation influences this trade-off. In this study, we examine how network architecture, task similarity, and representational dimensionality jointly shape learning in a sequential task paradigm inspired by transfer-interference studies. We compare a task-partitioned modular recurrent network with a single-module baseline by systematically varying task similarity (low, medium, high) and the scale of weight initialization, which induces different learning regimes that we empirically characterize through the effective dimensionality of the learned representations. We find that architecture has minimal impact in high-dimensional regimes where representations are sufficiently unconstrained to accommodate multiple tasks without strong interference. In contrast, in lower-dimensional (rich) regimes, architectural separation is decisive: modular networks exhibit graded alignment of task-specific subspaces with overlap for similar tasks, partial orthogonalization for moderately dissimilar tasks, and stronger separation for dissimilar tasks. This graded geometry is absent in the single network baseline. Our findings suggest that representational dimensionality acts as a key organizing variable governing when structural separation becomes functionally relevant, and highlight adaptive geometry as a central principle for designing continual learning systems.
LGMay 8, 2025
Continuous Thought MachinesLuke Darlow, Ciaran Regan, Sebastian Risi et al.
Biological brains demonstrate complex neural activity, where neural dynamics are critical to how brains process information. Most artificial neural networks ignore the complexity of individual neurons. We challenge that paradigm. By incorporating neuron-level processing and synchronization, we reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. The CTM has two innovations: (1) neuron-level temporal processing, where each neuron uses unique weight parameters to process incoming histories; and (2) neural synchronization as a latent representation. The CTM aims to strike a balance between neuron abstractions and biological realism. It operates at a level of abstraction that effectively captures essential temporal dynamics while remaining computationally tractable. We demonstrate the CTM's performance and versatility across a range of tasks, including solving 2D mazes, ImageNet-1K classification, parity computation, and more. Beyond displaying rich internal representations and offering a natural avenue for interpretation owing to its internal process, the CTM is able to perform tasks that require complex sequential reasoning. The CTM can also leverage adaptive compute, where it can stop earlier for simpler tasks, or keep computing when faced with more challenging instances. The goal of this work is to share the CTM and its associated innovations, rather than pushing for new state-of-the-art results. To that end, we believe the CTM represents a significant step toward developing more biologically plausible and powerful artificial intelligence systems. We provide an accompanying interactive online demonstration at https://pub.sakana.ai/ctm/ and an extended technical report at https://pub.sakana.ai/ctm/paper .
NEApr 6, 2024
Structurally Flexible Neural Networks: Evolving the Building Blocks for General AgentsJoachim Winther Pedersen, Erwan Plantec, Eleni Nisioti et al.
Artificial neural networks used for reinforcement learning are structurally rigid, meaning that each optimized parameter of the network is tied to its specific placement in the network structure. It also means that a network only works with pre-defined and fixed input- and output sizes. This is a consequence of having the number of optimized parameters being directly dependent on the structure of the network. Structural rigidity limits the ability to optimize parameters of policies across multiple environments that do not share input and output spaces. Here, we evolve a set of neurons and plastic synapses each represented by a gated recurrent unit (GRU). During optimization, the parameters of these fundamental units of a neural network are optimized in different random structural configurations. Earlier work has shown that parameter sharing between units is important for making structurally flexible neurons We show that it is possible to optimize a set of distinct neuron- and synapse types allowing for a mitigation of the symmetry dilemma. We demonstrate this by optimizing a single set of neurons and synapses to solve multiple reinforcement learning control tasks simultaneously.
AIDec 16, 2024
Harnessing Language for Coordination: A Framework and Benchmark for LLM-Driven Multi-Agent ControlTimothée Anne, Noah Syrkis, Meriem Elhosni et al.
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. Their potential to facilitate human coordination with many agents is a promising but largely under-explored area. Such capabilities would be helpful in disaster response, urban planning, and real-time strategy scenarios. In this work, we introduce (1) a real-time strategy game benchmark designed to evaluate these abilities and (2) a novel framework we term HIVE. HIVE empowers a single human to coordinate swarms of up to 2,000 agents through a natural language dialog with an LLM. We present promising results on this multi-agent benchmark, with our hybrid approach solving tasks such as coordinating agent movements, exploiting unit weaknesses, leveraging human annotations, and understanding terrain and strategic points. Our findings also highlight critical limitations of current models, including difficulties in processing spatial visual information and challenges in formulating long-term strategic plans. This work sheds light on the potential and limitations of LLMs in human-swarm coordination, paving the way for future research in this area. The HIVE project page, hive.syrkis.com, includes videos of the system in action.
NEMay 14, 2024
Growing Artificial Neural Networks for Control: the Role of Neuronal DiversityEleni Nisioti, Erwan Plantec, Milton Montero et al.
In biological evolution complex neural structures grow from a handful of cellular ingredients. As genomes in nature are bounded in size, this complexity is achieved by a growth process where cells communicate locally to decide whether to differentiate, proliferate and connect with other cells. This self-organisation is hypothesized to play an important part in the generalisation, and robustness of biological neural networks. Artificial neural networks (ANNs), on the other hand, are traditionally optimized in the space of weights. Thus, the benefits and challenges of growing artificial neural networks remain understudied. Building on the previously introduced Neural Developmental Programs (NDP), in this work we present an algorithm for growing ANNs that solve reinforcement learning tasks. We identify a key challenge: ensuring phenotypic complexity requires maintaining neuronal diversity, but this diversity comes at the cost of optimization stability. To address this, we introduce two mechanisms: (a) equipping neurons with an intrinsic state inherited upon neurogenesis; (b) lateral inhibition, a mechanism inspired by biological growth, which controlls the pace of growth, helping diversity persist. We show that both mechanisms contribute to neuronal diversity and that, equipped with them, NDPs achieve comparable results to existing direct and developmental encodings in complex locomotion tasks
ROMar 16, 2025
Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired RobotsBinggwong Leung, Worasuchad Haomachai, Joachim Winther Pedersen et al.
Artificial neural networks can be used to solve a variety of robotic tasks. However, they risk failing catastrophically when faced with out-of-distribution (OOD) situations. Several approaches have employed a type of synaptic plasticity known as Hebbian learning that can dynamically adjust weights based on local neural activities. Research has shown that synaptic plasticity can make policies more robust and help them adapt to unforeseen changes in the environment. However, networks augmented with Hebbian learning can lead to weight divergence, resulting in network instability. Furthermore, such Hebbian networks have not yet been applied to solve legged locomotion in complex real robots with many degrees of freedom. In this work, we improve the Hebbian network with a weight normalization mechanism for preventing weight divergence, analyze the principal components of the Hebbian's weights, and perform a thorough evaluation of network performance in locomotion control for real 18-DOF dung beetle-like and 16-DOF gecko-like robots. We find that the Hebbian-based plastic network can execute zero-shot sim-to-real adaptation locomotion and generalize to unseen conditions, such as uneven terrain and morphological damage.
HCDec 30, 2024
Human-like Bots for Tactical Shooters Using Compute-Efficient SensorsNiels Justesen, Maria Kaselimi, Sam Snodgrass et al.
Artificial intelligence (AI) has enabled agents to master complex video games, from first-person shooters like Counter-Strike to real-time strategy games such as StarCraft II and racing games like Gran Turismo. While these achievements are notable, applying these AI methods in commercial video game production remains challenging due to computational constraints. In commercial scenarios, the majority of computational resources are allocated to 3D rendering, leaving limited capacity for AI methods, which often demand high computational power, particularly those relying on pixel-based sensors. Moreover, the gaming industry prioritizes creating human-like behavior in AI agents to enhance player experience, unlike academic models that focus on maximizing game performance. This paper introduces a novel methodology for training neural networks via imitation learning to play a complex, commercial-standard, VALORANT-like 2v2 tactical shooter game, requiring only modest CPU hardware during inference. Our approach leverages an innovative, pixel-free perception architecture using a small set of ray-cast sensors, which capture essential spatial information efficiently. These sensors allow AI to perform competently without the computational overhead of traditional methods. Models are trained to mimic human behavior using supervised learning on human trajectory data, resulting in realistic and engaging AI agents. Human evaluation tests confirm that our AI agents provide human-like gameplay experiences while operating efficiently under computational constraints. This offers a significant advancement in AI model development for tactical shooter games and possibly other genres.
AOJun 13, 2025
Solving Inverse Problems in Stochastic Self-Organising Systems through Invariant RepresentationsElias Najarro, Nicolas Bessone, Sebastian Risi
Self-organising systems demonstrate how simple local rules can generate complex stochastic patterns. Many natural systems rely on such dynamics, making self-organisation central to understanding natural complexity. A fundamental challenge in modelling such systems is solving the inverse problem: finding the unknown causal parameters from macroscopic observations. This task becomes particularly difficult when observations have a strong stochastic component, yielding diverse yet equivalent patterns. Traditional inverse methods fail in this setting, as pixel-wise metrics cannot capture feature similarities between variable outcomes. In this work, we introduce a novel inverse modelling method specifically designed to handle stochasticity in the observable space, leveraging the capacity of visual embeddings to produce robust representations that capture perceptual invariances. By mapping the pattern representations onto an invariant embedding space, we can effectively recover unknown causal parameters without the need for handcrafted objective functions or heuristics. We evaluate the method on two canonical models--a reaction-diffusion system and an agent-based model of social segregation--and show that it reliably recovers parameters despite stochasticity in the outcomes. We further apply the method to real biological patterns, highlighting its potential as a tool for both theorists and experimentalists to investigate the dynamics underlying complex stochastic pattern formation.
LGMay 28, 2025
When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks?Eleni Nisioti, Joachim Winther Pedersen, Erwan Plantec et al.
The ability to continuously and efficiently transfer skills across tasks is a hallmark of biological intelligence and a long-standing goal in artificial systems. Reinforcement learning (RL), a dominant paradigm for learning in high-dimensional control tasks, is known to suffer from brittleness to task variations and catastrophic forgetting. Neuroevolution (NE) has recently gained attention for its robustness, scalability, and capacity to escape local optima. In this paper, we investigate an understudied dimension of NE: its transfer learning capabilities. To this end, we introduce two benchmarks: a) in stepping gates, neural networks are tasked with emulating logic circuits, with designs that emphasize modular repetition and variation b) ecorobot extends the Brax physics engine with objects such as walls and obstacles and the ability to easily switch between different robotic morphologies. Crucial in both benchmarks is the presence of a curriculum that enables evaluating skill transfer across tasks of increasing complexity. Our empirical analysis shows that NE methods vary in their transfer abilities and frequently outperform RL baselines. Our findings support the potential of NE as a foundation for building more adaptable agents and highlight future challenges for scaling NE to complex, real-world problems.
NEJun 14, 2024
From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language ModelsEleni Nisioti, Claire Glanois, Elias Najarro et al.
Large Language Models (LLMs) have taken the field of AI by storm, but their adoption in the field of Artificial Life (ALife) has been, so far, relatively reserved. In this work we investigate the potential synergies between LLMs and ALife, drawing on a large body of research in the two fields. We explore the potential of LLMs as tools for ALife research, for example, as operators for evolutionary computation or the generation of open-ended environments. Reciprocally, principles of ALife, such as self-organization, collective intelligence and evolvability can provide an opportunity for shaping the development and functionalities of LLMs, leading to more adaptive and responsive models. By investigating this dynamic interplay, the paper aims to inspire innovative crossover approaches for both ALife and LLM research. Along the way, we examine the extent to which LLMs appear to increasingly exhibit properties such as emergence or collective intelligence, expanding beyond their original goal of generating text, and potentially redefining our perception of lifelike intelligence in artificial systems.
NEJun 14, 2024
Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent LearningErwan Plantec, Joachin W. Pedersen, Milton L. Montero et al.
Biological neural networks are characterized by their high degree of plasticity, a core property that enables the remarkable adaptability of natural organisms. Importantly, this ability affects both the synaptic strength and the topology of the nervous systems. Artificial neural networks, on the other hand, have been mainly designed as static, fully connected structures that can be notoriously brittle in the face of changing environments and novel inputs. Building on previous works on Neural Developmental Programs (NDPs), we propose a class of self-organizing neural networks capable of synaptic and structural plasticity in an activity and reward-dependent manner which we call Lifelong Neural Developmental Program (LNDP). We present an instance of such a network built on the graph transformer architecture and propose a mechanism for pre-experience plasticity based on the spontaneous activity of sensory neurons. Our results demonstrate the ability of the model to learn from experiences in different control tasks starting from randomly connected or empty networks. We further show that structural plasticity is advantageous in environments necessitating fast adaptation or with non-stationary rewards.
NEMay 25, 2023
Learning to Act through Evolution of Neural Diversity in Random Neural NetworksJoachim Winther Pedersen, Sebastian Risi
Biological nervous systems consist of networks of diverse, sophisticated information processors in the form of neurons of different classes. In most artificial neural networks (ANNs), neural computation is abstracted to an activation function that is usually shared between all neurons within a layer or even the whole network; training of ANNs focuses on synaptic optimization. In this paper, we propose the optimization of neuro-centric parameters to attain a set of diverse neurons that can perform complex computations. Demonstrating the promise of the approach, we show that evolving neural parameters alone allows agents to solve various reinforcement learning tasks without optimizing any synaptic weights. While not aiming to be an accurate biological model, parameterizing neurons to a larger degree than the current common practice, allows us to ask questions about the computational abilities afforded by neural diversity in random neural networks. The presented results open up interesting future research directions, such as combining evolved neural diversity with activity-dependent plasticity.
NEJan 28, 2022
Variational Neural Cellular AutomataRasmus Berg Palm, Miguel González-Duque, Shyam Sudhakaran et al.
In nature, the process of cellular growth and differentiation has lead to an amazing diversity of organisms -- algae, starfish, giant sequoia, tardigrades, and orcas are all created by the same generative process. Inspired by the incredible diversity of this biological generative process, we propose a generative model, the Variational Neural Cellular Automata (VNCA), which is loosely inspired by the biological processes of cellular growth and differentiation. Unlike previous related works, the VNCA is a proper probabilistic generative model, and we evaluate it according to best practices. We find that the VNCA learns to reconstruct samples well and that despite its relatively few parameters and simple local-only communication, the VNCA can learn to generate a large variety of output from information encoded in a common vector format. While there is a significant gap to the current state-of-the-art in terms of generative modeling performance, we show that the VNCA can learn a purely self-organizing generative process of data. Additionally, we show that the VNCA can learn a distribution of stable attractors that can recover from significant damage.
LGJul 14, 2021
Safer Reinforcement Learning through Transferable Instinct NetworksDjordje Grbic, Sebastian Risi
Random exploration is one of the main mechanisms through which reinforcement learning (RL) finds well-performing policies. However, it can lead to undesirable or catastrophic outcomes when learning online in safety-critical environments. In fact, safe learning is one of the major obstacles towards real-world agents that can learn during deployment. One way of ensuring that agents respect hard limitations is to explicitly configure boundaries in which they can operate. While this might work in some cases, we do not always have clear a-priori information which states and actions can lead dangerously close to hazardous states. Here, we present an approach where an additional policy can override the main policy and offer a safer alternative action. In our instinct-regulated RL (IR^2L) approach, an "instinctual" network is trained to recognize undesirable situations, while guarding the learning policy against entering them. The instinct network is pre-trained on a single task where it is safe to make mistakes, and transferred to environments in which learning a new task safely is critical. We demonstrate IR^2L in the OpenAI Safety gym domain, in which it receives a significantly lower number of safety violations during training than a baseline RL approach while reaching similar task performance.
LGJul 5, 2021
Dealing with Adversarial Player Strategies in the Neural Network Game iNNk through Ensemble LearningMathias Löwe, Jennifer Villareale, Evan Freed et al.
Applying neural network (NN) methods in games can lead to various new and exciting game dynamics not previously possible. However, they also lead to new challenges such as the lack of large, clean datasets, varying player skill levels, and changing gameplay strategies. In this paper, we focus on the adversarial player strategy aspect in the game iNNk, in which players try to communicate secret code words through drawings with the goal of not being deciphered by a NN. Some strategies exploit weaknesses in the NN that consistently trick it into making incorrect classifications, leading to unbalanced gameplay. We present a method that combines transfer learning and ensemble methods to obtain a data-efficient adaptation to these strategies. This combination significantly outperforms the baseline NN across all adversarial player strategies despite only being trained on a limited set of adversarial examples. We expect the methods developed in this paper to be useful for the rapidly growing field of NN-based games, which will require new approaches to deal with unforeseen player creativity.
NEMay 27, 2021
Hybrid Encoding For Generating Large Scale Game Level Patterns With Local VariationsJacob Schrum, Benjamin Capps, Kirby Steckel et al.
Generative Adversarial Networks (GANs) are a powerful indirect genotype-to-phenotype mapping for evolutionary search. Much previous work applying GANs to level generation focuses on fixed-size segments combined into a whole level, but individual segments may not fit together cohesively. In contrast, segments in human designed levels are often repeated, directly or with variation, and organized into patterns (the symmetric eagle in Level 1 of The Legend of Zelda, or repeated pipe motifs in Super Mario Bros). Such patterns can be produced with Compositional Pattern Producing Networks (CPPNs). CPPNs define latent vector GAN inputs as a function of geometry, organizing segments output by a GAN into complete levels. However, collections of latent vectors can also be evolved directly, producing more chaotic levels. We propose a hybrid approach that evolves CPPNs first, but allows latent vectors to evolve later, combining the benefits of both approaches. These approaches are evaluated in Super Mario Bros. and The Legend of Zelda. We previously demonstrated via a Quality-Diversity algorithm that CPPNs better cover the space of possible levels than directly evolved levels. Here, we show that the hybrid approach (1) covers areas that neither of the other methods can, and (2) achieves comparable or superior QD scores.
AIMay 18, 2021
Fast Game Content Adaptation Through Bayesian-based Player ModellingMiguel González-Duque, Rasmus Berg Palm, Sebastian Risi
In games, as well as many user-facing systems, adapting content to users' preferences and experience is an important challenge. This paper explores a novel method to realize this goal in the context of dynamic difficulty adjustment (DDA). Here the aim is to constantly adapt the content of a game to the skill level of the player, keeping them engaged by avoiding states that are either too difficult or too easy. Current systems for DDA rely on expensive data mining, or on hand-crafted rules designed for particular domains, and usually adapts to keep players in the flow, leaving no room for the designer to present content that is purposefully easy or difficult. This paper presents Fast Bayesian Content Adaption (FBCA), a system for DDA that is agnostic to the domain and that can target particular difficulties. We deploy this framework in two different domains: the puzzle game Sudoku, and a simple Roguelike game. By modifying the acquisition function's optimization, we are reliably able to present a content with a bespoke difficulty for players with different skill levels in less than five iterations for Sudoku and fifteen iterations for the simple Roguelike. Our method significantly outperforms simpler DDA heuristics with the added benefit of maintaining a model of the user. These results point towards a promising alternative for content adaption in a variety of different domains.
NEApr 16, 2021
Evolving and Merging Hebbian Learning Rules: Increasing Generalization by Decreasing the Number of RulesJoachim Winther Pedersen, Sebastian Risi
Generalization to out-of-distribution (OOD) circumstances after training remains a challenge for artificial agents. To improve the robustness displayed by plastic Hebbian neural networks, we evolve a set of Hebbian learning rules, where multiple connections are assigned to a single rule. Inspired by the biological phenomenon of the genomic bottleneck, we show that by allowing multiple connections in the network to share the same local learning rule, it is possible to drastically reduce the number of trainable parameters, while obtaining a more robust agent. During evolution, by iteratively using simple K-Means clustering to combine rules, our Evolve and Merge approach is able to reduce the number of trainable parameters from 61,440 to 1,920, while at the same time improving robustness, all without increasing the number of generations used. While optimization of the agents is done on a standard quadruped robot morphology, we evaluate the agents' performances on slight morphology modifications in a total of 30 unseen morphologies. Our results add to the discussion on generalization, overfitting and OOD adaptation. To create agents that can adapt to a wider array of unexpected situations, Hebbian learning combined with a regularising "genomic bottleneck" could be a promising research direction.
LGMar 29, 2021
Rapid Risk Minimization with Bayesian Models Through Deep Learning ApproximationMathias Löwe, Per Lunnemann Hansen, Sebastian Risi
We introduce a novel combination of Bayesian Models (BMs) and Neural Networks (NNs) for making predictions with a minimum expected risk. Our approach combines the best of both worlds, the data efficiency and interpretability of a BM with the speed of a NN. For a BM, making predictions with the lowest expected loss requires integrating over the posterior distribution. When exact inference of the posterior predictive distribution is intractable, approximation methods are typically applied, e.g. Monte Carlo (MC) simulation. For MC, the variance of the estimator decreases with the number of samples - but at the expense of increased computational cost. Our approach removes the need for iterative MC simulation on the CPU at prediction time. In brief, it works by fitting a NN to synthetic data generated using the BM. In a single feed-forward pass, the NN gives a set of point-wise approximations to the BM's posterior predictive distribution for a given observation. We achieve risk minimized predictions significantly faster than standard methods with a negligible loss on the test dataset. We combine this approach with Active Learning to minimize the amount of data required for fitting the NN. This is done by iteratively labeling more data in regions with high predictive uncertainty of the NN.
CVFeb 12, 2021
Improving Object Detection in Art Images Using Only Style TransferDavid Kadish, Sebastian Risi, Anders Sundnes Løvlie
Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross depiction problem and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generate a large dataset for training and validation by modifying the images in the COCO dataset using AdaIn style transfer. This dataset is used to fine-tune a Faster R-CNN object detection network, which is then tested on the existing People-Art testing dataset. The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.
HCJan 15, 2021
Player-AI Interaction: What Neural Network Games Reveal About AI as PlayJichen Zhu, Jennifer Villareale, Nithesh Javvaji et al.
The advent of artificial intelligence (AI) and machine learning (ML) bring human-AI interaction to the forefront of HCI research. This paper argues that games are an ideal domain for studying and experimenting with how humans interact with AI. Through a systematic survey of neural network games (n = 38), we identified the dominant interaction metaphors and AI interaction patterns in these games. In addition, we applied existing human-AI interaction guidelines to further shed light on player-AI interaction in the context of AI-infused systems. Our core finding is that AI as play can expand current notions of human-AI interaction, which are predominantly productivity-based. In particular, our work suggests that game and UX designers should consider flow to structure the learning curve of human-AI interaction, incorporate discovery-based learning to play around with the AI and observe the consequences, and offer users an invitation to play to explore new forms of human-AI interaction.
LGNov 23, 2020
Evolutionary Planning in Latent SpaceThor V. A. N. Olesen, Dennis T. T. Nguyen, Rasmus Berg Palm et al.
Planning is a powerful approach to reinforcement learning with several desirable properties. However, it requires a model of the world, which is not readily available in many real-life problems. In this paper, we propose to learn a world model that enables Evolutionary Planning in Latent Space (EPLS). We use a Variational Auto Encoder (VAE) to learn a compressed latent representation of individual observations and extend a Mixture Density Recurrent Neural Network (MDRNN) to learn a stochastic, multi-modal forward model of the world that can be used for planning. We use the Random Mutation Hill Climbing (RMHC) to find a sequence of actions that maximize expected reward in this learned model of the world. We demonstrate how to build a model of the world by bootstrapping it with rollouts from a random policy and iteratively refining it with rollouts from an increasingly accurate planning policy using the learned world model. After a few iterations of this refinement, our planning agents are better than standard model-free reinforcement learning approaches demonstrating the viability of our approach.
NENov 13, 2020
Testing the Genomic Bottleneck Hypothesis in Hebbian Meta-LearningRasmus Berg Palm, Elias Najarro, Sebastian Risi
Hebbian meta-learning has recently shown promise to solve hard reinforcement learning problems, allowing agents to adapt to some degree to changes in the environment. However, because each synapse in these approaches can learn a very specific learning rule, the ability to generalize to very different situations is likely reduced. We hypothesize that limiting the number of Hebbian learning rules through a "genomic bottleneck" can act as a regularizer leading to better generalization across changes to the environment. We test this hypothesis by decoupling the number of Hebbian learning rules from the number of synapses and systematically varying the number of Hebbian learning rules. The results in this paper suggest that simultaneously learning the Hebbian learning rules and their assignment to synapses is a difficult optimization problem, leading to poor performance in the environments tested. However, parallel research to ours finds that it is indeed possible to reduce the number of learning rules by clustering similar rules together. How to best implement a "genomic bottleneck" algorithm is thus an important research direction that warrants further investigation.
HCOct 15, 2020
The power of pictures: using ML assisted image generation to engage the crowd in complex socioscientific problemsJanet Rafner, Lotte Philipsen, Sebastian Risi et al.
Human-computer image generation using Generative Adversarial Networks (GANs) is becoming a well-established methodology for casual entertainment and open artistic exploration. Here, we take the interaction a step further by weaving in carefully structured design elements to transform the activity of ML-assisted imaged generation into a catalyst for large-scale popular dialogue on complex socioscientific problems such as the United Nations Sustainable Development Goals (SDGs) and as a gateway for public participation in research.