Ahmed Khalifa

AI
h-index: 69
50 papers
1,954 citations
Novelty: 34%
AI Score: 47

50 Papers

AI · Aug 26, 2022
Generative Personas That Behave and Experience Like Humans

Matthew Barthet, Ahmed Khalifa, Antonios Liapis et al.

Using artificial intelligence (AI) to automatically test a game remains a critical challenge for the development of richer and more complex game worlds and for the advancement of AI at large. One of the most promising methods for achieving that long-standing goal is the use of generative AI agents, namely procedural personas, that attempt to imitate particular playing behaviors which are represented as rules, rewards, or human demonstrations. All research efforts for building those generative agents, however, have focused solely on playing behavior which is arguably a narrow perspective of what a player actually does in a game. Motivated by this gap in the existing state of the art, in this paper we extend the notion of behavioral procedural personas to cater for player experience, thus examining generative agents that can both behave and experience their game as humans would. For that purpose, we employ the Go-Explore reinforcement learning paradigm for training human-like procedural personas, and we test our method on behavior and experience demonstrations of more than 100 players of a racing game. Our findings suggest that the generated agents exhibit distinctive play styles and experience responses of the human personas they were designed to imitate. Importantly, it also appears that experience, which is tied to playing behavior, can be a highly informative driver for better behavioral exploration.

LG · Aug 2, 2023
Lode Encoder: AI-constrained co-creativity

Debosmita Bhaumik, Ahmed Khalifa, Julian Togelius

We present Lode Encoder, a gamified mixed-initiative level creation system for the classic platform-puzzle game Lode Runner. The system is built around several autoencoders which are trained on sets of Lode Runner levels. When fed with the user's design, each autoencoder produces a version of that design which is closer in style to the levels that it was trained on. The Lode Encoder interface allows the user to build and edit levels through 'painting' from the suggestions provided by the autoencoders. Crucially, in order to encourage designers to explore new possibilities, the system does not include more traditional editing tools. We report on the system design and training procedure, as well as on the evolution of the system itself and user tests.

AI · Jun 11, 2022
Mutation Models: Learning to Generate Levels by Imitating Evolution

Ahmed Khalifa, Michael Cerny Green, Julian Togelius

Search-based procedural content generation (PCG) is a well-known method for level generation in games. Its key advantage is that it is generic and able to satisfy functional constraints. However, due to the heavy computational costs to run these algorithms online, search-based PCG is rarely utilized for real-time generation. In this paper, we introduce mutation models, a new type of iterative level generator based on machine learning. We train a model to imitate the evolutionary process and use the trained model to generate levels. This trained model is able to modify noisy levels sequentially to create better levels without the need for a fitness function during inference. We evaluate our trained models on a 2D maze generation task. We compare several different versions of the method: training the models either at the end of evolution (normal evolution) or every 100 generations (assisted evolution) and using the model as a mutation function during evolution. Using the assisted evolution process, the final trained models are able to generate mazes with a success rate of 99% and high diversity of 86%. The trained model is many times faster than the evolutionary process it was trained on. This work opens the door to a new way of learning level generators guided by an evolutionary process, meaning automatic creation of generators with specifiable constraints and objectives that are fast enough for runtime deployment in games.

IR · Aug 16, 2023
A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

M Charity, Yash Bhartia, Daniel Zhang et al.

This paper introduces a system used to generate game feature suggestions based on a text prompt. Trained on the game descriptions of almost 60k games, it uses the word embeddings of a small GloVe model to extract features and entities found in thematically similar games which are then passed through a generator model to generate new features for a user's prompt. We perform a short user study comparing the features generated from a fine-tuned GPT-2 model, a model using ConceptNet, and human-authored game features. Although human suggestions won the overall majority of votes, the GPT-2 model outperformed the human suggestions in certain games. This system is part of a larger game design assistant tool that is able to collaborate with users at a conceptual level.

LG · Aug 26, 2022
Play with Emotion: Affect-Driven Reinforcement Learning

Matthew Barthet, Ahmed Khalifa, Antonios Liapis et al.

This paper introduces a paradigm shift by viewing the task of affect modeling as a reinforcement learning (RL) process. According to the proposed paradigm, RL agents learn a policy (i.e. affective interaction) by attempting to maximize a set of rewards (i.e. behavioral and affective patterns) via their experience with their environment (i.e. context). Our hypothesis is that RL is an effective paradigm for interweaving affect elicitation and manifestation with behavioral and affective demonstrations. Importantly, our second hypothesis, building on Damasio's somatic marker hypothesis, is that emotion can be the facilitator of decision-making. We test our hypotheses in a racing game by training Go-Blend agents to model human demonstrations of arousal and behavior; Go-Blend is a modified version of the Go-Explore algorithm which has recently showcased supreme performance in hard exploration tasks. We first vary the arousal-based reward function and observe agents that can effectively display a palette of affect and behavioral patterns according to the specified reward. Then we use arousal-based state selection mechanisms in order to bias the strategies that Go-Blend explores. Our findings suggest that Go-Blend not only is an efficient affect modeling paradigm but, more importantly, affect-driven RL improves exploration and yields higher performing agents, validating Damasio's hypothesis in the domain of games.

AI · Mar 24, 2022
Predicting Personas Using Mechanic Frequencies and Game State Traces

Michael Cerny Green, Ahmed Khalifa, M Charity et al.

We investigate how to efficiently predict play personas based on playtraces. Play personas can be computed by calculating the action agreement ratio between a player and a generative model of playing behavior, a so-called procedural persona. But this is computationally expensive and assumes that appropriate procedural personas are readily available. We present two methods for estimating player persona, one using regular supervised learning and aggregate measures of game mechanics initiated, and another based on sequence learning on a trace of closely cropped gameplay observations. While both of these methods achieve high accuracy when predicting play personas defined by agreement with procedural personas, they utterly fail to predict play style as defined by the players themselves using a questionnaire. This interesting result highlights the value of using computational methods in defining play personas.
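
The action agreement ratio mentioned in this abstract can be illustrated with a minimal sketch. It assumes the ratio is the fraction of recorded steps in which the persona, given the same state, chooses the action the player took. The function name, the trace format, and the toy persona are hypothetical and not taken from the paper.

```python
from typing import Callable, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


def action_agreement_ratio(
    trace: Sequence[Tuple[State, Action]],
    persona_policy: Callable[[State], Action],
) -> float:
    """Fraction of recorded steps where the persona picks the player's action."""
    if not trace:
        return 0.0
    matches = sum(1 for state, action in trace if persona_policy(state) == action)
    return matches / len(trace)


def always_right(state: State) -> Action:
    # Hypothetical toy persona: ignores the state and always moves right.
    return "right"


# Hypothetical playtrace of (state, action) pairs.
playtrace = [(0, "right"), (1, "right"), (2, "jump"), (3, "right")]
ratio = action_agreement_ratio(playtrace, always_right)
```

Computing this ratio requires querying the persona at every recorded state, which is one reason the abstract describes the approach as computationally expensive.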

LG · Aug 3, 2023
Lode Enhancer: Level Co-creation Through Scaling

Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis et al.

We explore AI-powered upscaling as a design assistance tool in the context of creating 2D game levels. Deep neural networks are used to upscale artificially downscaled patches of levels from the puzzle platformer game Lode Runner. The trained networks are incorporated into a web-based editor, where the user can create and edit levels at three different levels of resolution: 4x4, 8x8, and 16x16. An edit at any resolution instantly transfers to the other resolutions. As upscaling requires inventing features that might not be present at lower resolutions, we train neural networks to reproduce these features. We introduce a neural network architecture that is capable of not only learning upscaling but also giving higher priority to less frequent tiles. To investigate the potential of this tool and guide further development, we conduct a qualitative study with 3 designers to understand how they use it. Designers enjoyed co-designing with the tool, liked its underlying concept, and provided feedback for further improvement.

AI · Apr 11, 2022
Persona-driven Dominant/Submissive Map (PDSM) Generation for Tutorials

Michael Cerny Green, Ahmed Khalifa, M Charity et al.

In this paper, we present a method for automated persona-driven video game tutorial level generation. Tutorial levels are scenarios in which the player can explore and discover different rules and game mechanics. Procedural personas can guide generators to create content which encourages or discourages certain playstyle behaviors. In this system, we use procedural personas to calculate the behavioral characteristics of levels which are evolved using the quality-diversity algorithm known as Constrained MAP-Elites. An evolved map's quality is determined by its simplicity: the simpler it is, the better it is. Within this work, we show that the generated maps can strongly encourage or discourage different persona-like behaviors and range from simple solutions to complex puzzle-levels, making them perfect candidates for a tutorial generative system.

AI · Jul 25, 2024
Affectively Framework: Towards Human-like Affect-Based Agents

Matthew Barthet, Roberto Gallotta, Ahmed Khalifa et al.

Game environments offer a unique opportunity for training virtual agents due to their interactive nature, which provides diverse play traces and affect labels. Despite their potential, no reinforcement learning framework incorporates human affect models as part of their observation space or reward mechanism. To address this, we present the Affectively Framework, a set of OpenAI Gym environments that integrate affect as part of the observation space. This paper introduces the framework and its three game environments and provides baseline experiments to validate its effectiveness and potential.

HC · Jul 23, 2024
Closing the Affective Loop via Experience-Driven Reinforcement Learning Designers

Matthew Barthet, Diogo Branco, Roberto Gallotta et al.

Autonomously tailoring content to a set of predetermined affective patterns has long been considered the holy grail of affect-aware human-computer interaction at large. The experience-driven procedural content generation framework realises this vision by searching for content that elicits a certain experience pattern to a user. In this paper, we propose a novel reinforcement learning (RL) framework for generating affect-tailored content, and we test it in the domain of racing games. Specifically, the experience-driven RL (EDRL) framework is given a target arousal trace, and it then generates a racetrack that elicits the desired affective responses for a particular type of player. EDRL leverages a reward function that assesses the affective pattern of any generated racetrack from a corpus of arousal traces. Our findings suggest that EDRL can accurately generate affect-driven racing game levels according to a designer's style and outperforms search-based methods for personalised content generation. The method is not only directly applicable to game content generation tasks but also employable broadly to any domain that uses content for affective adaptation.

NE · Nov 20, 2023
Evolutionary Machine Learning and Games

Julian Togelius, Ahmed Khalifa, Sam Earle et al.

Evolutionary machine learning (EML) has been applied to games in multiple ways, and for multiple different purposes. Importantly, AI research in games is not only about playing games; it is also about generating game content, modeling players, and many other applications. Many of these applications pose interesting problems for EML. We will structure this chapter on EML for games based on whether evolution is used to augment machine learning (ML) or ML is used to augment evolution. For completeness, we also briefly discuss the usage of ML and evolution separately in games.

AI · May 13
Learning Local Constraints for Reinforcement-Learned Content Generators

Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis et al.

Constraint-based game content generators that learn local constraints from existing content, such as Wave Function Collapse (WFC), can generate visually satisfying game levels but face challenges in guaranteeing global properties, such as playability. On the other hand, reinforcement-learning trained generators can guarantee global properties -- because such properties can easily be included in reward functions -- but the results can be visually dissatisfying. In this paper, we explore ways to combine these methods. Specifically, we constrain the action space of a PCGRL generator with constraints learned by WFC, effectively allowing the PCGRL generator to achieve global properties while forced to adhere to local constraints. To better analyze how this hybrid content generation method operates, we vary the number and type of inputs, and we test whether to randomly collapse the starting state and exclude rare patterns. While the method is sensitive to hyperparameter tuning, the best of our trained generators produce visually satisfying and playable puzzle-platform game levels -- such as Lode Runner levels -- with desired global properties.
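
The idea of restricting a generator's actions with learned local constraints can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes the constraints are pairs of horizontally and vertically adjacent tiles observed in example levels, whereas Wave Function Collapse typically learns larger patterns. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Level = List[List[str]]
Pairs = Set[Tuple[str, str]]


def learn_adjacency(examples: List[Level]) -> Tuple[Pairs, Pairs]:
    """Collect every (left, right) and (top, bottom) tile pair seen in the examples."""
    horiz: Pairs = set()
    vert: Pairs = set()
    for level in examples:
        for r, row in enumerate(level):
            for c, tile in enumerate(row):
                if c + 1 < len(row):
                    horiz.add((tile, row[c + 1]))
                if r + 1 < len(level):
                    vert.add((tile, level[r + 1][c]))
    return horiz, vert


def allowed_tiles(level: Level, r: int, c: int, tiles: List[str],
                  horiz: Pairs, vert: Pairs) -> List[str]:
    """Tiles whose pairs with the existing neighbours of (r, c) were all seen in training."""
    h, w = len(level), len(level[0])
    ok = []
    for t in tiles:
        if c > 0 and (level[r][c - 1], t) not in horiz:
            continue
        if c + 1 < w and (t, level[r][c + 1]) not in horiz:
            continue
        if r > 0 and (level[r - 1][c], t) not in vert:
            continue
        if r + 1 < h and (t, level[r + 1][c]) not in vert:
            continue
        ok.append(t)
    return ok


# Toy example: one training level with a row of air above a row of ground.
examples = [[[".", ".", "."], ["#", "#", "#"]]]
horiz, vert = learn_adjacency(examples)
level = [[".", ".", "."], ["#", "#", "#"]]
top = allowed_tiles(level, 0, 1, [".", "#"], horiz, vert)
bottom = allowed_tiles(level, 1, 1, [".", "#"], horiz, vert)
```

A reinforcement-learned generator would then choose only among the tiles returned by `allowed_tiles`, so the reward can focus on global properties such as playability.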

AI · Feb 28, 2018 · Code
General Video Game AI: a Multi-Track Framework for Evaluating Agents, Games and Content Generation Algorithms

Diego Perez-Liebana, Jialin Liu, Ahmed Khalifa et al.

General Video Game Playing (GVGP) aims at designing an agent that is capable of playing multiple video games with no human intervention. In 2014, the General Video Game AI (GVGAI) competition framework was created and released with the purpose of providing researchers with a common, open-source, and easy-to-use platform for testing their AI methods on a potentially infinite number of games created using the Video Game Description Language (VGDL). The framework has been expanded into several tracks during the last few years to meet the demand of different research directions. The agents are required either to play multiple unknown games with or without access to game simulations, or to design new game levels or rules. This survey paper presents the VGDL, the GVGAI framework, and its existing tracks, and reviews the wide use of the GVGAI framework in research, education, and competitions five years after its birth. A future plan of framework improvements is also described.

AI · Aug 22, 2025
PuzzleJAX: A Benchmark for Reasoning and Learning

Sam Earle, Graham Todd, Yuchen Li et al.

We introduce PuzzleJAX, a GPU-accelerated puzzle game engine and description language designed to support rapid benchmarking of tree search, reinforcement learning, and LLM reasoning abilities. Unlike existing GPU-accelerated learning environments that provide hard-coded implementations of fixed sets of games, PuzzleJAX allows dynamic compilation of any game expressible in its domain-specific language (DSL). This DSL follows PuzzleScript, which is a popular and accessible online game engine for designing puzzle games. In this paper, we validate in PuzzleJAX several hundred of the thousands of games designed in PuzzleScript by both professional designers and casual creators since its release in 2013, thereby demonstrating PuzzleJAX's coverage of an expansive, expressive, and human-relevant space of tasks. By analyzing the performance of search, learning, and language models on these games, we show that PuzzleJAX can naturally express tasks that are both simple and intuitive to understand, yet often deeply challenging to master, requiring a combination of control, planning, and high-level insight.

AI · Mar 27, 2025
The Procedural Content Generation Benchmark: An Open-source Testbed for Generative Challenges in Games

Ahmed Khalifa, Roberto Gallotta, Matthew Barthet et al.

This paper introduces the Procedural Content Generation Benchmark for evaluating generative algorithms on different game content creation tasks. The benchmark comes with 12 game-related problems with multiple variants on each problem. Problems vary from creating levels of different kinds to creating rule sets for simple arcade games. Each problem has its own content representation, control parameters, and evaluation metrics for quality, diversity, and controllability. This benchmark is intended as a first step towards a standardized way of comparing generative algorithms. We use the benchmark to score three baseline algorithms: a random generator, an evolution strategy, and a genetic algorithm. Results show that some problems are easier to solve than others, as well as the impact the chosen objective has on quality, diversity, and controllability of the generated artifacts.

AI · Jun 24, 2025
Evolutionary Level Repair

Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis et al.

We address the problem of game level repair, which consists of taking a designed but non-functional game level and making it functional. This might consist of ensuring the completeness of the level, reachability of objects, or other performance characteristics. The repair problem may also be constrained in that it can only make a small number of changes to the level. We investigate search-based solutions to the level repair problem, particularly using evolutionary and quality-diversity algorithms, with good results. This level repair method is applied to levels generated using a machine learning-based procedural content generation (PCGML) method that generates stylistically appropriate but frequently broken levels. This combination of PCGML for generation and search-based methods for repair shows great promise as a hybrid procedural content generation (PCG) method.

AI · May 29, 2023
Controllable Path of Destruction

Matthew Siper, Sam Earle, Zehua Jiang et al.

Path of Destruction (PoD) is a self-supervised method for learning iterative generators. The core idea is to produce a training set by destroying a set of artifacts, and for each destructive step create a training instance based on the corresponding repair action. A generator trained on this dataset can then generate new artifacts by repairing from arbitrary states. The PoD method is very data-efficient in terms of original training examples and well-suited to functional artifacts composed of categorical data, such as game levels and discrete 3D structures. In this paper, we extend the Path of Destruction method to allow designer control over aspects of the generated artifacts. Controllability is introduced by adding conditional inputs to the state-action pairs that make up the repair trajectories. We test the controllable PoD method in a 2D dungeon setting, as well as in the domain of small 3D Lego cars.

LG · Feb 21, 2022
Path of Destruction: Learning an Iterative Level Generator Using a Small Dataset

Matthew Siper, Ahmed Khalifa, Julian Togelius

We propose a new procedural content generation method which learns iterative level generators from a dataset of existing levels. The Path of Destruction method, as we call it, views level generation as repair; levels are created by iteratively repairing from a random starting level. The first step is to generate an artificial dataset from the original set of levels by introducing many different sequences of mutations to existing levels. In the generated dataset, features are observations of destroyed levels and targets are the specific actions that repair the mutated tile in the middle of the observations. Using this dataset, a convolutional network is trained to map from observations to their respective appropriate repair actions. The trained network is then used to iteratively produce levels from random starting maps. We demonstrate this method by applying it to generate unique and playable tile-based levels for several 2D games (Zelda, Danger Dave, and Sokoban) and vary key hyperparameters.
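
The dataset-generation step described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions: levels are grids of integer tile ids, each destructive step overwrites one random cell with a different tile, and the observation is a square window centred on that cell. The function names, the padding value, and the window radius are hypothetical.

```python
import random
from typing import List, Tuple

Level = List[List[int]]
Sample = Tuple[Level, int]  # (observation of the destroyed level, repair tile)


def crop(level: Level, row: int, col: int, radius: int, pad: int = -1) -> Level:
    """Square window centred on (row, col); cells outside the level become `pad`."""
    h, w = len(level), len(level[0])
    return [
        [level[r][c] if 0 <= r < h and 0 <= c < w else pad
         for c in range(col - radius, col + radius + 1)]
        for r in range(row - radius, row + radius + 1)
    ]


def destroy(level: Level, num_tiles: int, steps: int, radius: int,
            rng: random.Random) -> List[Sample]:
    """Mutate a copy of `level` step by step, recording (observation, repair) pairs."""
    work = [row[:] for row in level]
    h, w = len(work), len(work[0])
    samples: List[Sample] = []
    for _ in range(steps):
        r, c = rng.randrange(h), rng.randrange(w)
        repair = work[r][c]  # tile that undoes this destructive step
        work[r][c] = rng.choice([t for t in range(num_tiles) if t != repair])
        samples.append((crop(work, r, c, radius), repair))
    return samples


level = [[0, 1, 0], [1, 1, 0], [0, 0, 1]]
data = destroy(level, num_tiles=3, steps=5, radius=1, rng=random.Random(0))
```

Each pair maps a damaged observation to the tile that repairs its centre, which is the supervised signal a network would be trained on before being applied iteratively to random starting maps.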

LG · May 6, 2021
Learning Controllable Content Generators

Sam Earle, Maria Edwards, Ahmed Khalifa et al.

It has recently been shown that reinforcement learning can be used to train generators capable of producing high-quality game levels, with quality defined in terms of some user-specified heuristic. To ensure that these generators' output is sufficiently diverse (that is, not amounting to the reproduction of a single optimal level configuration), the generation process is constrained such that the initial seed results in some variance in the generator's output. However, this results in a loss of control over the generated content for the human user. We propose to train generators capable of producing controllably diverse output, by making them "goal-aware." To this end, we add conditional inputs representing how close a generator is to some heuristic, and also modify the reward mechanism to incorporate that value. Testing on multiple domains, we show that the resulting level generators are capable of exploring the space of possible levels in a targeted, controllable manner, producing levels of comparable quality to their goal-unaware counterparts that are diverse along designer-specified dimensions.

AI · Feb 20, 2021
Game Mechanic Alignment Theory and Discovery

Michael Cerny Green, Ahmed Khalifa, Philip Bontrager et al.

We present a new concept called Game Mechanic Alignment theory as a way to organize game mechanics through the lens of systemic rewards and agential motivations. By disentangling player and systemic influences, mechanics may be better identified for use in an automated tutorial generation system, which could tailor tutorials for a particular playstyle or player. Within, we apply this theory to several well-known games to demonstrate how designers can benefit from it, we describe a methodology for how to estimate "mechanic alignment", and we apply this methodology on multiple games in the GVGAI framework. We discuss how effectively this estimation captures agential motivations and systemic rewards and how our theory could be used as an alternative way to find mechanics for tutorial generation.

AI · Oct 9, 2020
Deep Learning for Procedural Content Generation

Jialin Liu, Sam Snodgrass, Ahmed Khalifa et al.

Procedural content generation in video games has a long history. Existing procedural content generation methods, such as search-based, solver-based, rule-based and grammar-based methods have been applied to various content types such as levels, maps, character models, and textures. A research field centered on content generation in games has existed for more than a decade. More recently, deep learning has powered a remarkable range of inventions in content production, which are applicable to games. While some cutting-edge deep learning methods are applied on their own, others are applied in combination with more traditional methods, or in an interactive setting. This article surveys the various deep learning methods that have been applied to generate game content directly or indirectly, discusses deep learning methods that could be used for content generation purposes but are rarely used today, and envisages some limitations and potential future directions of deep learning for procedural content generation.

AI · Aug 6, 2020
Mixed-Initiative Level Design with RL Brush

Omar Delarosa, Hang Dong, Mindy Ruan et al.

This paper introduces RL Brush, a level-editing tool for tile-based games designed for mixed-initiative co-creation. The tool uses reinforcement-learning-based models to augment manual human level-design through the addition of AI-generated suggestions. Here, we apply RL Brush to designing levels for the classic puzzle game Sokoban. We put the tool online and tested it in 39 different sessions. The results show that users using the AI suggestions stay around longer and their created levels on average are more playable and more complex than without.

AI · Jul 11, 2020
Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network

Matthew C. Fontaine, Ruilin Liu, Ahmed Khalifa et al.

Generative adversarial networks (GANs) are quickly becoming a ubiquitous approach to procedurally generating video game levels. While GAN-generated levels are stylistically similar to human-authored examples, human designers often want to explore the generative design space of GANs to extract interesting levels. However, human designers find latent vectors opaque and would rather explore along dimensions the designer specifies, such as number of enemies or obstacles. We propose using state-of-the-art quality diversity algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a directional variation operator and Covariance Matrix Adaptation MAP-Elites, to efficiently explore the latent space of a GAN to extract levels that vary across a set of specified gameplay measures. In the benchmark domain of Super Mario Bros, we demonstrate how designers may specify gameplay measures to our system and extract high-quality (playable) levels with a diverse range of level mechanics, while still maintaining stylistic similarity to human-authored examples. An online user study shows how the different mechanics of the automatically generated levels affect subjective ratings of their perceived difficulty and appearance.

NE · May 17, 2020
Multi-Objective level generator generation with Marahel

Ahmed Khalifa, Julian Togelius

This paper introduces a new system to design constructive level generators by searching the space of constructive level generators defined by the Marahel language. We use NSGA-II, a multi-objective optimization algorithm, to search for generators for three different problems (Binary, Zelda, and Sokoban). We restrict the representation to a subset of the Marahel language to push the evolution to find more efficient generators. The results show that the generated generators were able to achieve good performance on most of the fitness functions over these three problems. However, on Zelda and Sokoban, they tend to depend on the initial state rather than modifying the map.

HC · Mar 31, 2020
Baba is Y'all: Collaborative Mixed-Initiative Level Design

Megan Charity, Ahmed Khalifa, Julian Togelius

We present a collaborative mixed-initiative system for building levels for the puzzle game "Baba is You". Unlike previous mixed-initiative systems, Baba is Y'all is designed for collaborative asynchronous creation by multiple users over the internet. The system includes several AI-assisted features to help designers, including a level evolver and an automated player for playtesting. The level archive catalogues levels according to which mechanics are implemented and not implemented, allowing the system to ask users to design levels with specific combinations of mechanics. We describe the operation of the system and the results of a small-scale informal user test, and discuss future development paths for this system as well as for collaborative mixed-initiative systems in general.

AI · Feb 11, 2020
Mech-Elites: Illuminating the Mechanic Space of GVGAI

M Charity, Michael Cerny Green, Ahmed Khalifa et al.

This paper introduces a fully automatic method of mechanic illumination for general video game level generation. Using the Constrained MAP-Elites algorithm and the GVG-AI framework, this system generates the simplest tile-based levels that contain specific sets of game mechanics and also satisfy playability constraints. We apply this method to illuminate mechanic space for 4 different games in GVG-AI: Zelda, Solarfox, Plants, and RealPortals.
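
The archive behind this kind of mechanic illumination can be sketched in a few lines. This is a simplified assumption-laden illustration: each cell is keyed by the set of mechanics a level triggers, only playable levels are stored, and the simplest level wins the cell. It keeps a single feasible elite per cell and omits any handling of infeasible candidates, so it is not the paper's Constrained MAP-Elites implementation. The names and scores are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

Elite = Tuple[float, str]  # (simplicity score, level id)
Archive = Dict[FrozenSet[str], Elite]


def try_insert(archive: Archive, mechanics: FrozenSet[str], playable: bool,
               simplicity: float, level_id: str) -> bool:
    """Store the level in the cell for its mechanic set if it is playable
    and simpler than the elite currently held there."""
    if not playable:
        return False
    current = archive.get(mechanics)
    if current is None or simplicity > current[0]:
        archive[mechanics] = (simplicity, level_id)
        return True
    return False


archive: Archive = {}
cell = frozenset({"collect_key", "kill_enemy"})
first = try_insert(archive, cell, True, 0.4, "level_a")   # fills the empty cell
second = try_insert(archive, cell, True, 0.9, "level_b")  # simpler level replaces it
third = try_insert(archive, cell, False, 1.0, "level_c")  # unplayable, rejected
```

After many mutate-and-insert iterations, each occupied cell holds the simplest known playable level for one combination of mechanics.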

AI · Feb 7, 2020
Mario Level Generation From Mechanics Using Scene Stitching

Michael Cerny Green, Luvneesh Mugrai, Ahmed Khalifa et al.

This paper presents a level generation method for Super Mario by stitching together pre-generated "scenes" that contain specific mechanics, using mechanic-sequences from agent playthroughs as input specifications. Given a sequence of mechanics, our system uses an FI-2Pop algorithm and a corpus of scenes to perform automated level authoring. The system outputs levels that have a similar mechanical sequence to the target mechanic sequence but with a different playthrough experience. We compare our system to a greedy method that selects scenes that maximize the target mechanics. Our system is able to maximize the number of matched mechanics while reducing emergent mechanics using the stitching process compared to the greedy approach.

LG · Jan 27, 2020
Rotation, Translation, and Cropping for Zero-Shot Generalization

Chang Ye, Ahmed Khalifa, Philip Bontrager et al.

Deep Reinforcement Learning (DRL) has shown impressive performance on domains with visual inputs, in particular various games. However, the agent is usually trained on a fixed environment, e.g. a fixed number of levels. A growing mass of evidence suggests that these trained models fail to generalize to even slight variations of the environments they were trained on. This paper advances the hypothesis that the lack of generalization is partly due to the input representation, and explores how rotation, cropping and translation could increase generality. We show that a cropped, translated and rotated observation can get better generalization on unseen levels of two-dimensional arcade games from the GVGAI framework. The generality of the agents is evaluated on both human-designed and procedurally generated levels.

LG · Jan 24, 2020
PCGRL: Procedural Content Generation via Reinforcement Learning

Ahmed Khalifa, Philip Bontrager, Sam Earle et al.

We investigate how reinforcement learning can be used to train level-designing agents. This represents a new approach to procedural content generation in games, where level design is framed as a game, and the content generator itself is learned. By seeing the design problem as a sequential task, we can use reinforcement learning to learn how to take the next action so that the expected final level quality is maximized. This approach can be used when few or no examples exist to train from, and the trained generator is very fast. We investigate three different ways of transforming two-dimensional level design problems into Markov decision processes and apply these to three game environments.
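
Framing level design as a sequential decision problem can be sketched with a toy environment. The sketch assumes one particular formulation: the environment visits cells in a fixed order, the agent's action is the tile to place, and the reward is the change in a quality heuristic. The class, the heuristic, and the reward shaping are hypothetical and do not reproduce the paper's environments.

```python
from typing import Callable, List, Tuple

Level = List[List[int]]


class SequentialLevelEnv:
    """Toy level-design MDP: the cursor sweeps the grid, the agent picks each tile."""

    def __init__(self, width: int, height: int, quality: Callable[[Level], float]):
        self.width, self.height, self.quality = width, height, quality
        self.reset()

    def reset(self) -> Tuple[Level, int]:
        self.level = [[0] * self.width for _ in range(self.height)]
        self.cursor = 0
        return self.level, self.cursor

    def step(self, tile: int) -> Tuple[Tuple[Level, int], float, bool]:
        before = self.quality(self.level)
        r, c = divmod(self.cursor, self.width)
        self.level[r][c] = tile
        self.cursor += 1
        reward = self.quality(self.level) - before  # reward is the quality change
        done = self.cursor >= self.width * self.height
        return (self.level, self.cursor), reward, done


def wall_target(level: Level) -> float:
    # Hypothetical heuristic: best quality when the level has exactly two wall tiles.
    return -abs(sum(map(sum, level)) - 2)


env = SequentialLevelEnv(2, 2, wall_target)
rewards = []
done = False
for tile in [1, 1, 1, 0]:
    _, reward, done = env.step(tile)
    rewards.append(reward)
```

Because the rewards are quality differences, their sum over an episode equals the final level quality minus the starting quality, so maximizing return maximizes the quality of the finished level.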

NE · Oct 3, 2019
Bootstrapping Conditional GANs for Video Game Level Generation

Ruben Rodriguez Torrado, Ahmed Khalifa, Michael Cerny Green et al.

Generative Adversarial Networks (GANs) have shown impressive results for image generation. However, GANs face challenges in generating contents with certain types of constraints, such as game levels. Specifically, it is difficult to generate levels that have aesthetic appeal and are playable at the same time. Additionally, because training data usually is limited, it is challenging to generate unique levels with current GANs. In this paper, we propose a new GAN architecture named Conditional Embedding Self-Attention Generative Adversarial Network (CESAGAN) and a new bootstrapping training procedure. The CESAGAN is a modification of the self-attention GAN that incorporates an embedding feature vector input to condition the training of the discriminator and generator. This allows the network to model non-local dependency between game objects, and to count objects. Additionally, to reduce the number of levels necessary to train the GAN, we propose a bootstrapping mechanism in which playable generated levels are added to the training set. The results demonstrate that the new approach does not only generate a larger number of levels that are playable but also generates fewer duplicate levels compared to a standard GAN.
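
The bootstrapping mechanism described in this abstract can be sketched as a loop that feeds playable samples back into the training set. The sketch is hedged: the GAN is replaced by a hypothetical `train_and_sample` callable, the playability check is a toy stand-in, and skipping levels that are already in the set is an assumption rather than something stated in the abstract.

```python
from typing import Callable, List, Tuple

Level = Tuple[int, ...]


def bootstrap(
    training_set: List[Level],
    train_and_sample: Callable[[List[Level]], List[Level]],
    is_playable: Callable[[Level], bool],
    rounds: int,
) -> List[Level]:
    """Grow the training set with generated levels that pass the playability check."""
    data = list(training_set)
    seen = set(data)
    for _ in range(rounds):
        for level in train_and_sample(data):
            # Assumption: duplicates of known levels are not added again.
            if is_playable(level) and level not in seen:
                seen.add(level)
                data.append(level)
    return data


def toy_generator(data: List[Level]) -> List[Level]:
    # Stand-in for "train the GAN, then sample": flips the last tile of each level.
    return [lvl[:-1] + (1 - lvl[-1],) for lvl in data]


def toy_playable(level: Level) -> bool:
    # Stand-in playability check: a level needs at least one tile with value 1.
    return any(level)


grown = bootstrap([(0, 1), (1, 1)], toy_generator, toy_playable, rounds=2)
```

In a real setting, `train_and_sample` would retrain or fine-tune the generator on the enlarged set each round, and `is_playable` would run an agent or solver on the level.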

AISep 6, 2019
Automatic Critical Mechanic Discovery Using Playtraces in Video Games

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

We present a new method of automatic critical mechanic discovery for video games using a combination of game description parsing and playtrace information. This method is applied to several games within the General Video Game Artificial Intelligence (GVG-AI) framework. In a user study, human-identified mechanics are compared against system-identified critical mechanics to verify alignment between humans and the system. The results of the study demonstrate that the new method is able to match humans with higher consistency than the baseline. Our system is further validated by comparing MCTS agents augmented with critical mechanics and vanilla MCTS agents on 4 games from GVG-AI. Our new playtrace method shows a significant performance improvement over the baseline for all 4 tested games. The proposed method also shows either matched or improved performance over the old method, demonstrating that playtrace information is responsible for more complete critical mechanic discovery.

LGAug 12, 2019
Superstition in the Network: Deep Reinforcement Learning Plays Deceptive Games

Philip Bontrager, Ahmed Khalifa, Damien Anderson et al.

Deep reinforcement learning has learned to play many games well, but failed on others. To better characterize the modes and reasons of failure of deep reinforcement learners, we test the widely used Asynchronous Actor-Critic (A2C) algorithm on four deceptive games, which are specially designed to provide challenges to game-playing agents. These games are implemented in the General Video Game AI framework, which allows us to compare the behavior of reinforcement learning-based agents with planning agents based on tree search. We find that several of these games reliably deceive deep reinforcement learners, and that the resulting behavior highlights the shortcomings of the learning algorithm. The particular ways in which agents fail differ from how planning-based agents fail, further illuminating the character of these algorithms. We propose an initial typology of deceptions which could help us better understand pitfalls and failure modes of (deep) reinforcement learning.

NEJul 9, 2019
Procedural Content Generation through Quality Diversity

Daniele Gravina, Ahmed Khalifa, Antonios Liapis et al.

Quality-diversity (QD) algorithms search for a set of good solutions which cover a space as defined by behavior metrics. This simultaneous focus on quality and diversity with explicit metrics sets QD algorithms apart from standard single- and multi-objective evolutionary algorithms, as well as from diversity preservation approaches such as niching. These properties open up new avenues for artificial intelligence in games, in particular for procedural content generation. Creating multiple systematically varying solutions allows new approaches to creative human-AI interaction as well as adaptivity. In the last few years, a handful of applications of QD to procedural content generation and game playing have been proposed; we discuss these and propose challenges for future work.
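To make the idea of "a set of good solutions which cover a space as defined by behavior metrics" concrete, here is a minimal sketch of MAP-Elites, one well-known QD algorithm. The toy problem is an assumption chosen purely for illustration: solutions are bit strings, the fitness is the number of adjacent bits that differ, and the single behavior metric is the fraction of ones. None of these choices come from the paper.

```python
import random


def evaluate(bits):
    """Toy problem (hypothetical): returns (fitness, behavior in [0, 1])."""
    fitness = sum(a != b for a, b in zip(bits, bits[1:]))
    behavior = sum(bits) / len(bits)
    return fitness, behavior


def random_solution(rng, length=10):
    return [rng.randint(0, 1) for _ in range(length)]


def mutate(bits, rng):
    child = list(bits)
    child[rng.randrange(len(child))] ^= 1  # flip one random bit
    return child


def map_elites(evaluate, random_solution, mutate, bins,
               iterations=2000, warmup=100, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)
    for i in range(iterations):
        if archive and i >= warmup:
            # Pick a random elite and vary it.
            _, parent = rng.choice(list(archive.values()))
            candidate = mutate(parent, rng)
        else:
            candidate = random_solution(rng)
        fitness, behavior = evaluate(candidate)
        cell = min(int(behavior * bins), bins - 1)
        # Keep the candidate only if its cell is empty or it beats the elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, candidate)
    return archive
```

The returned archive holds at most one elite per behavior cell, so the result is a collection of solutions that vary systematically along the behavior metric rather than a single optimum.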

AIJun 12, 2019
General Video Game Rule Generation

Ahmed Khalifa, Michael Cerny Green, Diego Perez-Liebana et al.

We introduce the General Video Game Rule Generation problem, and the eponymous software framework which will be used in a new track of the General Video Game AI (GVGAI) competition. The problem is, given a game level as input, to generate the rules of a game that fits that level. This can be seen as the inverse of the General Video Game Level Generation problem. Conceptualizing these two problems as separate helps break the very hard problem of generating complete games into smaller, more manageable subproblems. The proposed framework builds on the GVGAI software and thus asks the rule generator for rules defined in the Video Game Description Language. We describe the API, and three different rule generators: a random, a constructive and a search-based generator. Early results indicate that the constructive generator generates playable and somewhat interesting game rules but has a limited expressive range, whereas the search-based generator generates remarkably diverse rulesets, but with uneven quality.

AIJun 11, 2019
Two-step Constructive Approaches for Dungeon Generation

Michael Cerny Green, Ahmed Khalifa, Athoug Alsoughayer et al.

This paper presents a two-step generative approach for creating dungeons in the rogue-like puzzle game MiniDungeons 2. Generation is split into two steps, initially producing the architectural layout of the level as its walls and floor tiles, and then furnishing it with game objects representing the player's start and goal position, challenges and rewards. Three layout creators and three furnishers are introduced in this paper, which can be combined in different ways in the two-step generative process for producing diverse dungeon levels. Layout creators generate the floors and walls of a level, while furnishers populate it with monsters, traps, and treasures. We test the generated levels on several expressivity measures, and in simulations with procedural persona agents.

NEMay 15, 2019
ELIMINATION from Design to Analysis

Ahmed Khalifa, Dan Gopstein, Julian Togelius

Elimination is a word puzzle game for browsers and mobile devices, where all levels are generated by a constrained evolutionary algorithm with no human intervention. This paper describes the design of the game and its level generation methods, and analysis of playtraces from almost a thousand users who played the game since its release. The analysis corroborates that the level generator creates a sawtooth-shaped difficulty curve, as intended. The analysis also offers insights into player behavior in this game.

AIApr 18, 2019
Intentional Computational Level Design

Ahmed Khalifa, Michael Cerny Green, Gabriella Barros et al.

The procedural generation of levels and content in video games is a challenging AI problem. Often such generation relies on an intelligent way of evaluating the content being generated so that constraints are satisfied and/or objectives maximized. In this work, we address the problem of creating levels that are not only playable but also revolve around specific mechanics in the game. We use constrained evolutionary algorithms and quality-diversity algorithms to generate small sections of Super Mario Bros levels called scenes, using three different simulation approaches: Limited Agents, Punishing Model, and Mechanics Dimensions. All three approaches are able to create scenes that give opportunity for a player to encounter or use targeted mechanics with different properties. We conclude by discussing the advantages and disadvantages of each approach and compare them to each other.

ROApr 11, 2019
Controller Design and Implementation of a New Quadrotor Manipulation System

Ahmed Khalifa

The previously introduced aerial manipulation systems suffer from either limited end-effector DOF or small payload capacity. In this dissertation, a quadrotor with a 2-DOF manipulator is investigated that has a unique topology to enable the end-effector to track a 6-DOF trajectory with the minimum possible number of actuators/links and hence maximize the payload and/or mission time. The proposed system is designed, modeled, and constructed. An identification process is carried out to find the system parameters. An experimental setup is proposed with a 6-DOF state measurement and estimation scheme. The system feasibility is validated via numerical and experimental results. The inverse kinematics require a solution of complicated algebraic-differential equations. Therefore, an algorithm is developed to get an approximate solution of these equations.

ROApr 10, 2019
Novel Quadrotor Manipulation System

Ahmed Khalifa

This thesis introduces a novel quadrotor manipulation system that consists of a 2-link manipulator attached to the bottom of a quadrotor. This new system presents a solution for the drawbacks found in the current quadrotor manipulation system, which uses a gripper fixed to a quadrotor. Unlike the current system, the proposed system has 6 DOF, and it provides enough distance between the quadrotor and the object. System kinematics and dynamics are derived. To study the feasibility of the proposed system, a quadrotor with high enough payload to add the 2-link manipulator is constructed. Its parameters are identified to be used in the simulation and controller design. A CAD model is developed to calculate the mass and moments of inertia accurately. Direct relationships between Pulse Width Modulation and each of the angular speeds, thrust forces, and drag moments of the rotors are identified. A Direction Cosine Matrix complementary filter is used to estimate the attitude of the quadrotor using the IMU measurements. An attitude stabilization controller is designed based on the feedback linearization technique to test the identified parameters and the attitude estimation. The results of the experiments show satisfactory accuracy of the identified structure parameters, the identified rotor assembly parameters, and the attitude estimation algorithm. A controller for the proposed system is designed based on three control techniques: feedback linearization based PID control, direct fuzzy logic control, and fuzzy model reference learning control. These controllers are tested to provide system stability and trajectory tracking under the effect of picking and placing a payload and the effect of changing the operating region. Simulation results show that the fuzzy model reference learning control technique has superior performance. The results indicate the feasibility of the proposed system.

ROMar 29, 2019
Quadrotor Manipulation System: Development of a Robust Contact Force Estimation and Impedance Control Scheme Based on DOb and FTRLS

Ahmed Khalifa, Mohamed Fanni, Alaa Khalifa

Research on aerial manipulation systems has increased rapidly in recent years. These systems are very attractive for a wide range of applications due to their unique features. However, the dynamics, control, and manipulation tasks of such systems are quite challenging because they are naturally unstable, have very fast dynamics, have strong nonlinearities, are very susceptible to parameter variations due to carrying a payload as well as to external disturbances, and have complex inverse kinematics. In addition, the manipulation tasks require estimating (applying) a certain force of (at) the end-effector as well as positioning it accurately. Thus, in this article, a robust force estimation and impedance control scheme is proposed to address these issues. The robustness is achieved based on the Disturbance Observer (DOb) technique. Then, a linear controller with low computational cost is used for tracking and performance. For teleoperation purposes, the contact force needs to be identified. However, currently developed techniques for force estimation have limitations because they ignore some dynamics and/or require an indicator of environment contact. Unlike these techniques, we propose a technique based on the linearization capabilities of DOb and a Fast Tracking Recursive Least Squares (FTRLS) algorithm. The complex inverse kinematics problem of such a system is solved by a Jacobian-based algorithm. The stability analysis of the proposed scheme is presented. The algorithm is tested to achieve tracking of task space reference trajectories in addition to impedance control. The efficiency of the proposed technique is demonstrated via numerical simulation.

ROMar 28, 2019
Inverse Kinematics, Identification, RIC-based Control, and implementation of an Aerial Manipulator

Ahmed Khalifa, Mohamed Fanni

This paper presents the inverse kinematic analysis and parameter identification of a novel aerial manipulation system. This system consists of a 2-link manipulator attached to the bottom of a quadrotor. This new system presents a solution for the limitations found in the current quadrotor manipulation system. By deriving the inverse kinematics, one can design the controller such that the desired end-effector position and orientation can be tracked. To study the feasibility of the proposed system, a quadrotor with high enough payload to add the 2-link manipulator is designed and constructed. An experimental setup of the system is introduced with an experiment to estimate the rotor parameters. Its parameters are identified to be used in the simulation and controller design of the proposed system. System dynamics are derived briefly based on the Newton-Euler method. The controller of the proposed system is designed based on the Robust Internal-loop Compensator (RIC) and compared to the Fuzzy Model Reference Learning Control (FMRLC) technique, which was previously designed and tested for the proposed system. These controllers are tested to provide system stability and trajectory tracking under the effect of picking as well as placing a payload and under the effect of changing the operating region. The simulation framework is implemented in the MATLAB/SIMULINK environment. The simulation results indicate the effectiveness of the inverse kinematic analysis and the proposed control technique.

AIMar 27, 2019
Tree Search vs Optimization Approaches for Map Generation

Debosmita Bhaumik, Ahmed Khalifa, Michael Cerny Green et al.

Search-based procedural content generation uses stochastic global optimization algorithms to search for game content. However, standard tree search algorithms can be competitive with evolution on some optimization problems. We investigate the applicability of several tree search methods to level generation and compare them systematically with several optimization algorithms, including evolutionary algorithms. We compare them on three different game level generation problems: Binary, Zelda, and Sokoban. We introduce two new representations that can help tree search algorithms deal with the large branching factor of the generation problem. We find that in general, optimization algorithms clearly outperform tree search algorithms, but given the right problem representation certain tree search algorithms perform similarly to optimization algorithms, and in one particular problem, we see surprisingly strong results from MCTS.

AIFeb 4, 2019
Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning

Arthur Juliani, Ahmed Khalifa, Vincent-Pierre Berges et al.

The rapid pace of recent research in AI has been driven in part by the presence of fast and challenging simulation environments. These environments often take the form of games, with tasks ranging from simple board games to competitive video games. We propose a new benchmark - Obstacle Tower: a high fidelity, 3D, 3rd person, procedurally generated environment. An agent playing Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on an agent's ability to perform well on unseen instances of the environment. In this paper we outline the environment and provide a set of baseline results produced by current state-of-the-art Deep RL methods as well as human players. These algorithms fail to produce agents capable of performing near human level.

AISep 9, 2018
A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking

Matthew Stephenson, Damien Anderson, Ahmed Khalifa et al.

This paper introduces an information-theoretic method for selecting a subset of problems which gives the most information about a group of problem-solving algorithms. This method was tested on the games in the General Video Game AI (GVGAI) framework, allowing us to identify a smaller set of games that still gives a large amount of information about the abilities of different game-playing agents. This approach can be used to make agent testing more efficient. We can achieve almost as good discriminatory accuracy when testing on only a handful of games as when testing on more than a hundred games, something which is often computationally infeasible. Furthermore, this method can be extended to study the dimensions of the effective variance in game design between these games, allowing us to identify which games differentiate between agents in the most complementary ways.

AIJul 18, 2018
Generating Levels That Teach Mechanics

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

The automatic generation of game tutorials is a challenging AI problem. While it is possible to generate annotations and instructions that explain to the player how the game is played, this paper focuses on generating a gameplay experience that introduces the player to a game mechanic. It evolves small levels for the Mario AI Framework that can only be beaten by an agent that knows how to perform specific actions in the game. It uses variations of a perfect A* agent that are limited in various ways, such as not being able to jump high or see enemies, to test how failing to do certain actions can stop the player from beating the level.

AIJul 11, 2018
AtDelfi: Automatically Designing Legible, Full Instructions For Games

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

This paper introduces a fully automatic method for generating video game tutorials. The AtDELFI system (AuTomatically DEsigning Legible, Full Instructions for games) was created to investigate procedural generation of instructions that teach players how to play video games. We present a representation of game rules and mechanics using a graph system as well as a tutorial generation method that uses said graph representation. We demonstrate the concept by testing it on games within the General Video Game Artificial Intelligence (GVG-AI) framework; the paper discusses tutorials generated for eight different games. Our findings suggest that a graph representation scheme works well for simple arcade style games such as Space Invaders and Pacman, but it appears that tutorials for more complex games might require higher-level understanding of the game than just single mechanics.

LGJun 28, 2018
Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation

Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager et al.

Deep reinforcement learning (RL) has shown impressive results in a variety of domains, learning directly from high-dimensional sensory streams. However, when neural networks are trained in a fixed environment, such as a single level in a video game, they will usually overfit and fail to generalize to new levels. When RL models overfit, even slight modifications to the environment can result in poor agent performance. This paper explores how procedurally generated levels during training can increase generality. We show that for some games procedural level generation enables generalization to new levels within the same distribution. Additionally, it is possible to achieve better performance with less data by manipulating the difficulty of the levels in response to the performance of the agent. The generality of the learned behaviors is also evaluated on a set of human-designed levels. The results suggest that the ability to generalize to human-designed levels highly depends on the design of the level generators. We apply dimensionality reduction and clustering techniques to visualize the generators' distributions of levels and analyze to what degree they can produce levels similar to those designed by a human.

AIJun 12, 2018
Talakat: Bullet Hell Generation through Constrained Map-Elites

Ahmed Khalifa, Scott Lee, Andy Nealen et al.

We describe a search-based approach to generating new levels for bullet hell games, which are action games characterized by, and requiring avoidance of, a very large number of projectiles. Levels are represented using a domain-specific description language, and search in the space defined by this language is performed by a novel variant of the Map-Elites algorithm which incorporates a feasible-infeasible approach to constraint satisfaction. Simulation-based evaluation is used to gauge the fitness of levels, using an agent based on best-first search. The performance of the agent can be tuned according to the two dimensions of strategy and dexterity, making it possible to search for level configurations that require a specific combination of both. As far as we know, this paper describes the first generator for this game genre, and includes several algorithmic innovations.

AIMay 30, 2018
"Press Space to Fire": Automatic Video Game Tutorial Generation

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

We propose the problem of tutorial generation for games, i.e. to generate tutorials which can teach players to play games, as an AI problem. This problem can be approached in several ways, including generating natural language descriptions of game rules, generating instructive game levels, and generating demonstrations of how to play a game using agents that play in a human-like manner. We further argue that the General Video Game AI framework provides a useful testbed for addressing this problem.

CLMay 9, 2017
DeepTingle

Ahmed Khalifa, Gabriella A. B. Barros, Julian Togelius

DeepTingle is a text prediction and classification system trained on the collected works of the renowned fantastic gay erotica author Chuck Tingle. Whereas the writing assistance tools you use everyday (in the form of predictive text, translation, grammar checking and so on) are trained on generic, purportedly "neutral" datasets, DeepTingle is trained on a very specific, internally consistent but externally arguably eccentric dataset. This allows us to foreground and confront the norms embedded in data-driven creativity and productivity assistance tools. As such tools effectively function as extensions of our cognition into technology, it is important to identify the norms they embed within themselves and, by extension, us. DeepTingle is realized as a web application based on LSTM networks and the GloVe word embedding, implemented in JavaScript with Keras-JS.