Dennis J. N. J. Soemers

h-index: 28

34 papers

353 citations

Novelty: 37%

AI Score: 49

Ranked #48,433 of 201,326 authors (top 24%); #2,656 in AI (top 19%)

34 Papers

AI | Jul 3, 2024

Enhancements for Real-Time Monte-Carlo Tree Search in General Video Game Playing

Dennis J. N. J. Soemers, Chiara F. Sironi, Torsten Schuster et al.

General Video Game Playing (GVGP) is a field of Artificial Intelligence where agents play a variety of real-time video games that are unknown in advance. This limits the use of domain-specific heuristics. Monte-Carlo Tree Search (MCTS) is a search technique for game playing that does not rely on domain-specific knowledge. This paper discusses eight enhancements for MCTS in GVGP: Progressive History, N-Gram Selection Technique, Tree Reuse, Breadth-First Tree Initialization, Loss Avoidance, Novelty-Based Pruning, Knowledge-Based Evaluations, and Deterministic Game Detection. Some of these are known from existing literature, and are either extended or introduced in the context of GVGP, and some are novel enhancements for MCTS. Most enhancements are shown to provide statistically significant increases in win percentages when applied individually. When combined, they increase the average win percentage over sixty different games from 31.0% to 48.4% in comparison to a vanilla MCTS implementation, approaching a level that is competitive with the best agents of the GVG-AI competition in 2015.

43.6 | AI | May 20 | Code

For How Long Should We Be Punching? Learning Action Duration in Fighting Games

Hoang Hai Nguyen, Kurt Driessens, Dennis J. N. J. Soemers

Fighting games such as Street Fighter II present unique challenges to reinforcement learning (RL) agents due to their fast-paced, real-time nature. In most RL frameworks, agents are hard-coded to make decisions at a fixed interval, typically every frame or every N frames. Although this design ensures timely responses, it restricts the agent's ability to adjust its reaction timing. Acting every frame grants frame-perfect reflexes, which are unrealistic compared to human players, whereas longer fixed intervals reduce computational cost but hinder responsiveness. We consider an alternative decision-making framework in which the agent learns not only what action to take but also for how long to execute it. By jointly predicting both action and duration, the agent can dynamically adapt its responsiveness to different situations in the game. We implement this method using the open-source FightLadder environment with agents trained against scripted built-in bots, systematically testing different frame skip configurations to analyze their influence on performance, responsiveness, and learned behavior. Experiments show that learned timing can match the performance of well-chosen fixed frame skips and encourages repeatable action patterns, but does not ensure robustness on its own. In most cases, we see agents performing best with consistently high frame skip values (i.e., low responsiveness). This strategy makes it easier to learn exploitative strategies where the same action is repeated over and over, which the scripted bots appear to be susceptible to.
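The idea of choosing both an action and its duration can be illustrated with a minimal sketch. It assumes a hypothetical `ToyEnv` with a `step(action)` method returning `(observation, reward, done)`; neither this class nor `step_with_duration` comes from the paper or from FightLadder, and the sketch only shows how a chosen duration turns into repeated frames.

```python
class ToyEnv:
    """Hypothetical stand-in environment: reward 1.0 per frame, ends after `horizon` frames."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done  # (observation, reward, done)


def step_with_duration(env, action, duration):
    """Repeat `action` for up to `duration` frames, stopping early if the episode ends.

    Returns (last observation, summed reward, done, frames actually executed).
    """
    total_reward = 0.0
    obs, done, frames = None, False, 0
    for _ in range(max(1, duration)):
        obs, reward, done = env.step(action)
        total_reward += reward
        frames += 1
        if done:
            break
    return obs, total_reward, done, frames


# The agent's output is a pair (action, duration) instead of an action alone.
env = ToyEnv(horizon=10)
obs, reward, done, frames = step_with_duration(env, action="punch", duration=4)
```

A fixed frame skip corresponds to always passing the same `duration`; a learned duration lets the policy pick a different value at each decision point.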

AI | Jun 8, 2022

Combining Monte-Carlo Tree Search with Proof-Number Search

Elliot Doe, Mark H. M. Winands, Dennis J. N. J. Soemers et al.

Proof-Number Search (PNS) and Monte-Carlo Tree Search (MCTS) have been successfully applied for decision making in a range of games. This paper proposes a new approach called PN-MCTS that combines these two tree-search methods by incorporating the concept of proof and disproof numbers into the UCT formula of MCTS. Experimental results demonstrate that PN-MCTS outperforms basic MCTS in several games including Lines of Action, MiniShogi, Knightthrough, and Awari, achieving win rates up to 94.0%.

AI | Jul 12, 2024

GAVEL: Generating Games Via Evolution and Language Models

Graham Todd, Alexander Padula, Matthew Stephenson et al.

Automatically generating novel and interesting games is a complex task. Challenges include representing game rules in a computationally workable form, searching through the large space of potential games under most such representations, and accurately evaluating the originality and quality of previously unseen games. Prior work in automated game generation has largely focused on relatively restricted rule representations and relied on domain-specific heuristics. In this work, we explore the generation of novel games in the comparatively expansive Ludii game description language, which encodes the rules of over 1000 board games in a variety of styles and modes of play. We draw inspiration from recent advances in large language models and evolutionary computation in order to train a model that intelligently mutates and recombines games and mechanics expressed as code. We demonstrate both quantitatively and qualitatively that our approach is capable of generating new and interesting games, including in regions of the potential rules space not covered by existing games in the Ludii dataset. A sample of the generated games is available to play online through the Ludii portal.

AI | Mar 16, 2023

Proof Number Based Monte-Carlo Tree Search

Jakub Kowalski, Elliot Doe, Mark H. M. Winands et al.

This paper proposes a new game-search algorithm, PN-MCTS, which combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS). These two algorithms have been successfully applied for decision making in a range of domains. We define three areas where the additional knowledge provided by the proof and disproof numbers gathered in MCTS trees might be used: final move selection, solving subtrees, and the UCB1 selection mechanism. We test all possible combinations on different time settings, playing against vanilla UCT on several games: Lines of Action (7×7 and 8×8 board sizes), MiniShogi, Knightthrough, and Awari. Furthermore, we extend this new algorithm to properly address games with draws, like Awari, by adding an additional layer of PNS on top of the MCTS tree. The experiments show that PN-MCTS is able to outperform MCTS in all tested game domains, achieving win rates up to 96.2% for Lines of Action.

AI | Jan 10, 2023

Measuring Board Game Distance

Matthew Stephenson, Dennis J. N. J. Soemers, Éric Piette et al.

This paper presents a general approach for measuring distances between board games within the Ludii general game system. These distances are calculated using a previously published set of general board game concepts, each of which represents a common game idea or shared property. Our results compare and contrast two different measures of distance, highlighting the subjective nature of such metrics and discussing the different ways that they can be interpreted.

AI | May 1, 2022

The Ludii Game Description Language is Universal

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson et al.

There are several different game description languages (GDLs), each intended to allow wide ranges of arbitrary games (i.e., general games) to be described in a single language that is higher-level than general-purpose programming languages. Games described in such formats can subsequently be presented as challenges for automated general game playing agents, which are expected to be capable of playing any arbitrary game described in such a language without prior knowledge about the games to be played. The language used by the Ludii general game system was previously shown to be capable of representing equivalent games for any arbitrary, finite, deterministic, fully observable extensive-form game. In this paper, we prove its universality by extending this to include finite non-deterministic and imperfect-information games.

AI | Jun 27, 2025 | Code

Ludax: A GPU-Accelerated Domain Specific Language for Board Games

Graham Todd, Alexander G. Padula, Dennis J. N. J. Soemers et al.

Games have long been used as benchmarks and testing environments for research in artificial intelligence. A key step in supporting this research was the development of game description languages: frameworks that compile domain-specific code into playable and simulatable game environments, allowing researchers to generalize their algorithms and approaches across multiple games without having to manually implement each one. More recently, progress in reinforcement learning (RL) has been largely driven by advances in hardware acceleration. Libraries like JAX allow practitioners to take full advantage of cutting-edge computing hardware, often speeding up training and testing by orders of magnitude. Here, we present a synthesis of these strands of research: a domain-specific language for board games which automatically compiles into hardware-accelerated code. Our framework, Ludax, combines the generality of game description languages with the speed of modern parallel processing hardware and is designed to fit neatly into existing deep learning pipelines. We envision Ludax as a tool to help accelerate games research generally, from RL to cognitive science, by enabling rapid simulation and providing a flexible representation scheme. We present a detailed breakdown of Ludax's description language and technical notes on the compilation process, along with speed benchmarking and a demonstration of training RL agents. The Ludax framework, along with implementations of existing board games, is open-source and freely available.

10.0 | AI | Apr 28

StratFormer: Adaptive Opponent Modeling and Exploitation in Imperfect-Information Games

Andy Caen, Mark H. M. Winands, Dennis J. N. J. Soemers

We present StratFormer, a transformer-based meta-agent that learns to simultaneously model and exploit opponents in imperfect-information games through a two-phase curriculum. The first phase trains an opponent modeling head to identify behavioral patterns from action histories while the agent plays a game-theoretic optimal (GTO) policy. The second phase progressively shifts the policy toward best-response (BR) exploitation, guided by a per-opponent regularization schedule tied to exploitability. Our architecture introduces dual-turn tokens -- feature vectors constructed at both agent and opponent decision points -- coupled with bucket-rate features that encode opponent tendencies across five strategic contexts. On Leduc Hold'em, a small poker variant with six cards and two betting rounds, we test against six opponent archetypes at two strength levels each, with exploitability ranging from 0.15 to 1.26 Big Blinds (BB) per hand. StratFormer achieves an average exploitation gain of +0.106 BB per hand over GTO, with peak gains of +0.821 against highly exploitable opponents, while maintaining near-equilibrium safety.

AI | Dec 22, 2024

A Research Agenda for Usability and Generalisation in Reinforcement Learning

Dennis J. N. J. Soemers, Spyridon Samothrakis, Kurt Driessens et al.

It is common practice in reinforcement learning (RL) research to train and deploy agents in bespoke simulators, typically implemented by engineers directly in general-purpose programming languages or hardware acceleration frameworks such as CUDA or JAX. This means that programming and engineering expertise is not only required to develop RL algorithms, but is also required to use already developed algorithms for novel problems. The latter poses a problem in terms of the usability of RL, in particular for private individuals and small organisations without substantial engineering expertise. We also perceive this as a challenge for effective generalisation in RL, in the sense that there is no standard, shared formalism in which different problems are represented. As we typically have no consistent representation through which to provide information about any novel problem to an agent, our agents also cannot instantly or rapidly generalise to novel problems. In this position paper, we advocate for a research agenda centred around the use of user-friendly description languages for describing problems, such that (i) users with little to no engineering expertise can formally describe the problems they would like to be tackled by RL algorithms, and (ii) algorithms can leverage problem descriptions to effectively generalise among all problems describable in the language of choice.

LG | Nov 11, 2024

Anytime Sequential Halving in Monte-Carlo Tree Search

Dominic Sagers, Mark H. M. Winands, Dennis J. N. J. Soemers

Monte-Carlo Tree Search (MCTS) typically uses multi-armed bandit (MAB) strategies designed to minimize cumulative regret, such as UCB1, as its selection strategy. However, in the root node of the search tree, it is more sensible to minimize simple regret. Previous work has proposed using Sequential Halving as selection strategy in the root node, as, in theory, it performs better with respect to simple regret. However, Sequential Halving requires a budget of iterations to be predetermined, which is often impractical. This paper proposes an anytime version of the algorithm, which can be halted at any arbitrary time and still return a satisfactory result, while being designed such that it approximates the behavior of Sequential Halving. Empirical results in synthetic MAB problems and ten different board games demonstrate that the algorithm's performance is competitive with Sequential Halving and UCB1 (and their analogues in MCTS).
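For context, the standard fixed-budget Sequential Halving procedure that the abstract refers to can be sketched as follows. This is not the anytime variant proposed in the paper; the function name, the `pull` callback, and the choice to accumulate statistics across rounds are assumptions made for illustration.

```python
import math
import random


def sequential_halving(pull, num_arms, budget):
    """Fixed-budget Sequential Halving; returns the index of the surviving arm.

    `pull(arm)` returns a (possibly stochastic) reward for the given arm index.
    """
    survivors = list(range(num_arms))
    totals = [0.0] * num_arms
    counts = [0] * num_arms
    rounds = max(1, math.ceil(math.log2(num_arms)))
    for _ in range(rounds):
        # Split the budget evenly over rounds, then over the surviving arms.
        pulls_per_arm = max(1, budget // (len(survivors) * rounds))
        for arm in survivors:
            for _ in range(pulls_per_arm):
                totals[arm] += pull(arm)
                counts[arm] += 1
        # Keep the better half (rounded up) by empirical mean reward.
        survivors.sort(key=lambda a: totals[a] / counts[a], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
        if len(survivors) == 1:
            break
    return survivors[0]


# Usage on a synthetic Bernoulli bandit (the means are made up for the example).
rng = random.Random(0)
means = [0.2, 0.5, 0.8, 0.4]
best = sequential_halving(lambda a: float(rng.random() < means[a]), len(means), budget=400)
```

The `budget` argument is exactly what makes the procedure awkward in time-limited search: the per-round allocation cannot be computed without knowing the total number of iterations in advance.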

CL | Oct 22, 2024

Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards

Alexander G. Padula, Dennis J. N. J. Soemers

Proximal Policy Optimization (PPO) is commonly used in Reinforcement Learning from Human Feedback to align large language models (LLMs) with downstream tasks. This paper investigates the feasibility of using PPO for direct reinforcement learning (RL) from explicitly programmed reward signals, as opposed to indirect learning from human feedback via an intermediary reward model. We focus on tasks expressed through formal languages, such as mathematics and programming, where explicit reward functions can be programmed to automatically assess the quality of generated outputs. We apply this approach to a sentiment alignment task, a simple arithmetic task, and a more complex game synthesis task. The sentiment alignment task replicates prior research and serves to validate our experimental setup. Our results show that pure RL-based training for the two formal language tasks is challenging, with success being limited even for the simple arithmetic task. We propose a novel batch-entropy regularization term to aid exploration, although training is not yet entirely stable. Our findings suggest that direct RL training of LLMs may be more suitable for relatively minor changes, such as alignment, than for learning new tasks altogether, even if an informative reward signal can be expressed programmatically.
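To make the notion of an explicitly programmed reward concrete, here is a minimal sketch for an addition task. The function name, the prompt format, and the binary 0/1 reward are all hypothetical and are not taken from the paper; the point is only that the reward is computed by code rather than by a learned reward model.

```python
import re


def arithmetic_reward(prompt: str, completion: str) -> float:
    """Hypothetical reward: 1.0 if the first integer in the completion equals
    the sum of the integers in the prompt, 0.0 otherwise."""
    operands = [int(tok) for tok in re.findall(r"-?\d+", prompt)]
    match = re.search(r"-?\d+", completion)
    if match is None:
        return 0.0  # no number produced at all
    return 1.0 if int(match.group()) == sum(operands) else 0.0


# Example scoring of two model outputs for the same (assumed) prompt format.
good = arithmetic_reward("12 + 30 =", "42")
bad = arithmetic_reward("12 + 30 =", "41")
```

A sparse reward of this kind gives no signal for near misses, which is one plausible reason why exploration can be difficult in such settings.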

AI | Jun 16, 2025

Generalized Proof-Number Monte-Carlo Tree Search

Jakub Kowalski, Dennis J. N. J. Soemers, Szymon Kosakowski et al.

This paper presents Generalized Proof-Number Monte-Carlo Tree Search: a generalization of recently proposed combinations of Proof-Number Search (PNS) with Monte-Carlo Tree Search (MCTS), which use (dis)proof numbers to bias UCB1-based Selection strategies towards parts of the search that are expected to be easily (dis)proven. We propose three core modifications of prior combinations of PNS with MCTS. First, we track proof numbers per player. This reduces code complexity in the sense that we no longer need disproof numbers, and generalizes the technique to be applicable to games with more than two players. Second, we propose and extensively evaluate different methods of using proof numbers to bias the selection strategy, achieving strong performance with strategies that are simpler to implement and compute. Third, we merge our technique with Score Bounded MCTS, enabling the algorithm to prove and leverage upper and lower bounds on scores - as opposed to only proving wins or not-wins. Experiments demonstrate substantial performance increases, reaching the range of 80% for 8 out of the 11 tested board games.

AI | Jun 26, 2024

Games of Knightian Uncertainty as AGI testbeds

Spyridon Samothrakis, Dennis J. N. J. Soemers, Damian Machlanski

Arguably, for the late 20th and early 21st centuries, games have been seen as the drosophila of AI. Games are a set of exciting testbeds, whose solutions (in terms of identifying optimal players) would lead to machines that would possess some form of general intelligence, or at the very least help us gain insights toward building intelligent machines. Following impressive successes in traditional board games like Go, Chess, and Poker, but also video games like the Atari 2600 collection, it is clear that this is not the case. Games have been attacked successfully, but we are nowhere near AGI developments (or, as harsher critics might say, useful AI developments!). In this short vision paper, we argue that for game research to become relevant again to the AGI pathway, we need to be able to address Knightian uncertainty in the context of games, i.e. agents need to be able to adapt to rapid changes in game rules on the fly with no warning, no previous data, and no model access.

AI | Jun 13, 2024

Towards a Characterisation of Monte-Carlo Tree Search Performance in Different Games

Dennis J. N. J. Soemers, Guillaume Bams, Max Persoon et al.

Many enhancements to Monte-Carlo Tree Search (MCTS) have been proposed over almost two decades of general game playing and other artificial intelligence research. However, our ability to characterise and understand which variants work well or poorly in which games is still lacking. This paper describes work on an initial dataset that we have built to make progress towards such an understanding: 268,386 plays among 61 different agents across 1494 distinct games. We describe a preliminary analysis and work on training predictive models on this dataset, as well as lessons learned and future plans for a new and improved version of the dataset.

AI | Jan 17, 2022

Spatial State-Action Features for General Games

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson et al.

In many board games and other abstract games, patterns have been used as features that can guide automated game-playing agents. Such patterns or features often represent particular configurations of pieces, empty positions, etc., which may be relevant for a game's strategies. Their use has been particularly prevalent in the game of Go, but also many other games used as benchmarks for AI research. In this paper, we formulate a design and efficient implementation of spatial state-action features for general games. These are patterns that can be trained to incentivise or disincentivise actions based on whether or not they match variables of the state in a local area around action variables. We provide extensive details on several design and implementation choices, with a primary focus on achieving a high degree of generality to support a wide variety of different games using different board geometries or other graphs. Secondly, we propose an efficient approach for evaluating active features for any given set of features. In this approach, we take inspiration from heuristics used in problems such as SAT to optimise the order in which parts of patterns are matched and prune unnecessary evaluations. This approach is defined for a highly general and abstract description of the problem -- phrased as optimising the order in which propositions of formulas in disjunctive normal form are evaluated -- and may therefore also be of interest to other types of problems than board games. An empirical evaluation on 33 distinct games in the Ludii general game system demonstrates the efficiency of this approach in comparison to a naive baseline, as well as a baseline based on prefix trees, and demonstrates that the additional efficiency significantly improves the playing strength of agents using the features to guide search.

AI | Nov 22, 2021

General Board Geometry

Cameron Browne, Éric Piette, Matthew Stephenson et al.

Game boards are described in the Ludii general game system by their underlying graphs, based on tiling, shape and graph operators, with the automatic detection of important properties such as topological relationships between graph elements, directions and radial step sequences. This approach allows most conceivable game boards to be described simply and succinctly.

AI | Nov 4, 2021

Optimised Playout Implementations for the Ludii General Game System

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson et al.

This paper describes three different optimised implementations of playouts, as commonly used by game-playing algorithms such as Monte-Carlo Tree Search. Each of the optimised implementations is applicable only to specific sets of games, based on their rules. The Ludii general game system can automatically infer, based on a game's description in its general game description language, whether any optimised implementations are applicable. An empirical evaluation demonstrates major speedups over a standard implementation, with a median result of running playouts 5.08 times as fast, over 145 different games in Ludii for which one of the optimised implementations is applicable.

AI | Sep 20, 2021

Automatic Generation of Board Game Manuals

Matthew Stephenson, Éric Piette, Dennis J. N. J. Soemers et al.

In this paper we present a process for automatically generating manuals for board games within the Ludii general game system. This process requires many different sub-tasks to be addressed, including English translation of Ludii game descriptions, move visualisation, highlighting of winning moves, and strategy explanation. These aspects are then combined to create a full manual for any given game. This manual is intended to provide a more intuitive explanation of a game's rules and mechanics, particularly for players who are less familiar with the Ludii game description language and grammar.

AI | Jul 2, 2021

General Board Game Concepts

Éric Piette, Matthew Stephenson, Dennis J. N. J. Soemers et al.

Many games share common ideas or aspects, such as their rules, controls, or playing area. However, in the context of General Game Playing (GGP) for board games, this area remains under-explored. We propose to formalise the notion of "game concept", inspired by terms generally used by game players and designers. Through the Ludii General Game System, we describe concepts for several levels of abstraction, such as the game itself, the moves played, or the states reached. This new GGP feature associated with the ludeme representation of games opens many new lines of research. The creation of a hyper-agent selector, the transfer of AI learning between games, or explaining AI techniques using game terms, can all be facilitated by the use of game concepts. Other applications which can benefit from game concepts are also discussed, such as the generation of plausible reconstructed rules for incomplete ancient games, or the implementation of a board game recommender system.

AI | May 26, 2021

General Game Heuristic Prediction Based on Ludeme Descriptions

Matthew Stephenson, Dennis J. N. J. Soemers, Éric Piette et al.

This paper investigates the performance of different general-game-playing heuristics for games in the Ludii general game system. Based on these results, we train several regression learning models to predict the performance of these heuristics based on each game's description file. We also provide a condensed analysis of the games available in Ludii, and the different ludemes that define them.

LG | Feb 24, 2021

Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants

Dennis J. N. J. Soemers, Vegard Mella, Éric Piette et al.

In this paper, we use fully convolutional architectures in AlphaZero-like self-play training setups to facilitate transfer between variants of board games as well as distinct games. We explore how to transfer trained parameters of these architectures based on shared semantics of channels in the state and action representations of the Ludii general game system. We use Ludii's large library of games and game variants for extensive transfer learning evaluations, in zero-shot transfer experiments as well as experiments with additional fine-tuning time.

AI | Jan 23, 2021

Deep Learning for General Game Playing with Ludii and Polygames

Dennis J. N. J. Soemers, Vegard Mella, Cameron Browne et al.

Combinations of Monte-Carlo tree search and Deep Neural Networks, trained through self-play, have produced state-of-the-art results for automated game-playing in many board games. The training and search algorithms are not game-specific, but every individual game that these approaches are applied to still requires domain knowledge for the implementation of the game's rules, and constructing the neural network's architecture -- in particular the shapes of its input and output tensors. Ludii is a general game system that already contains over 500 different games, which can rapidly grow thanks to its powerful and user-friendly game description language. Polygames is a framework with training and search algorithms, which has already produced superhuman players for several board games. This paper describes the implementation of a bridge between Ludii and Polygames, which enables Polygames to train and evaluate models for games that are implemented and run through Ludii. We do not require any game-specific domain knowledge anymore, and instead leverage our domain knowledge of the Ludii system and its abstract state and move representations to write functions that can automatically determine the appropriate shapes for input and output tensors for any game implemented in Ludii. We describe experimental results for short training runs in a wide variety of different board games, and discuss several open problems and avenues for future research.

AI | Jan 6, 2021

Ludii Game Logic Guide

Éric Piette, Cameron Browne, Dennis J. N. J. Soemers

This technical report outlines the fundamental workings of the game logic behind Ludii, a general game system that can be used to play a wide variety of games. Ludii is a program developed for the ERC-funded Digital Ludeme Project, in which mathematical and computational approaches are used to study how games were played, and spread, throughout history. This report explains how general game states and equipment are represented in Ludii, and how the rule ludemes dictating play are implemented behind the scenes, giving some insight into the core game logic behind the Ludii general game player. This guide is intended to help game designers using the Ludii game description language to understand it more completely and make fuller use of its features when describing their games.

AI | Jan 4, 2021

Strategic Features for General Games

Cameron Browne, Dennis J. N. J. Soemers, Éric Piette

This short paper describes an ongoing research project that requires the automated self-play learning and evaluation of a large number of board games in digital form. We describe the approach we are taking to determine relevant features, for biasing MCTS playouts for arbitrary games played on arbitrary geometries. Benefits of our approach include efficient implementation, the potential to transfer learnt knowledge to new contexts, and the potential to explain strategic knowledge embedded in features in human-comprehensible terms.

LG | May 30, 2020

Manipulating the Distributions of Experience used for Self-Play Learning in Expert Iteration

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson et al.

Expert Iteration (ExIt) is an effective framework for learning game-playing policies from self-play. ExIt involves training a policy to mimic the search behaviour of a tree search algorithm - such as Monte-Carlo tree search - and using the trained policy to guide it. The policy and the tree search can then iteratively improve each other, through experience gathered in self-play between instances of the guided tree search algorithm. This paper outlines three different approaches for manipulating the distribution of data collected from self-play, and the procedure that samples batches for learning updates from the collected data. Firstly, samples in batches are weighted based on the durations of the episodes in which they were originally experienced. Secondly, Prioritized Experience Replay is applied within the ExIt framework, to prioritise sampling experience from which we expect to obtain valuable training signals. Thirdly, a trained exploratory policy is used to diversify the trajectories experienced in self-play. This paper summarises the effects of these manipulations on training performance evaluated in fourteen different board games. We find major improvements in early training performance in some games, and minor improvements averaged over fourteen games.
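The second manipulation, Prioritized Experience Replay, can be illustrated with a generic sampling sketch. It follows the commonly used proportional scheme (sampling probability proportional to priority raised to `alpha`, importance-sampling weights with exponent `beta`); the function name, default values, and batch-level weight normalisation are assumptions, and nothing here reflects how the paper integrates the technique into ExIt.

```python
import random


def per_sample(priorities, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Sample indices with probability proportional to priority**alpha.

    Priorities are assumed to be strictly positive. Returns the sampled
    indices and importance-sampling weights, normalised so the largest
    weight in the batch is 1.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)
    n = len(priorities)
    # Weights correct for the bias introduced by non-uniform sampling.
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(weights)
    return indices, [w / max_w for w in weights]


# Usage: three stored samples with made-up priorities, batch of five.
indices, weights = per_sample([1.0, 2.0, 4.0], batch_size=5, rng=random.Random(0))
```

Setting `alpha=0` recovers uniform sampling, which makes it easy to compare against an unprioritised baseline.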

AI | Jun 29, 2019

Ludii as a Competition Platform

Matthew Stephenson, Éric Piette, Dennis J. N. J. Soemers et al.

Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). While its primary aim is to model, play, and analyse the full range of traditional strategy games, Ludii also has the potential to support a wide range of AI research topics and competitions. This paper describes some of the future competitions and challenges that we intend to run using the Ludii system, highlighting some of its most important aspects that can potentially lead to many algorithm improvements and new avenues of research. We compare and contrast our proposed competition motivations, goals and frameworks against those of existing general game playing competitions, addressing the strengths and weaknesses of each platform.

AI | Jun 29, 2019

Ludii and XCSP: Playing and Solving Logic Puzzles

Cédric Piette, Éric Piette, Matthew Stephenson et al.

Many of the famous single-player games, commonly called puzzles, can be shown to be NP-Complete. Indeed, this class of complexity contains hundreds of puzzles, since people particularly appreciate completing an intractable puzzle, such as Sudoku, but also enjoy the ability to check their solution easily once it's done. For this reason, using constraint programming is naturally suited to solve them. In this paper, we focus on logic puzzles described in the Ludii general game system and we propose using the XCSP formalism in order to solve them with any CSP solver.

AI | Jun 29, 2019

An Empirical Evaluation of Two General Game Systems: Ludii and RBG

Éric Piette, Matthew Stephenson, Dennis J. N. J. Soemers et al.

Although General Game Playing (GGP) systems can facilitate useful research in Artificial Intelligence (AI) for game-playing, they are often computationally inefficient and somewhat specialised to a specific class of games. However, since the start of this year, two General Game Systems have emerged that provide efficient alternatives to the academic state of the art -- the Game Description Language (GDL). In order of publication, these are the Regular Boardgames language (RBG), and the Ludii system. This paper offers an experimental evaluation of Ludii. Here, we focus mainly on a comparison between the two new systems in terms of two key properties for any GGP system: simplicity/clarity (e.g. human-readability), and efficiency.

AI | Jun 29, 2019

An Overview of the Ludii General Game System

Matthew Stephenson, Éric Piette, Dennis J. N. J. Soemers et al.

The Digital Ludeme Project (DLP) aims to reconstruct and analyse over 1000 traditional strategy games using modern techniques. One of the key aspects of this project is the development of Ludii, a general game system that will be able to model and play the complete range of games required by this project. Such an undertaking will create a wide range of possibilities for new AI challenges. In this paper we describe many of the features of Ludii that can be used. This includes designing and modifying games using the Ludii game description language, creating agents capable of playing these games, and several advantages the system has over prior general game software.

AIMay 31, 2019

Foundations of Digital Archæoludology

Cameron Browne, Dennis J. N. J. Soemers, Éric Piette et al.

Digital Archaeoludology (DAL) is a new field of study involving the analysis and reconstruction of ancient games from incomplete descriptions and archaeological evidence using modern computational techniques. The aim is to provide digital tools and methods to help game historians and other researchers better understand traditional games, their development throughout recorded human history, and their relationship to the development of human culture and mathematical knowledge. This work is being explored in the ERC-funded Digital Ludeme Project. The aim of this inaugural international research meeting on DAL is to gather together leading experts in relevant disciplines - computer science, artificial intelligence, machine learning, computational phylogenetics, mathematics, history, archaeology, anthropology, etc. - to discuss the key themes and establish the foundations for this new field of research, so that it may continue beyond the lifetime of its initiating project.

LGMay 14, 2019

Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson et al.

In recent years, state-of-the-art game-playing agents often involve policies that are trained in self-playing processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates, rather than MCTS visit counts. We empirically evaluate various properties of resulting policies, in a variety of board games.
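The contrast between the two training signals can be sketched as follows. This is a generic illustration, not the objective or the derivation from the paper: it assumes a softmax policy over per-action logits at a single state, and it uses the standard gradient of the expected value sum_a pi(a) Q(a) as a stand-in for a value-based objective. All function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits: List[float], visit_counts: List[int]) -> List[float]:
    """Gradient of the cross-entropy loss w.r.t. the logits, with normalised
    MCTS visit counts as the target distribution: pi - p (to be descended)."""
    pi = softmax(logits)
    n = sum(visit_counts)
    return [p_i - count / n for p_i, count in zip(pi, visit_counts)]


def expected_value_grad(logits: List[float], q_values: List[float]) -> List[float]:
    """Gradient of sum_a pi(a) * Q(a) w.r.t. the logits:
    pi(a) * (Q(a) - sum_b pi(b) Q(b)) (to be ascended)."""
    pi = softmax(logits)
    baseline = sum(p * q for p, q in zip(pi, q_values))
    return [p * (q - baseline) for p, q in zip(pi, q_values)]


# Illustrative numbers only: three actions, uniform initial policy.
ce_grad = cross_entropy_grad([0.0, 0.0, 0.0], [10, 30, 60])
ev_grad = expected_value_grad([0.0, 0.0, 0.0], [1.0, 0.0, -1.0])
```

The cross-entropy signal pulls the policy toward the full visit distribution, including visits spent on exploration, whereas the value-based signal only rewards actions whose estimated value exceeds the policy's current average.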

AIMay 13, 2019

Ludii -- The Ludemic General Game System

Éric Piette, Dennis J. N. J. Soemers, Matthew Stephenson et al.

While current General Game Playing (GGP) systems facilitate useful research in Artificial Intelligence (AI) for game-playing, they are often somewhat specialised and computationally inefficient. In this paper, we describe the "ludemic" general game system Ludii, which has the potential to provide an efficient tool for AI researchers as well as game designers, historians, educators and practitioners in related fields. Ludii defines games as structures of ludemes -- high-level, easily understandable game concepts -- which allows for concise and human-understandable game descriptions. We formally describe Ludii and outline its main benefits: generality, extensibility, understandability and efficiency. Experimentally, Ludii outperforms one of the most efficient Game Description Language (GDL) reasoners, based on a propositional network, in all games available in the Tiltyard GGP repository. Moreover, Ludii is also competitive in terms of performance with the more recently proposed Regular Boardgames (RBG) system, and has various advantages in qualitative aspects such as generality.

AIMar 21, 2019

Biasing MCTS with Features for General Games

Dennis J. N. J. Soemers, Éric Piette, Cameron Browne

This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
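A rough sketch of how a linear function over binary features could bias move selection in MCTS is given below. The abstract does not state the selection formula, so the PUCT-style prior term used here is an assumption, as are the exploration constant `c`, the feature encoding (lists of active feature indices) and all function names.

```python
import math
from typing import List


def move_priors(weights: List[float], active_features: List[List[int]]) -> List[float]:
    """Softmax over linear logits; each move's logit is the sum of the weights
    of the binary features that are active for that move."""
    logits = [sum(weights[i] for i in feats) for feats in active_features]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_move(q_values: List[float], visits: List[int],
                priors: List[float], c: float = 1.0) -> int:
    """Pick the move maximising value estimate plus a prior-weighted exploration
    bonus (an assumed PUCT-style rule, not necessarily the one used in the paper)."""
    total = sum(visits)
    scores = [
        q + c * p * math.sqrt(total + 1) / (1 + n)
        for q, n, p in zip(q_values, visits, priors)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Illustrative weights: feature 0 is encouraging, feature 1 is discouraging.
weights = [1.0, -0.5]
features_per_move = [[0], [1], []]  # move 2 has no active features
priors = move_priors(weights, features_per_move)

first_pick = select_move([0.0, 0.0, 0.0], [0, 0, 0], priors)
later_pick = select_move([0.0, 0.0, 0.0], [10, 0, 0], priors)
```

Because each weight attaches to one named local pattern, the learnt bias can be read off directly, which is the interpretability advantage the abstract contrasts with a deep neural network.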