AIMay 29, 2022
A Framework for Generating Informative Benchmark InstancesNguyen Dang, Özgür Akgün, Joan Espasa et al.
Benchmarking is an important tool for assessing the relative performance of alternative solving approaches. However, the utility of benchmarking is limited by the quantity and quality of the available problem instances. Modern constraint programming languages typically allow the specification of a class-level model that is parameterised over instance data. This separation presents an opportunity for automated approaches to generate instance data that define instances that are graded (solvable at a certain difficulty level for a solver) or can discriminate between two solving approaches. In this paper, we introduce a framework that combines these two properties to generate a large number of benchmark instances, purposely generated for effective and informative benchmarking. We use five problems that were used in the MiniZinc competition to demonstrate the usage of our framework. In addition to producing a ranking among solvers, our framework gives a broader understanding of the behaviour of each solver for the whole instance space; for example by finding subsets of instances where the solver performance significantly varies from its average performance.
AIOct 2, 2023
Bridging the Gap between Structural and Semantic Similarity in Diverse PlanningMustafa F. Abdelwahed, Joan Espasa, Alice Toniolo et al.
Diverse planning is the problem of finding multiple plans for a given problem specification, which is at the core of many real-world applications. For example, diverse planning is a critical piece for the efficiency of plan recognition systems when dealing with noisy and missing observations. Providing diverse solutions can also benefit situations where constraints are too expensive or impossible to model. Current diverse planners operate by generating multiple plans and then applying a selection procedure to extract diverse solutions using a similarity metric. Generally, current similarity metrics only consider the structural properties of the given plans. We argue that this approach is a limitation that sometimes prevents such metrics from capturing why two plans differ. In this work, we propose two new domain-independent metrics which are able to capture relevant information on the difference between two given plans from a domain-dependent viewpoint. We showcase their utility in various situations where the currently used metrics fail to capture the similarity between plans, failing to capture some structural symmetries.
AIOct 2, 2023
Towards a Model of PuzznicJoan Espasa, Ian P. Gent, Ian Miguel et al.
We report on progress in modelling and solving Puzznic, a video game requiring the player to plan sequences of moves to clear a grid by matching blocks. We focus here on levels with no moving blocks. We compare a planning approach and three constraint programming approaches on a small set of benchmark instances. The planning approach is at present superior to the constraint programming approaches, but we outline proposals for improving the constraint models.
AIJan 8
From Stories to Cities to Games: A Qualitative Evaluation of Behaviour PlanningMustafa F. Abdelwahed, Joan Espasa, Alice Toniolo et al.
The primary objective of a diverse planning approach is to generate a set of plans that are distinct from one another. Such an approach is applied in a variety of real-world domains, including risk management, automated stream data analysis, and malware detection. More recently, a novel diverse planning paradigm, referred to as behaviour planning, has been proposed. This approach extends earlier methods by explicitly incorporating a diversity model into the planning process and supporting multiple planning categories. In this paper, we demonstrate the usefulness of behaviour planning in real-world settings by presenting three case studies. The first case study focuses on storytelling, the second addresses urban planning, and the third examines game evaluation.
AIOct 2, 2023
Challenges in Modelling and Solving Plotting with PDDLJoan Espasa, Ian Miguel, Peter Nightingale et al.
We study a planning problem based on Plotting, a tile-matching puzzle video game published by Taito in 1989. The objective of this game is to remove a target number of coloured blocks from a grid by sequentially shooting blocks into the grid. Plotting features complex transitions after every shot: various blocks are affected directly, while others can be indirectly affected by gravity. We highlight the challenges of modelling Plotting with PDDL and of solving it with a grounding-based state-of-the-art planner.
AIOct 2, 2023
A Good Snowman is Hard to PlanMiquel Bofill, Cristina Borralleras, Joan Espasa et al.
In this work we face a challenging puzzle video game: A Good Snowman is Hard to Build. The objective of the game is to build snowmen by moving and stacking snowballs on a discrete grid. For the sake of player engagement with the game, it is interesting to avoid that a player finds a much easier solution than the one the designer expected. Therefore, having tools that are able to certify the optimality of solutions is crucial. Although the game can be stated as a planning problem and can be naturally modelled in PDDL, we show that a direct translation to SAT clearly outperforms off-the-shelf state-of-the-art planners. As we show, this is mainly due to the fact that reachability properties can be easily modelled in SAT, allowing for shorter plans, whereas using axioms to express a reachability derived predicate in PDDL does not result in any significant reduction of solving time with the considered planners. We deal with a set of 51 levels, both original and crafted, solving 43 and with 8 challenging instances still remaining to be solved.
AIOct 2, 2023
On Grid Graph Reachability and Puzzle GamesMiquel Bofill, Cristina Borralleras, Joan Espasa et al.
Many puzzle video games, like Sokoban, involve moving some agent in a maze. The reachable locations are usually apparent for a human player, and the difficulty of the game is mainly related to performing actions on objects, such as pushing (reachable) boxes. For this reason, the difficulty of a particular level is often measured as the number of actions on objects, other than agent walking, needed to find a solution. In this paper we study CP and SAT approaches for solving these kind of problems. We review some reachability encodings and propose a new one. We empirically show that the new encoding is well-suited for solving puzzle problems in the planning as SAT paradigm, especially when considering the execution of several actions in parallel.
AIOct 2, 2023
Towards Automatic Design of Factorio BlueprintsSean Patterson, Joan Espasa, Mun See Chang et al.
Factorio is a 2D construction and management simulation video game about building automated factories to produce items of increasing complexity. A core feature of the game is its blueprint system, which allows players to easily save and replicate parts of their designs. Blueprints can reproduce any layout of objects in the game, but are typically used to encapsulate a complex behaviour, such as the production of a non-basic object. Once created, these blueprints are then used as basic building blocks, allowing the player to create a layer of abstraction. The usage of blueprints not only eases the expansion of the factory but also allows the sharing of designs with the game's community. The layout in a blueprint can be optimised using various criteria, such as the total space used or the final production throughput. The design of an optimal blueprint is a hard combinatorial problem, interleaving elements of many well-studied problems such as bin-packing, routing or network design. This work presents a new challenging problem and explores the feasibility of a constraint model to optimise Factorio blueprints, balancing correctness, optimality, and performance.
AIFeb 16
Removing Planner Bias in Goal Recognition Through Multi-Plan Dataset GenerationMustafa F. Abdelwahed, Felipe Meneguzzi Kin Max Piamolini Gusmao, Joan Espasa
Autonomous agents require some form of goal and plan recognition to interact in multiagent settings. Unfortunately, all existing goal recognition datasets suffer from a systematical bias induced by the planning systems that generated them, namely heuristic-based forward search. This means that existing datasets lack enough challenge for more realistic scenarios (e.g., agents using different planners), which impacts the evaluation of goal recognisers with respect to using different planners for the same goal. In this paper, we propose a new method that uses top-k planning to generate multiple, different, plans for the same goal hypothesis, yielding benchmarks that mitigate the bias found in the current dataset. This allows us to introduce a new metric called Version Coverage Score (VCS) to measure the resilience of the goal recogniser when inferring a goal based on different sets of plans. Our results show that the resilience of the current state-of-the-art goal recogniser degrades substantially under low observability settings.
AIMay 7, 2024
Behaviour Planning: A Toolkit for Diverse PlanningMustafa F Abdelwahed, Joan Espasa, Alice Toniolo et al.
Diverse planning approaches are utilised in real-world applications like risk management, automated streamed data analysis, and malware detection. The current diverse planning formulations encode the diversity model as a distance function, which is computational inexpensive when comparing two plans. However, such modelling approach limits what can be encoded as measure of diversity, as well as the ability to explain why two plans are different. This paper introduces a novel approach to the diverse planning problem, allowing for more expressive modelling of diversity using a n-dimensional grid representation, where each dimension corresponds to a user-defined feature. Furthermore, we present a novel toolkit that generates diverse plans based on such customisable diversity models, called \emph{Behaviour Planning}. We provide an implementation for behaviour planning using planning-as-satisfiability. An empirical evaluation of our implementation shows that behaviour planning significantly outperforms the current diverse planning method in generating diverse plans measured on our new customisable diversity models. Our implementation is the first diverse planning approach to support planning categories beyond classical planning, such as over-subscription and numerical planning.
AIOct 20, 2025
Diverse Planning with Simulators via Linear Temporal LogicMustafa F. Abdelwahed, Alice Toniolo, Joan Espasa et al.
Autonomous agents rely on automated planning algorithms to achieve their objectives. Simulation-based planning offers a significant advantage over declarative models in modelling complex environments. However, relying solely on a planner that produces a single plan may not be practical, as the generated plans may not always satisfy the agent's preferences. To address this limitation, we introduce $\texttt{FBI}_\texttt{LTL}$, a diverse planner explicitly designed for simulation-based planning problems. $\texttt{FBI}_\texttt{LTL}$ utilises Linear Temporal Logic (LTL) to define semantic diversity criteria, enabling agents to specify what constitutes meaningfully different plans. By integrating these LTL-based diversity models directly into the search process, $\texttt{FBI}_\texttt{LTL}$ ensures the generation of semantically diverse plans, addressing a critical limitation of existing diverse planning approaches that may produce syntactically different but semantically identical solutions. Extensive evaluations on various benchmarks consistently demonstrate that $\texttt{FBI}_\texttt{LTL}$ generates more diverse plans compared to a baseline approach. This work establishes the feasibility of semantically-guided diverse planning in simulation-based environments, paving the way for innovative approaches in realistic, non-symbolic domains where traditional model-based approaches fail.
AIOct 27, 2021
A Preliminary Case Study of Planning With Complex Transitions: PlottingJordi Coll, Joan Espasa, Ian Miguel et al.
Plotting is a tile-matching puzzle video game published by Taito in 1989. Its objective is to reduce a given grid of coloured blocks down to a goal number or fewer. This is achieved by the avatar character repeatedly shooting the block it holds into the grid. Plotting is an example of a planning problem: given a model of the environment, a planning problem asks us to find a sequence of actions that can lead from an initial state of the environment to a given goal state while respecting some constraints. The key difficulty in modelling Plotting is in capturing the way the puzzle state changes after each shot. A single shot can affect multiple tiles directly, and the grid is affected by gravity so numerous other tiles can be affected indirectly. We present and evaluate a constraint model of the Plotting problem that captures this complexity. We also discuss the difficulties and inefficiencies of modelling Plotting in PDDL, the standard language used for input to specialised AI planners. We conclude by arguing that AI planning could benefit from a richer modelling language.
AIApr 30, 2021
Using Small MUSes to Explain How to Solve Pen and Paper PuzzlesJoan Espasa, Ian P. Gent, Ruth Hoffmann et al.
In this paper, we present Demystify, a general tool for creating human-interpretable step-by-step explanations of how to solve a wide range of pen and paper puzzles from a high-level logical description. Demystify is based on Minimal Unsatisfiable Subsets (MUSes), which allow Demystify to solve puzzles as a series of logical deductions by identifying which parts of the puzzle are required to progress. This paper makes three contributions over previous work. First, we provide a generic input language, based on the Essence constraint language, which allows us to easily use MUSes to solve a much wider range of pen and paper puzzles. Second, we demonstrate that the explanations that Demystify produces match those provided by humans by comparing our results with those provided independently by puzzle experts on a range of puzzles. We compare Demystify to published guides for solving a range of different pen and paper puzzles and show that by using MUSes, Demystify produces solving strategies which closely match human-produced guides to solving those same puzzles (on average 89% of the time). Finally, we introduce a new randomised algorithm to find MUSes for more difficult puzzles. This algorithm is focused on optimised search for individual small MUSes.
AISep 21, 2020
Exploring Instance Generation for Automated PlanningÖzgür Akgün, Nguyen Dang, Joan Espasa et al.
Many of the core disciplines of artificial intelligence have sets of standard benchmark problems well known and widely used by the community when developing new algorithms. Constraint programming and automated planning are examples of these areas, where the behaviour of a new algorithm is measured by how it performs on these instances. Typically the efficiency of each solving method varies not only between problems, but also between instances of the same problem. Therefore, having a diverse set of instances is crucial to be able to effectively evaluate a new solving method. Current methods for automatic generation of instances for Constraint Programming problems start with a declarative model and search for instances with some desired attributes, such as hardness or size. We first explore the difficulties of adapting this approach to generate instances starting from problem specifications written in PDDL, the de-facto standard language of the automated planning community. We then propose a new approach where the whole planning problem description is modelled using Essence, an abstract modelling language that allows expressing high-level structures without committing to a particular low level representation in PDDL.