Abdallah Saffidine

AI
h-index 27
16 papers
68 citations
Novelty 51%
AI Score 40

16 Papers

AI Oct 26, 2022
HSVI can solve zero-sum Partially Observable Stochastic Games

Aurélien Delage, Olivier Buffet, Jilles S. Dibangoye et al.

State-of-the-art methods for solving 2-player zero-sum imperfect information games rely on linear programming or regret minimization, though not on dynamic programming (DP) or heuristic search (HS), while the latter are often at the core of state-of-the-art solvers for other sequential decision-making problems. In partially observable or collaborative settings (e.g., POMDPs and Dec-POMDPs), DP and HS require introducing an appropriate statistic that induces a fully observable problem as well as bounding (convex) approximators of the optimal value function. This approach has succeeded in some subclasses of 2-player zero-sum partially observable stochastic games (zs-POSGs) as well, but how to apply it in the general case remains an open question. We answer it by (i) rigorously defining an equivalent game to work with, (ii) proving mathematical properties of the optimal value function that allow deriving bounds that come with solution strategies, (iii) proposing for the first time an HSVI-like solver that provably converges to an $ε$-optimal solution in finite time, and (iv) empirically analyzing it. This opens the door to a novel family of promising approaches complementing those relying on linear programming or iterative methods.

AI Jan 18, 2023
Implicit State and Goals in QBF Encodings for Positional Games (extended version)

Irfansha Shaik, Valentin Mayer-Eichberger, Jaco van de Pol et al.

We address two bottlenecks for concise QBF encodings of maker-breaker positional games, like Hex and Tic-Tac-Toe. Our baseline is a QBF encoding with explicit variables for board positions and an explicit representation of winning configurations. The first improvement is inspired by lifted planning and avoids variables for explicit board positions, introducing a universal quantifier representing a symbolic board state. The second improvement represents the winning configurations implicitly, exploiting their structure. The paper evaluates the size of several encodings, depending on board size and game depth. It also reports the performance of QBF solvers on these encodings. We evaluate the techniques on Hex instances and also apply them to Harary's Tic-Tac-Toe. In particular, we study scalability to 19$\times$19 boards, played in human Hex tournaments.

AI Feb 26, 2023
Towards Tackling MaxSAT by Combining Nested Monte Carlo with Local Search

Hui Wang, Abdallah Saffidine, Tristan Cazenave

Recent work proposed the UCTMAXSAT algorithm to address Maximum Satisfiability Problems (MaxSAT) and showed improved performance over pure Stochastic Local Search algorithms (SLS). UCTMAXSAT is based on Monte Carlo Tree Search but it uses SLS instead of purely random playouts. In this work, we introduce two algorithmic variations over UCTMAXSAT. We carry out an empirical analysis on MaxSAT benchmarks from recent competitions and establish that both ideas lead to performance improvements. First, a nesting of the tree search inspired by the Nested Monte Carlo Search algorithm is effective on most instance types in the benchmark. Second, we observe that, when using a static flip limit in SLS, the ideal budget depends heavily on the instance size, and we propose to set it dynamically. We show that it is a robust way to achieve comparable performance on a variety of instances without requiring additional tuning.
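
To make the flip-limit idea concrete, here is a minimal sketch of a WalkSAT-style local search for MaxSAT whose flip budget scales with the number of variables. It is not UCTMAXSAT and not the paper's rule: the abstract does not say how the limit is set, so the names `sls_maxsat`, `flips_per_var`, and `noise`, and the linear scaling itself, are assumptions made only for illustration.

```python
import random


def is_sat(clause, assign):
    """A clause is a list of non-zero ints; literal v > 0 means variable v is true."""
    return any((lit > 0) == assign[abs(lit)] for lit in clause)


def num_unsat(clauses, assign):
    """Number of clauses left unsatisfied by the assignment (the MaxSAT cost)."""
    return sum(not is_sat(c, assign) for c in clauses)


def cost_if_flipped(clauses, assign, var):
    """Cost after flipping `var`, computed by flipping and then restoring it."""
    assign[var] = not assign[var]
    cost = num_unsat(clauses, assign)
    assign[var] = not assign[var]
    return cost


def sls_maxsat(clauses, n_vars, flips_per_var=10, noise=0.3, seed=0):
    rng = random.Random(seed)
    # Hypothetical dynamic rule: the flip budget grows linearly with instance size
    # instead of being one static constant shared by all instances.
    flip_limit = flips_per_var * n_vars
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    best, best_cost = dict(assign), num_unsat(clauses, assign)
    for _ in range(flip_limit):
        unsat = [c for c in clauses if not is_sat(c, assign)]
        if not unsat:
            break  # every clause is satisfied, nothing left to improve
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk step
        else:
            # Greedy step: flip the variable of the clause that leaves the lowest cost.
            var = min((abs(lit) for lit in clause),
                      key=lambda v: cost_if_flipped(clauses, assign, v))
        assign[var] = not assign[var]
        cost = num_unsat(clauses, assign)
        if cost < best_cost:
            best, best_cost = dict(assign), cost
    return best, best_cost
```

For instance, `sls_maxsat([[1, 2], [-1, 2], [1, -2], [-1, -2]], n_vars=2)` returns a cost of 1, since every assignment of that instance violates exactly one clause.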

AI Sep 22, 2023
Vision Transformers for Computer Go

Amani Sagri, Tristan Cazenave, Jérôme Arjonilla et al.

Motivated by the success of transformers in various fields, such as language understanding and image analysis, this investigation explores their application in the context of the game of Go. In particular, our study focuses on the analysis of the Transformer in Vision. Through a detailed analysis of numerous points such as prediction accuracy, win rates, memory, speed, size, or even learning rate, we have been able to highlight the substantial role that transformers can play in the game of Go. This study was carried out by comparing them to the usual Residual Networks.

19.5 CC Apr 14
Refuting the Direct Sum Conjecture for Total Functions in Deterministic Communication Complexity

Simon Mackenzie, Abdallah Saffidine

In communication complexity the input of a function $f:X\times Y\rightarrow Z$ is distributed between two players Alice and Bob. If Alice knows only $x\in X$ and Bob only $y\in Y$, how much information must Alice and Bob share to be able to elicit the value of $f(x,y)$? Do we need $\ell$ times more resources to solve $\ell$ instances of a problem? This question is the direct sum question and has been studied in many computational models. In this paper we focus on the case of 2-party deterministic communication complexity and give a counterexample to the direct sum conjecture in its strongest form. To do so we exhibit a family of functions for which the complexity of solving $\ell$ instances is less than $(1-ε)\ell$ times the complexity of solving one instance for some small enough $ε>0$. We use a customised method in the analysis of our family of total functions, showing that one can force the alternation of rounds between players. This idea allows us to exploit the integrality of the complexity measure to create an increasing gap between the complexity of solving the instances independently and that of solving them together.
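
In symbols, under notation assumed here (the abstract fixes none): write $D(f)$ for the deterministic communication complexity of $f$ and $f^{\ell}$ for the problem of solving $\ell$ independent instances of $f$. Solving the instances one after another always gives the first bound below; the abstract's claim for its family of total functions is the strict second bound.

```latex
D(f^{\ell}) \le \ell \cdot D(f)
\qquad\text{and}\qquad
D(f^{\ell}) < (1-ε)\,\ell \cdot D(f) \quad \text{for some } ε > 0 .
```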

SI Sep 3, 2024
Fair Railway Network Design

Zixu He, Sirin Botan, Jérôme Lang et al.

When designing a public transportation network in a country, one may want to minimise the sum of travel duration of all inhabitants. This corresponds to a purely utilitarian view and does not involve any fairness consideration, as the resulting network will typically benefit the capital city and/or large central cities while leaving some peripheral cities behind. On the other hand, a more egalitarian view will allow some people to travel between peripheral cities without having to go through a central city. We define a model, propose algorithms for computing solution networks, and report on experiments based on real data.

AI Aug 5, 2024
Perfect Information Monte Carlo with Postponing Reasoning

Jérôme Arjonilla, Abdallah Saffidine, Tristan Cazenave

Imperfect information games, such as Bridge and Skat, present challenges due to state-space explosion and hidden information, posing formidable obstacles for search algorithms. Determinization-based algorithms offer a resolution by sampling hidden information and solving the game in a perfect information setting, facilitating rapid and effective action estimation. However, transitioning to perfect information introduces challenges, notably one called strategy fusion. This research introduces 'Extended Perfect Information Monte Carlo' (EPIMC), an online algorithm inspired by the state-of-the-art determinization-based approach Perfect Information Monte Carlo (PIMC). EPIMC enhances the capabilities of PIMC by postponing the perfect information resolution, alleviating issues related to strategy fusion. However, the decision to postpone the leaf evaluator introduces novel considerations, such as the interplay between prior levels of reasoning and the newly deferred resolution. In our empirical analysis, we investigate the performance of EPIMC across a range of games, with a particular focus on those characterized by varying degrees of strategy fusion. Our results demonstrate notable performance enhancements, particularly in games where strategy fusion significantly impacts gameplay. Furthermore, our research contributes to the theoretical foundation of determinization-based algorithms addressing challenges associated with strategy fusion.
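
The determinization loop and the strategy fusion problem can be illustrated with a minimal sketch of plain PIMC (not EPIMC) on a toy one-decision game. The game, the names `WORLDS`, `perfect_info_value`, `pimc`, and `true_value_of_gamble`, and the payoffs are all hypothetical and chosen only to make the effect visible.

```python
import random

# Toy hidden information: a coin the player cannot see, uniformly distributed.
WORLDS = ["heads", "tails"]


def perfect_info_value(action, world):
    """Value of `action` for a solver that is told the hidden world."""
    if action == "safe":
        return 0.5  # fixed payoff, independent of the coin
    # "gamble": the player must later guess the coin, +1 if right, -1 if wrong.
    # With the world revealed, the solver always picks the right guess.
    return max(1.0 if guess == world else -1.0 for guess in WORLDS)


def pimc(actions, n_samples=100, seed=0):
    """Sample worlds, solve each with perfect information, average per action."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in actions}
    for _ in range(n_samples):
        world = rng.choice(WORLDS)  # one determinization
        for a in actions:
            totals[a] += perfect_info_value(a, world)
    estimates = {a: total / n_samples for a, total in totals.items()}
    return max(estimates, key=estimates.get), estimates


def true_value_of_gamble():
    """Best expected payoff when one guess must be used across all worlds."""
    return max(
        sum(1.0 if guess == w else -1.0 for w in WORLDS) / len(WORLDS)
        for guess in WORLDS
    )
```

Here `pimc(["safe", "gamble"])` estimates 1.0 for "gamble" and 0.5 for "safe", so it picks "gamble", while `true_value_of_gamble()` is 0.0: the solver fuses a different guess per sampled world into one strategy that the real player, who cannot see the coin, is unable to play.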

AI Aug 19, 2024
Enhancing Reinforcement Learning Through Guided Search

Jérôme Arjonilla, Abdallah Saffidine, Tristan Cazenave

With the aim of improving performance in Markov Decision Problems in an Off-Policy setting, we suggest taking inspiration from what is done in Offline Reinforcement Learning (RL). In Offline RL, it is a common practice during policy learning to maintain proximity to a reference policy to mitigate uncertainty, reduce potential policy errors, and help improve performance. We find ourselves in a different setting, yet it raises questions about whether a similar concept can be applied to enhance performance, i.e., whether it is possible to find a guiding policy capable of contributing to performance improvement, and how to incorporate it into our RL agent. Our attention is particularly focused on algorithms based on Monte Carlo Tree Search (MCTS) as a guide. MCTS, renowned for its state-of-the-art capabilities across various domains, catches our interest due to its ability to converge to equilibrium in single-player and two-player contexts. By harnessing the power of MCTS as a guide for our RL agent, we observed a significant performance improvement, surpassing the outcomes achieved by utilizing each method in isolation. Our experiments were carried out on the Atari 100k benchmark.

AI May 23, 2024
Mixture of Public and Private Distributions in Imperfect Information Games

Jérôme Arjonilla, Abdallah Saffidine, Tristan Cazenave

In imperfect information games (e.g., Bridge, Skat, Poker), one of the fundamental considerations is to infer the missing information while at the same time avoiding the disclosure of private information. Disregarding the issue of protecting private information can lead to highly exploitable performance. Yet, excessive attention to it leads to hesitations that are no longer consistent with our private information. In our work, we show that to improve performance, one must choose whether to use a player's private information. We extend our work by proposing a new belief distribution depending on the amount of private and public information desired. We empirically demonstrate an increase in performance and show that, to improve performance further, the new distribution should be used according to the position in the game. Our experiments have been done on multiple benchmarks and in multiple determinization-based algorithms (PIMC and IS-MCTS).

AI May 23, 2024
Deep Reinforcement Learning for 5*5 Multiplayer Go

Brahim Driss, Jérôme Arjonilla, Hui Wang et al.

In recent years, much progress has been made in computer Go and most of the results have been obtained thanks to search algorithms (Monte Carlo Tree Search) and Deep Reinforcement Learning (DRL). In this paper, we propose to use and analyze the latest algorithms that use search and DRL (AlphaZero and Descent algorithms) to automatically learn to play an extended version of the game of Go with more than two players. We show that using search and DRL we were able to improve the level of play, even though there are more than two players.

AI Jun 29, 2020
On Bellman's Optimality Principle for zs-POSGs

Olivier Buffet, Jilles Dibangoye, Aurélien Delage et al.

Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem. Here we show how it can apply to (infinite horizon) 2-player zero-sum partially observable stochastic games (zs-POSGs) by (i) taking a central planner's viewpoint, which can only reason on a sufficient statistic called occupancy state, and (ii) turning such problems into zero-sum occupancy Markov games (zs-OMGs). Then, exploiting the Lipschitz-continuity of the value function in occupancy space, one can derive a version of the HSVI algorithm (Heuristic Search Value Iteration) that provably finds an $ε$-Nash equilibrium in finite time.

LO May 11, 2020
Positional Games and QBF: A Polished Encoding

Valentin Mayer-Eichberger, Abdallah Saffidine

Positional games are a mathematical class of two-player games comprising Tic-tac-toe and its generalizations. We propose a novel encoding of these games into Quantified Boolean Formulas (QBFs) such that a game instance admits a winning strategy for the first player if and only if the corresponding formula is true. Our approach improves over previous QBF encodings of games in multiple ways. First, it is generic and lets us encode other positional games, such as Hex. Second, the structural properties of positional games, together with careful treatment of illegal moves, let us generate more compact instances that can be solved faster by state-of-the-art QBF solvers. We establish the latter fact through extensive experiments. Finally, the compactness of our new encoding makes it feasible to translate realistic game problems. We identify a few such problems of historical significance and put them forward to the QBF community as milestones of increasing difficulty.

AI May 31, 2019
Foundations of Digital Archæoludology

Cameron Browne, Dennis J. N. J. Soemers, Éric Piette et al.

Digital Archaeoludology (DAL) is a new field of study involving the analysis and reconstruction of ancient games from incomplete descriptions and archaeological evidence using modern computational techniques. The aim is to provide digital tools and methods to help game historians and other researchers better understand traditional games, their development throughout recorded human history, and their relationship to the development of human culture and mathematical knowledge. This work is being explored in the ERC-funded Digital Ludeme Project. The aim of this inaugural international research meeting on DAL is to gather together leading experts in relevant disciplines - computer science, artificial intelligence, machine learning, computational phylogenetics, mathematics, history, archaeology, anthropology, etc. - to discuss the key themes and establish the foundations for this new field of research, so that it may continue beyond the lifetime of its initiating project.

AI May 8, 2018
The Complexity of Limited Belief Reasoning -- The Quantifier-Free Case

Yijia Chen, Abdallah Saffidine, Christoph Schwering

The classical view of epistemic logic is that an agent knows all the logical consequences of their knowledge base. This assumption of logical omniscience is often unrealistic and makes reasoning computationally intractable. One approach to avoid logical omniscience is to limit reasoning to a certain belief level, which intuitively measures the reasoning "depth." This paper investigates the computational complexity of reasoning with belief levels. First we show that while reasoning remains tractable if the level is constant, the complexity jumps to PSPACE-complete -- that is, beyond classical reasoning -- when the belief level is part of the input. Then we further refine the picture using parameterized complexity theory to investigate how the belief level and the number of non-logical symbols affect the complexity.

CC May 16, 2016
On the Complexity of Connection Games

Édouard Bonnet, Florian Jamain, Abdallah Saffidine

In this paper, we study three connection games among the most widely played: Havannah, Twixt, and Slither. We show that determining the outcome of an arbitrary input position is PSPACE-complete in all three cases. Our reductions are based on the popular graph problem Generalized Geography and on Hex itself. We also consider the complexity of generalizations of Hex parameterized by the length of the solution and establish that while Short Generalized Hex is W[1]-hard, Short Hex is FPT. Finally, we prove that the ultra-weak solution to the empty starting position in Hex cannot be fully adapted to any of these three games.

AI Jul 8, 2012
Minimal Proof Search for Modal Logic K Model Checking

Abdallah Saffidine

Most modal logics such as S5, LTL, or ATL are extensions of Modal Logic K. While the model checking problems for LTL and to a lesser extent ATL have been very active research areas for the past decades, the model checking problem for the more basic Multi-agent Modal Logic K (MMLK) has important applications as a formal framework for perfect information multi-player games on its own. We present Minimal Proof Search (MPS), an effort-number-based algorithm solving the model checking problem for MMLK. We prove two important properties for MPS beyond its correctness. The (dis)proof exhibited by MPS is of minimal cost for a general definition of cost, and MPS is an optimal algorithm for finding (dis)proofs of minimal cost. Optimality means that any comparable algorithm either needs to explore a state space at least as large as that of MPS, or is not guaranteed to find a (dis)proof of minimal cost on every input. As such, our work relates to A* and AO* in heuristic search, to Proof Number Search and DFPN+ in two-player games, and to counterexample minimization in software model checking.