Stefan Edelkamp

AI
h-index: 3
14 papers
52 citations
Novelty: 41%
AI Score: 47

14 Papers

LG | Sep 12, 2022
A Differentiable Loss Function for Learning Heuristics in A*

Leah Chrestien, Tomas Pevny, Antonin Komenda et al.

Optimization of heuristic functions for the A* algorithm, realized by deep neural networks, is usually done by minimizing the square root loss of the estimated cost-to-goal values. This paper argues that this does not necessarily lead to a faster A* search, since the algorithm's execution relies on relative values instead of absolute ones. As a mitigation, we propose an L* loss, which upper-bounds the number of excessively expanded states inside the A* search. The L* loss, when used in the optimization of state-of-the-art deep neural networks for automated planning in maze domains like Sokoban and maze with teleports, significantly improves the fraction of solved problems and the quality of the plans found, and reduces the number of expanded states to approximately 50%.

9.5 | QUANT-PH | May 17
Imperfect-Information Games on Quantum Computers: A Case Study in Skat

Ulrich Armbrüster, Stefan Edelkamp, Gabriel Maresch et al.

It has been known for decades that quantum computers might serve as a tool to solve a very specific kind of problem that has long been thought to be incalculable. Some of those problems are of a combinatorial nature, with the quantum advantage arising from the exploding size of a huge decision tree. Although this is of high interest as well, there are more opportunities to make use of the quantum advantage among imperfect-information games with a limited number of steps within the game. Even though it is not possible to determine the winning move in a specific situation, people are rather interested in which choice gives the best outcome in the long run. This leads us to the search for the highest number of paths within the game's decision tree despite the lack of information and, thus, to a maximum of the payoff function. We illustrate how quantum computers can play a significant role in solving these kinds of games, using the example of the most popular German card game, Skat. To this end, we use quantum registers to encode the game's information properly and construct the corresponding quantum gates in order to model the game progress and obey the rules. Finally, we use a score operator to project the quantum state onto the winning subspace and thereby evaluate the winning probability for each alternative decision to be made by the player, using quantum algorithms such as quantum counting of the winning paths to gain a possible advantage in computation speed over classical approaches. Thus, we get a reasonable recommendation of how to act at the table through payoff-function maximization. This approach is clearly not doable on a classical computer due to the huge tree-search problem, and we discuss peculiarities of the problem that may lead to a quantum advantage when exceeding a certain problem size.

AI | Oct 30, 2023
Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal

Leah Chrestien, Tomás Pevný, Stefan Edelkamp et al.

In imitation learning for planning, parameters of heuristic functions are optimized against a set of solved problem instances. This work revisits the necessary and sufficient conditions of strictly optimally efficient heuristics for forward search algorithms, mainly A* and greedy best-first search, which expand only states on the returned optimal path. It then proposes a family of loss functions based on ranking tailored for a given variant of the forward search algorithm. Furthermore, from a learning theory point of view, it discusses why optimizing cost-to-goal $h^*$ is unnecessarily difficult. The experimental comparison on a diverse set of problems unequivocally supports the derived theory.
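The contrast between ranking and cost-to-goal regression can be illustrated with a small sketch. It is not the loss family from the paper: it assumes greedy best-first search (ordering by the heuristic alone), compares every on-path state with every off-path state rather than only states that share the open list, and uses a generic pairwise hinge loss. All names and numbers (`pairwise_ranking_loss`, `H_STAR_ON`, the toy heuristic values) are hypothetical.

```python
def mean_squared_error(predicted, target):
    """Regression loss against the true cost-to-goal values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def pairwise_ranking_loss(h_on_path, h_off_path, margin=1.0):
    """Hinge loss that is zero when every on-path state is ranked
    below every off-path state by at least `margin`."""
    pairs = [(good, bad) for good in h_on_path for bad in h_off_path]
    return sum(max(0.0, margin + good - bad) for good, bad in pairs) / len(pairs)


# Toy data (assumed): true cost-to-goal of states on and off the optimal path.
H_STAR_ON, H_STAR_OFF = [2.0, 1.0], [3.0, 4.0]

# Heuristic A: badly scaled values, but a perfect ordering.
scaled_on, scaled_off = [20.0, 10.0], [30.0, 40.0]
# Heuristic B: values close to h*, but one on-path state is misranked.
close_on, close_off = [2.6, 1.0], [2.4, 4.0]

mse_scaled = mean_squared_error(scaled_on + scaled_off, H_STAR_ON + H_STAR_OFF)
rank_scaled = pairwise_ranking_loss(scaled_on, scaled_off)
mse_close = mean_squared_error(close_on + close_off, H_STAR_ON + H_STAR_OFF)
rank_close = pairwise_ranking_loss(close_on, close_off)

print(f"A: mse={mse_scaled:.2f} ranking={rank_scaled:.2f}")
print(f"B: mse={mse_close:.2f} ranking={rank_close:.2f}")
```

In this toy setting heuristic A has a large regression error yet a zero ranking loss, while heuristic B is numerically close to $h^*$ but is still penalized for the misordered pair.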

RO | Nov 6, 2023
CLIP-Motion: Learning Reward Functions for Robotic Actions Using Consecutive Observations

Xuzhe Dang, Stefan Edelkamp

This paper presents a novel method for learning reward functions for robotic motions by harnessing the power of a CLIP-based model. Traditional reward function design often hinges on manual feature engineering, which can struggle to generalize across an array of tasks. Our approach circumvents this challenge by capitalizing on CLIP's capability to process both state features and image inputs effectively. Given a pair of consecutive observations, our model excels in identifying the motion executed between them. We showcase results spanning various robotic activities, such as directing a gripper to a designated target and adjusting the position of a cube. Through experimental evaluations, we underline the proficiency of our method in precisely deducing motion and its promise to enhance reinforcement learning training in the realm of robotics.

RO | Jan 29, 2025
Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching

Xuzhe Dang, Lada Kudláčková, Stefan Edelkamp

Automating the generation of Planning Domain Definition Language (PDDL) with Large Language Models (LLMs) opens a new research topic in AI planning, particularly for complex real-world tasks. This paper introduces Image2PDDL, a novel framework that leverages Vision-Language Models (VLMs) to automatically convert images of initial states and descriptions of goal states into PDDL problems. By providing a PDDL domain alongside visual inputs, Image2PDDL addresses key challenges in bridging perceptual understanding with symbolic planning, reducing the expertise required to create structured problem instances, and improving scalability across tasks of varying complexity. We evaluate the framework on various domains, including standard planning domains like blocksworld and sliding tile puzzles, using datasets with multiple difficulty levels. Performance is assessed on syntax correctness, ensuring grammar and executability, and content correctness, verifying accurate state representation in generated PDDL problems. The proposed approach demonstrates promising results across diverse task complexities, suggesting its potential for broader applications in AI planning. We also discuss a potential use case in robot-assisted teaching of students with Autism Spectrum Disorder.

MA | Dec 17, 2025
Solving Multi-Agent Multi-Goal Path Finding Problems in Polynomial Time

Stefan Edelkamp

In this paper, we plan missions for a fleet of agents in undirected graphs, such as grids, with multiple goals. In contrast to regular multi-agent path-finding, the solver finds and updates the assignment of goals to the agents on its own. In the continuous case for a point agent with motions in the Euclidean plane, the problem can be solved arbitrarily close to optimal. For discrete variants that incur node and edge conflicts, we show that it can be solved in polynomial time, which is unexpected, since traditional vehicle routing on general graphs is NP-hard. We implement a corresponding planner that finds conflict-free optimized routes for the agents. Global assignment strategies greatly reduce the number of conflicts, with the remaining ones resolved by elaborating on the concept of ants-on-the-stick, by solving local assignment problems, by interleaving agent paths, and by kicking agents that have already arrived out of their destinations.

AI | Feb 9
Intermediate Results on the Complexity of STRIPS$_{1}^{1}$

Stefan Edelkamp, Jiří Fink, Petr Gregor et al.

This paper is based on Bylander's results on the computational complexity of propositional STRIPS planning. He showed that when only ground literals are permitted, determining plan existence is PSPACE-complete even if operators are limited to two preconditions and two postconditions. While NP-hardness is settled, it is unknown whether propositional STRIPS with operators that only have one precondition and one effect is NP-complete. We shed light on the question whether this small solution hypothesis for STRIPS$^1_1$ is true, calling a SAT solver for small instances, introducing the literal graph, and mapping it to Petri nets.
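To make the problem class concrete, here is a minimal sketch of plan existence for operators with exactly one precondition and one effect, solved by plain breadth-first search over explicit states. This is only an illustration under stated assumptions: it is exponential in the number of variables, it is not the SAT, literal-graph, or Petri-net approach mentioned in the abstract, and the state encoding, the function `shortest_plan`, and the toy operators are all hypothetical.

```python
from collections import deque


def shortest_plan(initial, goal, operators):
    """Breadth-first search for a shortest plan.

    initial:   tuple of bools, one per propositional variable
    goal:      dict {variable index: required value} (may be partial)
    operators: list of (name, (pre_var, pre_val), (eff_var, eff_val)),
               i.e. exactly one precondition and one effect literal each
    Returns a list of operator names, or None if no plan exists.
    """
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, plan = queue.popleft()
        if all(state[var] == val for var, val in goal.items()):
            return plan
        for name, (pre_var, pre_val), (eff_var, eff_val) in operators:
            applicable = state[pre_var] == pre_val
            changes_state = state[eff_var] != eff_val
            if applicable and changes_state:
                successor = state[:eff_var] + (eff_val,) + state[eff_var + 1:]
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, plan + [name]))
    return None


# Toy instance (assumed): variables a = 0 and b = 1, both initially false.
solvable_ops = [
    ("set_a_if_not_b", (1, False), (0, True)),
    ("set_b_if_a", (0, True), (1, True)),
]
# Each operator here blocks the other, so both variables can never be true.
unsolvable_ops = [
    ("set_a_if_not_b", (1, False), (0, True)),
    ("set_b_if_not_a", (0, False), (1, True)),
]
goal = {0: True, 1: True}

print(shortest_plan((False, False), goal, solvable_ops))
print(shortest_plan((False, False), goal, unsolvable_ops))
```

Exhaustive search of this kind is only practical for tiny instances, but it can serve as a reference when checking how long shortest plans become.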

AI | Dec 17, 2025
Outer-Learning Framework for Playing Multi-Player Trick-Taking Card Games: A Case Study in Skat

Stefan Edelkamp

In multi-player card games such as Skat or Bridge, the early stages of the game, such as bidding, game selection, and initial card selection, are often more critical to the success of the play than refined middle- and end-game play. At the current limits of computation, such early decision-making resorts to using statistical information derived from a large corpus of human expert games. In this paper, we derive and evaluate a general bootstrapping outer-learning framework that improves prediction accuracy by expanding the database of human games with millions of self-playing AI games to generate and merge statistics. We implement perfect feature hash functions to address compacted tables, producing a self-improving card game engine, where newly inferred knowledge is continuously improved during self-learning. The case study in Skat shows that the automated approach can be used to support various decisions in the game.

AI | Dec 3, 2021
Heuristic Search Planning with Deep Neural Networks using Imitation, Attention and Curriculum Learning

Leah Chrestien, Tomas Pevny, Antonin Komenda et al.

Learning a well-informed heuristic function for hard task planning domains is an elusive problem. Although there are known neural network architectures to represent such heuristic knowledge, it is not obvious what concrete information is learned and whether techniques aimed at understanding the structure help in improving the quality of the heuristics. This paper presents a network model to learn a heuristic capable of relating distant parts of the state space via optimal plan imitation using the attention mechanism, which drastically improves the learning of a good heuristic function. To counter the limitation of the method in the creation of problems of increasing difficulty, we demonstrate the use of curriculum learning, where newly solved problem instances are added to the training set, which, in turn, helps to solve problems of higher complexities and far exceeds the performances of all existing baselines including classical planning heuristics. We demonstrate its effectiveness for grid-type PDDL domains.

AI | Apr 7, 2021
Knowledge-Based Paranoia Search in Trick-Taking

Stefan Edelkamp

This paper proposes knowledge-based paranoia search (KBPS) to find forced wins during trick-taking in the card game Skat, which is, for some, one of the most interesting card games for three players. It combines efficient partial information game-tree search with knowledge representation and reasoning. This worst-case analysis, initiated after a small number of tricks, leads to a prioritized choice of cards. We provide variants of KBPS for the declarer and the opponents, and an approximation to find a forced win against most worlds in the belief space. Replaying thousands of expert games, our evaluation indicates that the AIs with the new algorithms perform better than humans in their play, achieving an average score of over 1,000 points in the agreed standard for evaluating Skat tournaments, the extended Seeger system.

AI | Apr 7, 2021
On the Power of Refined Skat Selection

Stefan Edelkamp

Skat is a fascinating combinatorial card game, showcasing many of the intrinsic challenges for modern AI systems such as cooperative and adversarial behaviors (among the players), randomness (in the deal), and partial knowledge (due to hidden cards). Given the larger number of tricks and higher degree of uncertainty, reinforcement learning is less effective compared to classical board games like Chess and Go. As in the game of Bridge, in Skat we have a bidding and a trick-taking stage. Prior to the trick-taking and as part of the bidding process, one phase in the game is to select two skat cards, whose quality may influence subsequent playing performance drastically. This paper looks into different skat selection strategies. Besides predicting the probability of winning and other hand strength functions, we propose hard expert rules and scoring functions based on refined skat evaluation features. Experiments emphasize the impact of the refined skat putting algorithm on the playing performance of the bots, especially for AI bidding and AI game selection.

AI | Apr 7, 2021
ELO System for Skat and Other Games of Chance

Stefan Edelkamp

Assessing the skill level of players to predict the outcome and to rank the players in a longer series of games is of critical importance for tournament play. Despite weaknesses, such as an observed continuous inflation caused by a steadily increasing playing body, the ELO ranking system, named after its creator Arpad Elo, has proven to be a reliable method for calculating the relative skill levels of players in zero-sum games. The evaluation of player strength in trick-taking card games like Skat or Bridge, however, is not obvious. Firstly, these are incomplete-information, partially observable games with more than one player, where opponent strength should influence the scoring as it does in existing ELO systems. Secondly, they are games of both skill and chance, so that besides the playing strength the outcome of a game also depends on the deal. Last but not least, there are internationally established scoring systems by which the players are used to being evaluated, and to which ELO should align. Based on a tournament scoring system, we propose a new ELO system for Skat to overcome these weaknesses.
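For reference, the classical two-player Elo update that such systems build on can be sketched as follows. This is the standard logistic formula with the conventional scale of 400 and an assumed K-factor of 32; it is not the Skat-specific system proposed in the paper, and the function names are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return the new ratings after one game.

    score_a is 1.0 for a win of A, 0.5 for a draw, 0.0 for a loss.
    The K-factor k = 32 is an assumed, commonly used value.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # The game is zero-sum, so B loses exactly what A gains.
    return rating_a + delta, rating_b - delta


print(update_ratings(1500.0, 1500.0, 1.0))
print(round(expected_score(1600.0, 1500.0), 3))
```

Because the update is zero-sum, the rating total of the two players is preserved; rating inflation of the kind mentioned above arises from how players enter and leave the pool, not from individual updates.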

DS | Dec 26, 2013
Proceedings 2nd Workshop on GRAPH Inspection and Traversal Engineering

Anton Wijs, Dragan Bošnački, Stefan Edelkamp

These are the proceedings of the Second Workshop on GRAPH Inspection and Traversal Engineering (GRAPHITE 2013), which took place on March 24, 2013 in Rome, Italy, as a satellite event of the 16th European Joint Conferences on Theory and Practice of Software (ETAPS 2013). The topic of the GRAPHITE workshop is graph analysis in all its forms in computer science. Graphs are used to represent data in many application areas, and they are subjected to various computational algorithms in order to acquire the desired information. These graph algorithms tend to have common characteristics, such as duplicate detection to guarantee their termination, independent of their application domain. Over the past few years, it has been shown that the scalability of such algorithms can be dramatically improved by using, e.g., external memory, by exploiting parallel architectures, such as clusters, multi-core CPUs, and graphics processing units, and by using heuristics to guide the search. Novel techniques to further scale graph search algorithms, and new applications of graph search are within the scope of this workshop. Another topic of interest of the event is more related to the structural properties of graphs: which kind of graph characteristics are relevant for a particular application area, and how can these be measured? Finally, any novel way of using graphs for a particular application area is on topic. The goal of this event is to gather scientists from different communities, such as model checking, artificial intelligence planning, game playing, and algorithm engineering, who do research on graph search algorithms, such that awareness of each other's work is increased.

AI | Oct 24, 2012
Lex-Partitioning: A New Option for BDD Search

Stefan Edelkamp, Peter Kissmann, Álvaro Torralba

For the exploration of large state spaces, symbolic search using binary decision diagrams (BDDs) can save huge amounts of memory and computation time. State sets are represented and modified by accessing and manipulating their characteristic functions. BDD partitioning is used to compute the image as the disjunction of smaller subimages. In this paper, we propose a novel BDD partitioning option. The partitioning is lexicographical in the binary representation of the states contained in the set that is represented by a BDD and uniform with respect to the number of states represented. The motivation of controlling the state set sizes in the partitioning is to eventually bridge the gap between explicit and symbolic search. Let n be the size of the binary state vector. We propose an O(n) ranking and unranking scheme that supports negated edges and operates on top of precomputed satcount values. For the uniform split of a BDD, we then use unranking to provide paths along which we partition the BDDs. In a shared BDD representation the efforts are O(n). The algorithms are fully integrated in the CUDD library and evaluated in strongly solving general game playing benchmarks.
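The idea of ranking and unranking on top of satcount values can be sketched on a toy decision diagram. This is a simplified illustration, not the paper's implementation: it uses plain Python tuples instead of CUDD, omits negated edges, and memoizes satcounts instead of storing them in the nodes. The node layout `(var, low, high)`, the example function, and the names `satcount`, `unrank`, and `rank` are assumptions.

```python
from functools import lru_cache

N = 3  # number of bits in the binary state vector (toy size)

# A node is a bool terminal or a tuple (var, low, high), variables ordered 0..N-1.
# Toy state set: x0 OR (x1 AND x2). On the x0 = 1 branch, x1 and x2 are skipped.
X2 = (2, False, True)
X1 = (1, False, X2)
ROOT = (0, X1, True)


@lru_cache(maxsize=None)
def satcount(node, level):
    """Number of assignments to variables level..N-1 accepted by node."""
    if node is True:
        return 2 ** (N - level)
    if node is False:
        return 0
    var, low, high = node
    skipped = 2 ** (var - level)  # free variables between level and var
    return skipped * (satcount(low, var + 1) + satcount(high, var + 1))


def child(node, level, bit):
    """Successor for the variable at `level`; skipped variables keep the node."""
    if isinstance(node, tuple) and node[0] == level:
        return node[2] if bit else node[1]
    return node


def unrank(root, r):
    """Return the r-th (0-based) state of the set in lexicographic order."""
    if not 0 <= r < satcount(root, 0):
        raise IndexError("rank out of range")
    node, bits = root, []
    for level in range(N):
        low = child(node, level, 0)
        below = satcount(low, level + 1)  # states with this bit set to 0
        if r < below:
            bits.append(0)
            node = low
        else:
            bits.append(1)
            r -= below
            node = child(node, level, 1)
    return tuple(bits)


def rank(root, bits):
    """Inverse of unrank: lexicographic index of a state in the set."""
    node, r = root, 0
    for level, bit in enumerate(bits):
        if bit:
            r += satcount(child(node, level, 0), level + 1)
        node = child(node, level, bit)
    if node is not True:
        raise ValueError("state is not in the set")
    return r


total = satcount(ROOT, 0)
print([unrank(ROOT, r) for r in range(total)])
# A uniform split into two parts would cut at the state with rank total // 2.
print(unrank(ROOT, total // 2))
```

Each call to `unrank` or `rank` walks one path of at most `N` levels, which matches the O(n) bound once the satcounts are available; unranking evenly spaced ranks yields the boundary states for a uniform split.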