RONov 15, 2022
PARTNR: Pick and place Ambiguity Resolving by Trustworthy iNteractive leaRningJelle Luijkx, Zlatan Ajanovic, Laura Ferranti et al.
Several recent works show impressive results in mapping language-based human commands and image scene observations to direct robot executable policies (e.g., pick and place poses). However, these approaches do not consider the uncertainty of the trained policy and simply always execute actions suggested by the current policy as the most probable ones. This makes them vulnerable to domain shift and inefficient in the number of required demonstrations. We extend previous works and present the PARTNR algorithm that can detect ambiguities in the trained policy by analyzing multiple modalities in the pick and place poses using topological analysis. PARTNR employs an adaptive, sensitivity-based, gating function that decides if additional user demonstrations are required. User demonstrations are aggregated to the dataset and used for subsequent training. In this way, the policy can adapt promptly to domain shift and it can minimize the number of required demonstrations for a well-trained policy. The adaptive threshold enables to achieve the user-acceptable level of ambiguity to execute the policy autonomously and in turn, increase the trustworthiness of our system. We demonstrate the performance of PARTNR in a table-top pick and place task.
AIJun 7, 2024Code
SLOPE: Search with Learned Optimal Pruning-based ExpansionDavor Bokan, Zlatan Ajanovic, Bakir Lacevic
Heuristic search is often used for motion planning and pathfinding problems, for finding the shortest path in a graph while also promising completeness and optimal efficiency. The drawback is it's space complexity, specifically storing all expanded child nodes in memory and sorting large lists of active nodes, which can be a problem in real-time scenarios with limited on-board computation. To combat this, we present the Search with Learned Optimal Pruning-based Expansion (SLOPE), which, learns the distance of a node from a possible optimal path, unlike other approaches that learn a cost-to-go value. The unfavored nodes are then pruned according to the said distance, which in turn reduces the size of the open list. This ensures that the search explores only the region close to optimal paths while lowering memory and computational costs. Unlike traditional learning methods, our approach is orthogonal to estimating cost-to-go heuristics, offering a complementary strategy for improving search efficiency. We demonstrate the effectiveness of our approach evaluating it as a standalone search method and in conjunction with learned heuristic functions, achieving comparable-or-better node expansion metrics, while lowering the number of child nodes in the open list. Our code is available at https://github.com/dbokan1/SLOPE.
ROMar 9, 2020Code
Overview of Tools Supporting Planning for Automated DrivingKailin Tong, Zlatan Ajanovic, Georg Stettinger
Planning is an essential topic in the realm of automated driving. Besides planning algorithms that are widely covered in the literature, planning requires different software tools for its development, validation, and execution. This paper presents a survey of such tools including map representations, communication, traffic rules, open-source planning stacks and middleware, simulation, and visualization tools as well as benchmarks. We start by defining the planning task and different supporting tools. Next, we provide a comprehensive review of state-of-the-art developments and analysis of relations among them. Finally, we discuss the current gaps and suggest future research directions.
AIMay 7
Novelty-based Tree-of-Thought Search for LLM Reasoning and PlanningLeon Hamm, Zlatan Ajanovic
Although advances such as chain-of-thought, tree-of-thought or reinforcement learning have improved the performance of LLMs in reasoning and planning tasks, they are still brittle and have not achieved human-level performance in many domains, and often suffer from high time and token costs. Inspired by the success of width-based search in planning, we explore how the concept of novelty can be transferred to language domains and how it can improve tree-of-thought reasoning. A tree of thoughts relies on building possible "paths" of consecutive ideas or thoughts. These are generated by repeatedly prompting an LLM. In our paper, a measurable concept of novelty is proposed that describes the uniqueness of a new node (thought) in comparison to nodes previously seen in the search tree. Novelty is estimated by prompting an LLM and making use of embedded general knowledge from pre-training. This metric can then be used to prune branches and reduce the scope of the search. Although this method introduces more prompts per state, the overall token cost can be reduced by pruning and reducing the overall tree size. This procedure is tested and compared using several benchmarks in language-based planning and general reasoning.
ROApr 20
SYMBOLIZER: Symbolic Model-free Task Planning with VLMsSami Azirar, Zlatan Ajanovic, Hermann Blum
Traditional Task and Motion Planning (TAMP) systems depend on physics models for motion planning and discrete symbolic models for task planning. Although physics model are often available, symbolic models (consisting of symbolic state interpretation and action models) must be meticulously handcrafted or learned from labeled data. This process is both resource-intensive and constrains the solution to the specific domain, limiting scalability and adaptability. On the other hand, Visual Language Models (VLMs) show desirable zero-shot visual understanding (due to their extensive training on heterogeneous data), but still achieve limited planning capabilities. Therefore, integrating VLMs with classical planning for long-horizon reasoning in TAMP problems offers high potential. Recent works in this direction still lack generality and depend on handcrafted, task-specific solutions, e.g. describing all possible objects in advance, or using symbolic action models. We propose a framework that generalizes well to unseen problem instances. The method requires only lifted predicates describing relations among objects and uses VLMs to ground them from images to obtain the symbolic state. Planning is performed with domain-independent heuristic search using goal-count and width-based heuristics, without need for action models. Symbolic search over VLM-grounded state-space outperforms direct VLM-based planning and performs on par with approaches that use a VLM-derived heuristic. This shows that domain-independent search can effectively solve problems across domains with large combinatorial state spaces. We extensively evaluate on extensively evaluate our method and achieve state-of-the-art results on the ProDG and ViPlan benchmarks.
ROJul 1, 2025
Search-Based Robot Motion Planning With Distance-Based Adaptive Motion PrimitivesBenjamin Kraljusic, Zlatan Ajanovic, Nermin Covic et al.
This work proposes a motion planning algorithm for robotic manipulators that combines sampling-based and search-based planning methods. The core contribution of the proposed approach is the usage of burs of free configuration space (C-space) as adaptive motion primitives within the graph search algorithm. Due to their feature to adaptively expand in free C-space, burs enable more efficient exploration of the configuration space compared to fixed-sized motion primitives, significantly reducing the time to find a valid path and the number of required expansions. The algorithm is implemented within the existing SMPL (Search-Based Motion Planning Library) library and evaluated through a series of different scenarios involving manipulators with varying number of degrees-of-freedom (DoF) and environment complexity. Results demonstrate that the bur-based approach outperforms fixed-primitive planning in complex scenarios, particularly for high DoF manipulators, while achieving comparable performance in simpler scenarios.
ROJun 29, 2025
Mode Collapse Happens: Evaluating Critical Interactions in Joint Trajectory Prediction ModelsMaarten Hugenholtz, Anna Meszaros, Jens Kober et al.
Autonomous Vehicle decisions rely on multimodal prediction models that account for multiple route options and the inherent uncertainty in human behavior. However, models can suffer from mode collapse, where only the most likely mode is predicted, posing significant safety risks. While existing methods employ various strategies to generate diverse predictions, they often overlook the diversity in interaction modes among agents. Additionally, traditional metrics for evaluating prediction models are dataset-dependent and do not evaluate inter-agent interactions quantitatively. To our knowledge, none of the existing metrics explicitly evaluates mode collapse. In this paper, we propose a novel evaluation framework that assesses mode collapse in joint trajectory predictions, focusing on safety-critical interactions. We introduce metrics for mode collapse, mode correctness, and coverage, emphasizing the sequential dimension of predictions. By testing four multi-agent trajectory prediction models, we demonstrate that mode collapse indeed happens. When looking at the sequential dimension, although prediction accuracy improves closer to interaction events, there are still cases where the models are unable to predict the correct interaction mode, even just before the interaction mode becomes inevitable. We hope that our framework can help researchers gain new insights and advance the development of more consistent and accurate prediction models, thus enhancing the safety of autonomous driving systems.
ROJun 28, 2025
Scenario-Based Hierarchical Reinforcement Learning for Automated Driving Decision MakingM. Youssef Abdelhamid, Lennart Vater, Zlatan Ajanovic
Developing decision-making algorithms for highly automated driving systems remains challenging, since these systems have to operate safely in an open and complex environments. Reinforcement Learning (RL) approaches can learn comprehensive decision policies directly from experience and already show promising results in simple driving tasks. However, current approaches fail to achieve generalizability for more complex driving tasks and lack learning efficiency. Therefore, we present Scenario-based Automated Driving Reinforcement Learning (SAD-RL), the first framework that integrates Reinforcement Learning (RL) of hierarchical policy in a scenario-based environment. A high-level policy selects maneuver templates that are evaluated and executed by a low-level control logic. The scenario-based environment allows to control the training experience for the agent and to explicitly introduce challenging, but rate situations into the training process. Our experiments show that an agent trained using the SAD-RL framework can achieve safe behaviour in easy as well as challenging situations efficiently. Our ablation studies confirmed that both HRL and scenario diversity are essential for achieving these results.
ROJul 18, 2019
Search-Based Motion Planning for Performance Autonomous DrivingZlatan Ajanovic, Enrico Regolin, Georg Stettinger et al.
Driving on the limits of vehicle dynamics requires predictive planning of future vehicle states. In this work, a search-based motion planning is used to generate suitable reference trajectories of dynamic vehicle states with the goal to achieve the minimum lap time on slippery roads. The search-based approach enables to explicitly consider a nonlinear vehicle dynamics model as well as constraints on states and inputs so that even challenging scenarios can be achieved in a safe and optimal way. The algorithm performance is evaluated in simulated driving on a track with segments of different curvatures.
LGJun 6, 2019
A novel approach to model exploration for value function learningZlatan Ajanovic, Halil Beglerovic, Bakir Lacevic
Planning and Learning are complementary approaches. Planning relies on deliberative reasoning about the current state and sequence of future reachable states to solve the problem. Learning, on the other hand, is focused on improving system performance based on experience or available data. Learning to improve the performance of planning based on experience in similar, previously solved problems, is ongoing research. One approach is to learn Value function (cost-to-go) which can be used as heuristics for speeding up search-based planning. Existing approaches in this direction use the results of the previous search for learning the heuristics. In this work, we present a search-inspired approach of systematic model exploration for the learning of the value function which does not stop when a plan is available but rather prolongs search such that not only resulting optimal path is used but also extended region around the optimal path. This, in turn, improves both the efficiency and robustness of successive planning. Additionally, the effect of losing admissibility by using ML heuristic is managed by bounding ML with other admissible heuristics.
LGMay 25, 2018
Safe learning-based optimal motion planning for automated drivingZlatan Ajanovic, Bakir Lacevic, Georg Stettinger et al.
This paper presents preliminary work on learning the search heuristic for the optimal motion planning for automated driving in urban traffic. Previous work considered search-based optimal motion planning framework (SBOMP) that utilized numerical or model-based heuristics that did not consider dynamic obstacles. Optimal solution was still guaranteed since dynamic obstacles can only increase the cost. However, significant variations in the search efficiency are observed depending whether dynamic obstacles are present or not. This paper introduces machine learning (ML) based heuristic that takes into account dynamic obstacles, thus adding to the performance consistency for achieving real-time implementation.
ROMar 13, 2018
Search-based optimal motion planning for automated drivingZlatan Ajanovic, Bakir Lacevic, Barys Shyrokau et al.
This paper presents a framework for fast and robust motion planning designed to facilitate automated driving. The framework allows for real-time computation even for horizons of several hundred meters and thus enabling automated driving in urban conditions. This is achieved through several features. Firstly, a convenient geometrical representation of both the search space and driving constraints enables the use of classical path planning approach. Thus, a wide variety of constraints can be tackled simultaneously (other vehicles, traffic lights, etc.). Secondly, an exact cost-to-go map, obtained by solving a relaxed problem, is then used by A*-based algorithm with model predictive flavour in order to compute the optimal motion trajectory. The algorithm takes into account both distance and time horizons. The approach is validated within a simulation study with realistic traffic scenarios. We demonstrate the capability of the algorithm to devise plans both in fast and slow driving conditions, even when full stop is required.
OCDec 11, 2017
A novel model-based heuristic for energy optimal motion planning for automated drivingZlatan Ajanovic, Michael Stolz, Martin Horn
Predictive motion planning is the key to achieve energy-efficient driving, which is one of the main benefits of automated driving. Researchers have been studying the planning of velocity trajectories, a simpler form of motion planning, for over a decade now and many different methods are available. Dynamic programming has shown to be the most common choice due to its numerical background and ability to include nonlinear constraints and models. Although planning of an optimal trajectory is done in a systematic way, dynamic programming does not use any knowledge about the considered problem to guide the exploration and therefore explores all possible trajectories. A* is a search algorithm which enables using knowledge about the problem to guide the exploration to the most promising solutions first. Knowledge has to be represented in a form of a heuristic function, which gives an optimistic estimate of cost for transitioning to the final state, which is not a straightforward task. This paper presents a novel heuristics incorporating air drag and auxiliary power as well as operational costs of the vehicle, besides kinetic and potential energy and rolling resistance known in the literature. Furthermore, optimal cruising velocity, which depends on vehicle aerodynamic properties and auxiliary power, is derived. Results are compared for different variants of heuristic functions and dynamic programming as well.