Stepan Dergachev

h-index2

3papers

14citations

3 Papers

5.5ROJul 11

CoRL-MPPI: Enhancing MPPI With Learnable Behaviours For Efficient And Provably-Safe Multi-Robot Collision Avoidance

Stepan Dergachev, Artem Pshenitsyn, Aleksandr Panov et al.

Decentralized collision avoidance is a core challenge for scalable multi-robot systems. A promising approach to this problem is Model Predictive Path Integral (MPPI) control - a framework that naturally handles arbitrary motion models and provides strong theoretical guarantees. Still, in practice an MPPI-based controller may produce suboptimal trajectories because its performance relies heavily on uninformed random sampling. We introduce CoRL-MPPI, a fusion of Cooperative Reinforcement Learning and MPPI that addresses this limitation. We train an action policy, approximated by a deep neural network, in simulation to learn local cooperative collision-avoidance behaviors. This learned policy is then embedded into the MPPI framework to guide its sampling distribution, biasing it toward more intelligent and cooperative actions in scenarios that may differ substantially from those used during training. Moreover, CoRL-MPPI preserves the theoretical guarantees of regular MPPI. We evaluate our approach in dense, dynamic setups against classical and learning-based state-of-the-art baselines. Our results demonstrate that CoRL-MPPI outperforms competing methods and significantly improves navigation efficiency, measured by success rate and delay, as well as safety, enabling agile and robust multi-robot navigation.

7.0MAMar 25

Relaxing Constraints in Anonymous Multi Agent Path Finding for Large Agents

Stepan Dergachev, Dmitry Avdeev

The study addressed the problem of Anonymous Multi-Agent Path-finding (AMAPF). Unlike the classical formulation, where the assignment of agents to goals is fixed, in the anonymous MAPF setting it is irrelevant which agent reaches specific goal, provided that all goals are occupied. Most existing multi-agent pathfinding algorithms rely on a discrete representation of the environment (e.g., square grids) and do not account for the sizes of agents. This limits their applicability in real-world scenarios, such as trajectory planning for mobile robots in warehouses. Conversely, methods operating in continuous space typically impose substantial restrictions on the input data, such as constraints on the distances between initial and goal positions or between start/goal positions and obstacles. In this work, we considered one of the AMAPF algorithms designed for continuous space, where agents are modeled as disks of equal size. The algorithm requires a strict minimum separation of $4$ agent radii between any start/goal positions. Proposed a modification aimed at relaxing the constraints and reduce this limit from $4$ to $2\sqrt{3}$. We theoretically demonstrated that the proposed enhancements preserve original theoretical properties, including the guarantee that all agents will eventually achieve their goals safely and without collisions.

2.3MAAug 3, 2020Code

A Combination of Theta*, ORCA and Push and Rotate for Multi-agent Navigation

Stepan Dergachev, Konstantin Yakovlev, Ryhor Prakapovich

We study the problem of multi-agent navigation in static environments when no centralized controller is present. Each agent is controlled individually and relies on three algorithmic components to achieve its goal while avoiding collisions with the other agents and the obstacles: i) individual path planning which is done by Theta* algorithm; ii) collision avoidance while path following which is performed by ORCA* algorithm; iii) locally-confined multi-agent path planning done by Push and Rotate algorithm. The latter component is crucial to avoid deadlocks in confined areas, such as narrow passages or doors. We describe how the suggested components interact and form a coherent navigation pipeline. We carry out an extensive empirical evaluation of this pipeline in simulation. The obtained results clearly demonstrate that the number of occurring deadlocks significantly decreases enabling more agents to reach their goals compared to techniques that rely on collision-avoidance only and do not include multi-agent path planning component