RO CL | Sep 20, 2024

Selective Exploration and Information Gathering in Search and Rescue Using Hierarchical Learning Guided by Natural Language Input

Dimitrios Panagopoulos, Adolfo Perrusquia, Weisi Guo

arXiv:2409.13445v17.18 citations | h-index: 11 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses time-constrained search and rescue operations for emergency responders by enabling robots to incorporate human intelligence. The advance is incremental, as it builds on existing HRL and LLM methods.

The paper tackles the problem of inefficient search and rescue operations by robots in vast, transformed environments by integrating large language models with hierarchical reinforcement learning to translate human verbal inputs into actionable insights, resulting in significantly improved learning efficiency and decision-making.

In recent years, robots and autonomous systems have become increasingly integral to our daily lives, offering solutions to complex problems across various domains. Their application in search and rescue (SAR) operations, however, presents unique challenges. Comprehensively exploring the disaster-stricken area is often infeasible due to the vastness of the terrain, transformed environment, and the time constraints involved. Traditional robotic systems typically operate on predefined search patterns and lack the ability to incorporate and exploit ground truths provided by human stakeholders, which can be the key to speeding up the learning process and enhancing triage. Addressing this gap, we introduce a system that integrates social interaction via large language models (LLMs) with a hierarchical reinforcement learning (HRL) framework. The proposed system is designed to translate verbal inputs from human stakeholders into actionable RL insights and adjust its search strategy. By leveraging human-provided information through LLMs and structuring task execution through HRL, our approach not only bridges the gap between autonomous capabilities and human intelligence but also significantly improves the agent's learning efficiency and decision-making process in environments characterised by long horizons and sparse rewards.
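The information flow described in the abstract (verbal input, then a structured insight, then an adjusted search strategy inside a hierarchy) can be illustrated with a small sketch. This is not the authors' implementation: the keyword matcher `parse_hint` merely stands in for the LLM step, the four named regions, the `bonus` weight, and all function names are hypothetical, and no learning takes place here. It only shows how a parsed hint might bias a high-level choice of subgoal while a low-level routine moves toward it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical region names and centre cells on a 10x10 grid (illustrative only).
REGIONS = {
    "north-west": (2, 2),
    "north-east": (2, 7),
    "south-west": (7, 2),
    "south-east": (7, 7),
}


@dataclass
class Hint:
    region: str  # region the speaker referred to
    avoid: bool  # True if the speaker marked it as blocked or already cleared


def parse_hint(text: str) -> Optional[Hint]:
    """Keyword stand-in for the LLM step: extract a region and a polarity."""
    lowered = text.lower()
    for name in REGIONS:
        if name in lowered:
            avoid = any(w in lowered for w in ("avoid", "blocked", "cleared"))
            return Hint(region=name, avoid=avoid)
    return None  # nothing usable in this utterance


def region_priorities(hints: Iterable[Optional[Hint]], bonus: float = 1.0) -> dict:
    """Turn parsed hints into a per-region score (assumed additive bonus)."""
    scores = {name: 0.0 for name in REGIONS}
    for hint in hints:
        if hint is None:
            continue
        scores[hint.region] += -bonus if hint.avoid else bonus
    return scores


def choose_subgoal(scores: dict) -> tuple:
    """High level: pick the centre of the highest-scoring region as the subgoal."""
    best = max(scores, key=scores.get)
    return REGIONS[best]


def step_towards(pos: tuple, goal: tuple) -> tuple:
    """Low level: move one cell toward the subgoal, rows first, then columns."""
    row, col = pos
    goal_row, goal_col = goal
    if row != goal_row:
        return (row + (1 if goal_row > row else -1), col)
    if col != goal_col:
        return (row, col + (1 if goal_col > col else -1))
    return pos


hints = [
    parse_hint("Two people were seen near the north-east stairwell"),
    parse_hint("The south-west wing is blocked"),
]
subgoal = choose_subgoal(region_priorities(hints))
next_cell = step_towards((0, 0), subgoal)
```

In an actual HRL setting the scores would more plausibly shape rewards or option selection during training rather than fix the subgoal outright; the hard `max` is used here only to keep the sketch short.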
