RO · Feb 10, 2021

PLGRIM: Hierarchical Value Learning for Large-scale Exploration in Unknown Environments

arXiv:2102.05633v2 · 54 citations
AI Analysis

This work addresses scalable exploration for autonomous robots in hazardous environments, representing an incremental improvement over existing methods.

The paper tackles the computational challenge of belief space planning for autonomous robot exploration in large unknown environments by proposing PLGRIM, a hierarchical value learning framework that bridges local risk-aware resiliency and global reward-seeking objectives, validated with simulations and physical robots in Martian-analog lava tubes.

For an autonomous robot to explore an unknown environment efficiently, it must account for uncertainty in sensor measurements, hazard assessment, localization, and motion execution. Making decisions for maximal reward in a stochastic setting requires value learning and policy construction over a belief space, i.e., a probability distribution over all possible robot-world states. However, belief space planning in a large spatial environment over long temporal horizons suffers from severe computational challenges. Moreover, constructed policies must safely adapt to unexpected changes in the belief at runtime. This work proposes a scalable value learning framework, PLGRIM (Probabilistic Local and Global Reasoning on Information roadMaps), that bridges the gap between (i) local, risk-aware resiliency and (ii) global, reward-seeking mission objectives. Leveraging hierarchical belief space planners with information-rich graph structures, PLGRIM addresses large-scale exploration problems while providing locally near-optimal coverage plans. We validate our proposed framework with high-fidelity dynamic simulations in diverse environments and on physical robots in Martian-analog lava tubes.
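To make the idea of deciding over a belief concrete, here is a minimal sketch of a discrete Bayes update followed by a one-step expected-reward choice. It is not PLGRIM's algorithm. The two-state world (safe or hazard), the sensor likelihoods, the reward values, and all function names are assumptions made up for illustration.

```python
from typing import Dict, List


def bayes_update(belief: List[float], likelihood: List[float]) -> List[float]:
    """Posterior over states given P(observation | state) for each state."""
    weights = [b * l for b, l in zip(belief, likelihood)]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [w / total for w in weights]


def expected_reward(belief: List[float], reward_per_state: List[float]) -> float:
    """Reward of one action averaged over the belief."""
    return sum(b * r for b, r in zip(belief, reward_per_state))


def greedy_action(belief: List[float], rewards: Dict[str, List[float]]) -> str:
    """Action with the highest one-step expected reward under the belief."""
    return max(rewards, key=lambda a: expected_reward(belief, rewards[a]))


# Hypothetical setup: state 0 = cell ahead is safe, state 1 = cell is a hazard.
prior = [0.5, 0.5]
looks_clear = [0.9, 0.2]  # assumed P("looks clear" | safe), P("looks clear" | hazard)
rewards = {"advance": [1.0, -10.0], "stay": [0.0, 0.0]}

belief = prior
for step in range(3):
    print(step, [round(p, 3) for p in belief], greedy_action(belief, rewards))
    belief = bayes_update(belief, looks_clear)
```

With these assumed numbers the robot holds position until repeated "looks clear" readings push the hazard probability low enough that advancing has positive expected reward. A full belief space planner would look ahead over many steps instead of one, which is where the computational cost described in the abstract comes from.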
