RO · AI · Mar 28, 2022

Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning

arXiv:2203.14749v2 · 25 citations · h-index: 34
AI Analysis

This paper addresses safety-critical navigation for drones in unknown, cluttered settings and represents an incremental improvement in risk-aware decision-making.

The paper tackles the problem of enabling nano drones to navigate cluttered environments safely. It uses distributional reinforcement learning to learn policies whose risk-tendency adapts to the situation, and reports superior performance over risk-neutral and risk-averse baselines in both simulation and real-world tests.

The ability to assess risk and make risk-aware decisions is essential to applying reinforcement learning to safety-critical robots like drones. In this paper, we investigate a specific case where a nano quadcopter robot learns to navigate an a priori unknown cluttered environment under partial observability. We present a distributional reinforcement learning framework to generate adaptive risk-tendency policies. Specifically, we propose to use the lower tail conditional variance of the learnt return distribution as an intrinsic uncertainty estimate, and to use exponentially weighted average forecasting (EWAF) to adapt the risk-tendency in accordance with the estimated uncertainty. In simulation and real-world empirical results, we show that (1) the most effective risk-tendency varies across states, and (2) the agent with adaptive risk-tendency achieves superior performance compared to risk-neutral or risk-averse policy baselines.
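The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions made for illustration: the return distribution is represented by a plain list of quantile samples, the lower tail is taken as the samples below the median with deviations measured about the median, the risk-tendency is expressed as a CVaR level `alpha`, and the two-expert loss rule and the `threshold` and `eta` parameters in the hypothetical `EWAFRiskTendency` class are invented here. The abstract does not specify any of these choices.

```python
import math


def lower_tail_variance(quantiles):
    """Variance of the below-median samples, measured about the median.

    Assumption: this is one plausible reading of "lower tail conditional
    variance"; the paper's exact definition may differ.
    """
    qs = sorted(quantiles)
    n = len(qs)
    median = qs[n // 2] if n % 2 else 0.5 * (qs[n // 2 - 1] + qs[n // 2])
    tail = [q for q in qs if q < median]
    if not tail:
        return 0.0
    return sum((q - median) ** 2 for q in tail) / len(tail)


def cvar(quantiles, alpha):
    """Mean of the lowest alpha-fraction of samples (alpha=1 is the plain mean)."""
    qs = sorted(quantiles)
    k = max(1, math.ceil(alpha * len(qs)))
    return sum(qs[:k]) / k


class EWAFRiskTendency:
    """Hypothetical two-expert exponentially weighted average forecaster.

    Expert 0 proposes a risk-averse CVaR level, expert 1 a risk-neutral one.
    The loss rule below is an illustrative assumption, not the paper's.
    """

    def __init__(self, expert_alphas=(0.1, 1.0), eta=1.0, threshold=1.0):
        self.expert_alphas = list(expert_alphas)
        self.eta = eta
        self.threshold = threshold
        self.cum_loss = [0.0, 0.0]

    def alpha(self):
        # Subtract the smallest loss before exponentiating for numerical stability.
        base = min(self.cum_loss)
        weights = [math.exp(-self.eta * (loss - base)) for loss in self.cum_loss]
        total = sum(weights)
        return sum(w * a for w, a in zip(weights, self.expert_alphas)) / total

    def update(self, uncertainty):
        # High uncertainty penalises the risk-neutral expert,
        # low uncertainty penalises the risk-averse expert.
        excess = uncertainty - self.threshold
        self.cum_loss[0] += max(0.0, -excess)
        self.cum_loss[1] += max(0.0, excess)
        return self.alpha()
```

In use, an agent would compute `lower_tail_variance` on the quantile samples for the current state, pass it to `update`, and then rank actions by `cvar(quantiles, alpha)` with the returned `alpha`. Under these assumptions, sustained high uncertainty pushes `alpha` toward the risk-averse level, and low uncertainty lets it drift back toward the risk-neutral mean.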

Code Implementations: 1 repo