BIO-PH | LG | Apr 26, 2024

Q-learning with temporal memory to navigate turbulence

arXiv:2404.17495v2 | 8 citations | h-index: 53 | eLife
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of odor-based navigation for autonomous agents in turbulent conditions. It is an incremental advance with domain-specific applications.

The paper tackles olfactory search in turbulent environments using reinforcement learning with a temporal memory. It demonstrates that agents can learn to navigate robustly to a target, and that the optimal strategy involves crosswind casting, similar to insect behavior.

We consider the problem of olfactory searches in a turbulent environment. We focus on agents that respond solely to odor stimuli, with no access to spatial perception or prior information about the odor. We ask whether navigation to a target can be learned robustly within a sequential decision-making framework. We develop a reinforcement learning algorithm using a small set of interpretable olfactory states and train it with realistic turbulent odor cues. By introducing a temporal memory, we demonstrate that two salient features of odor traces, discretized into a few olfactory states, are sufficient to learn navigation in a realistic odor plume. Performance is dictated by the sparse nature of turbulent odors. An optimal memory exists that ignores blanks within the plume and activates a recovery strategy outside the plume. We obtain the best performance by letting agents learn their recovery strategy and show that it is mostly crosswind casting, similar to behavior observed in flying insects. The optimal strategy is robust to substantial changes in the odor plumes, suggesting minor parameter tuning may be sufficient to adapt to different environments.
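
The idea in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' implementation. The abstract does not name the two odor features, so this sketch assumes they are the mean intensity of detections and the fraction of samples with a detection, both computed over the last `memory` samples. The names `olfactory_state`, `choose_action`, `q_update`, and `VOID`, the bin count, the normalization of intensity to [0, 1], and the four-action table in the usage lines are all hypothetical.

```python
import numpy as np

VOID = 0  # hypothetical index of the "no odor within memory" state


def olfactory_state(odor_history, memory, threshold=0.0, n_bins=3):
    """Map the last `memory` odor samples to a discrete state index.

    Assumption: the two features are the mean intensity of detections
    and the fraction of samples above `threshold`, with intensity
    normalized to [0, 1].
    """
    window = np.asarray(odor_history[-memory:], dtype=float)
    detected = window > threshold
    if not detected.any():
        # A blank longer than the memory: the agent treats itself as
        # outside the plume and falls back on its recovery strategy.
        return VOID
    intensity = window[detected].mean()
    intermittency = detected.mean()
    i_bin = min(int(intensity * n_bins), n_bins - 1)
    t_bin = min(int(intermittency * n_bins), n_bins - 1)
    return 1 + i_bin * n_bins + t_bin


def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy action selection over one row of the Q-table."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(Q[state].argmax())


def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Standard one-step Q-learning update, applied in place."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q


# Usage: 1 void state + 3x3 odor states, 4 hypothetical actions
# (for example upwind, downwind, and the two crosswind directions).
Q = np.zeros((1 + 3 * 3, 4))
rng = np.random.default_rng(0)
s = olfactory_state([0.9, 0.0, 0.0, 0.0], memory=4)
a = choose_action(Q, s, epsilon=0.1, rng=rng)
```

In this sketch, a short memory sends the agent into the void state during brief blanks inside the plume, while a long memory delays recovery after the plume is truly lost. That is the trade-off behind the optimal memory described in the abstract. Because the void state has its own row in the Q-table, the recovery behavior is learned in the same way as the in-plume behavior.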

