RO · Jul 7, 2018

Deep-Reinforcement-Learning for Gliding and Perching Bodies

Guido Novati, Lakshminarayanan Mahadevan, Petros Koumoutsakos

arXiv:1807.03671v18.010 citations

Originality: Incremental advance

AI Analysis

This work studies controlled gliding, an energetically efficient mode of transportation for natural and human-powered fliers, and offers a promising framework for developing mechanical devices that exploit complex flow environments. The advance is incremental: it applies existing D-RL methods to a new domain.

The paper identifies optimal gliding and landing strategies for a controlled elliptical body using deep reinforcement learning. The learned policies glide to a predetermined location with either minimum energy expenditure or fastest arrival time, and they prove more robust than model-based optimal control at a modest additional computational cost.

Controlled gliding is one of the most energetically efficient modes of transportation for natural and human powered fliers. Here we demonstrate that gliding and landing strategies with different optimality criteria can be identified through deep reinforcement learning without explicit knowledge of the underlying physics. We combine a two dimensional model of a controlled elliptical body with deep reinforcement learning (D-RL) to achieve gliding with either minimum energy expenditure, or fastest time of arrival, at a predetermined location. In both cases the gliding trajectories are smooth, although energy/time optimal strategies are distinguished by small/high frequency actuations. We examine the effects of the ellipse's shape and weight on the optimal policies for controlled gliding. Surprisingly, we find that the model-free reinforcement learning leads to more robust gliding than model-based optimal control strategies with a modest additional computational cost. We also demonstrate that the gliders with D-RL can generalize their strategies to reach the target location from previously unseen starting positions. The model-free character and robustness of D-RL suggests a promising framework for developing mechanical devices capable of exploiting complex flow environments.
