LG · AI · Jul 13, 2024

Deep deterministic policy gradient with symmetric data augmentation for lateral attitude tracking control of a fixed-wing aircraft

arXiv:2407.11077v2 · h-index: 30
AI Analysis

This work addresses sample efficiency in reinforcement learning for fixed-wing aircraft control, representing an incremental improvement with domain-specific applications.

The paper tackles sample-efficient offline reinforcement learning for aircraft lateral attitude tracking. It proposes a symmetric data augmentation method integrated with DDPG, and flight control simulations show accelerated policy convergence when the augmented samples are used.

The symmetry of dynamical systems can be exploited for state-transition prediction and to facilitate control policy optimization. This paper leverages system symmetry to develop sample-efficient offline reinforcement learning (RL) approaches. Under the symmetry assumption for a Markov Decision Process (MDP), a symmetric data augmentation method is proposed. The augmented samples are integrated into the dataset of Deep Deterministic Policy Gradient (DDPG) to enhance its coverage rate of the state-action space. Furthermore, sample utilization efficiency is improved by introducing a second critic trained on the augmented samples, resulting in a dual-critic structure. The aircraft's model is verified to be symmetric, and flight control simulations demonstrate accelerated policy convergence when augmented samples are employed.
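
The augmentation idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the symmetry is modelled as a sign flip of selected state and action components, the reward is taken to be invariant under that flip, and `toy_step`, `augment_symmetric`, `A` and `B` are hypothetical names for a toy linear model, not the aircraft model.

```python
import numpy as np

# Hypothetical toy model (not the aircraft): s' = A s + B a, reward = -||s||^2.
# This model is symmetric under (s, a) -> (-s, -a), which stands in for the
# symmetry assumption on the MDP.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.5]])


def toy_step(s, a):
    """Advance a batch of states s (N, 2) with actions a (N, 1) by one step."""
    s_next = s @ A.T + a @ B.T
    r = -np.sum(s**2, axis=1)
    return s_next, r


def augment_symmetric(s, a, r, s_next, state_sign, action_sign):
    """Append the mirrored copy of each transition to the batch.

    state_sign and action_sign are arrays of +1/-1 that define the assumed
    reflection; the reward is assumed unchanged by the reflection.
    """
    s_m = s * state_sign
    a_m = a * action_sign
    s_next_m = s_next * state_sign
    return (
        np.concatenate([s, s_m], axis=0),
        np.concatenate([a, a_m], axis=0),
        np.concatenate([r, r], axis=0),
        np.concatenate([s_next, s_next_m], axis=0),
    )
```

In this sketch each recorded transition yields one extra transition without any additional interaction with the system, which is how augmentation can widen coverage of the state-action space. Whether the mirrored samples are valid depends entirely on the symmetry actually holding for the system at hand, so it would need to be checked first, as the paper does for its aircraft model.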

