LG | DATA-AN | Apr 24, 2023

Parallel bootstrap-based on-policy deep reinforcement learning for continuous flow control applications

arXiv:2304.12330v3 | 3 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers in fluid dynamics and reinforcement learning, enabling faster and more stable training of control models. The advance is incremental, since it builds on existing on-policy methods.

The paper addresses the challenge of using parallel environments efficiently in on-policy deep reinforcement learning for flow control, where massively parallel transition collection can break theoretical assumptions. It proposes a parallelism pattern based on partial-trajectory buffers terminated by a return bootstrapping step, which allows flexible parallelism while preserving the on-policiness of the updates, and illustrates the approach on a CPU-intensive continuous flow control problem.

The coupling of deep reinforcement learning to numerical flow control problems has recently received considerable attention, leading to groundbreaking results and opening new perspectives for the domain. Due to the usually high computational cost of fluid dynamics solvers, the use of parallel environments during the learning process is an essential ingredient for attaining efficient control in a reasonable time. Yet, most of the deep reinforcement learning literature for flow control relies on on-policy algorithms, for which massively parallel transition collection may break theoretical assumptions and lead to suboptimal control models. To overcome this issue, we propose a parallelism pattern relying on partial-trajectory buffers terminated by a return bootstrapping step, allowing a flexible use of parallel environments while preserving the on-policiness of the updates. This approach is illustrated on a CPU-intensive continuous flow control problem from the literature.
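
The return bootstrapping step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `bootstrapped_returns`, the parameter `last_value` (assumed to be a critic estimate of the value of the state reached when the buffer was cut) and the discount factor `gamma` are hypothetical names introduced here for illustration. The sketch only shows the general idea of closing a partial trajectory with a value estimate instead of waiting for the episode to end.

```python
from typing import List


def bootstrapped_returns(
    rewards: List[float],
    last_value: float,
    gamma: float = 0.99,
    truncated: bool = True,
) -> List[float]:
    """Discounted returns for a (possibly partial) trajectory.

    Assumption: when the trajectory was cut before the episode ended
    (truncated=True), the missing tail of the return is approximated by
    last_value, a critic estimate of the state following the last stored
    transition. When the episode really terminated, the tail is zero.
    """
    # Seed the backward recursion with the bootstrap value or with zero.
    running = last_value if truncated else 0.0
    returns = [0.0] * len(rewards)
    # Walk the buffer backwards: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Partial trajectory of three transitions, closed by a value estimate of 10.
partial = bootstrapped_returns([1.0, 1.0, 1.0], last_value=10.0, gamma=0.5)
# Same rewards, but treated as a complete episode (no bootstrapping).
complete = bootstrapped_returns([1.0, 1.0, 1.0], last_value=10.0, gamma=0.5,
                                truncated=False)
print(partial)   # [3.0, 4.0, 6.0]
print(complete)  # [1.75, 1.5, 1.0]
```

In a setup of this kind, each parallel environment could contribute a fixed-length slice of transitions to the update buffer, with the bootstrap value standing in for the part of the episode that has not been simulated yet.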

