LG, AI | Mar 23, 2025

Adaptive Multi-Fidelity Reinforcement Learning for Variance Reduction in Engineering Design Optimization

arXiv:2503.18229v1 | 1 citation | h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses variance reduction in engineering design optimization using multi-fidelity RL, offering an incremental improvement over hierarchical methods.

The paper tackles high variance in policy learning for multi-fidelity reinforcement learning, which arises when the models have heterogeneous error distributions. It proposes an adaptive framework that dynamically draws on multiple non-hierarchical low-fidelity models. In an octocopter design optimization problem, the approach shows substantial variance reduction and improved convergence, and it removes the need to manually tune model usage schedules.

Multi-fidelity Reinforcement Learning (RL) frameworks efficiently utilize computational resources by integrating analysis models of varying accuracy and costs. The prevailing methodologies, characterized by transfer learning, human-inspired strategies, control variate techniques, and adaptive sampling, predominantly depend on a structured hierarchy of models. However, this reliance on a model hierarchy can exacerbate variance in policy learning when the underlying models exhibit heterogeneous error distributions across the design space. To address this challenge, this work proposes a novel adaptive multi-fidelity RL framework, in which multiple heterogeneous, non-hierarchical low-fidelity models are dynamically leveraged alongside a high-fidelity model to efficiently learn a high-fidelity policy. Specifically, low-fidelity policies and their experience data are adaptively used for efficient targeted learning, guided by their alignment with the high-fidelity policy. The effectiveness of the approach is demonstrated in an octocopter design optimization problem, utilizing two low-fidelity models alongside a high-fidelity simulator. The results demonstrate that the proposed approach substantially reduces variance in policy learning, leading to improved convergence and consistent high-quality solutions relative to traditional hierarchical multi-fidelity RL methods. Moreover, the framework eliminates the need for manually tuning model usage schedules, which can otherwise introduce significant computational overhead. This positions the framework as an effective variance-reduction strategy for multi-fidelity RL, while also mitigating the computational and operational burden of manual fidelity scheduling.
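The abstract does not say how alignment with the high-fidelity policy is measured or how it steers model usage. As a minimal sketch of the general idea only, the Python example below assumes alignment is the mean cosine similarity between actions on sampled design states, and that usage weights come from a softmax over those scores. The metric, the temperature, and all names (`alignment`, `usage_weights`, the toy policies) are hypothetical illustrations, not the authors' method.

```python
import math
import random


def alignment(policy_a, policy_b, states):
    """Mean cosine similarity between two policies' actions (assumed metric)."""
    total = 0.0
    for s in states:
        a, b = policy_a(s), policy_b(s)
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        total += dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0
    return total / len(states)


def usage_weights(scores, temperature=0.1):
    """Softmax over alignment scores: better-aligned models get more usage."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical stand-in policies mapping a 2-D design state to a 2-D action.
def high_fidelity_policy(s):
    return [s[0], s[1]]


def low_fidelity_a(s):
    return [s[0] + 0.05, s[1]]  # small bias: closely aligned


def low_fidelity_b(s):
    return [-s[1], s[0]]  # rotated 90 degrees: orthogonal actions


rng = random.Random(0)
states = [[rng.uniform(0.5, 1.5), rng.uniform(0.5, 1.5)] for _ in range(100)]

scores = [
    alignment(p, high_fidelity_policy, states)
    for p in (low_fidelity_a, low_fidelity_b)
]
weights = usage_weights(scores)
print("alignment scores:", scores)
print("usage weights:", weights)
```

In this toy setup the closely aligned model receives nearly all of the usage weight, while the orthogonal one is effectively ignored. Recomputing the scores as the high-fidelity policy changes would make the usage adaptive rather than fixed by a manual schedule.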
