LG AI SY · Jun 2, 2025

Bregman Centroid Guided Cross-Entropy Method

Yuliang Gu, Hongpeng Cao, Marco Caccamo, Naira Hovakimyan

arXiv:2506.02205v2 · 4.1 · h-index: 54

Originality: Highly original

AI Analysis

The method is a lightweight enhancement for trajectory optimization in model-based reinforcement learning that addresses a known bottleneck: premature convergence of unimodal samplers in multimodal landscapes.

The paper tackles premature convergence of the Cross-Entropy Method (CEM) in model-based reinforcement learning by proposing Bregman Centroid Guided CEM, which the authors report improves convergence and solution quality on synthetic benchmarks, a cluttered navigation task, and full MBRL pipelines.

The Cross-Entropy Method (CEM) is a widely adopted trajectory optimizer in model-based reinforcement learning (MBRL), but its unimodal sampling strategy often leads to premature convergence in multimodal landscapes. In this work, we propose Bregman Centroid Guided CEM ($\mathcal{BC}$-EvoCEM), a lightweight enhancement to ensemble CEM that leverages $\textit{Bregman centroids}$ for principled information aggregation and diversity control. $\textbf{$\mathcal{BC}$-EvoCEM}$ computes a performance-weighted Bregman centroid across CEM workers and updates the least contributing ones by sampling within a trust region around the centroid. Leveraging the duality between Bregman divergences and exponential family distributions, we show that $\textbf{$\mathcal{BC}$-EvoCEM}$ integrates seamlessly into standard CEM pipelines with negligible overhead. Empirical results on synthetic benchmarks, a cluttered navigation task, and full MBRL pipelines demonstrate that $\textbf{$\mathcal{BC}$-EvoCEM}$ enhances both convergence and solution quality, providing a simple yet effective upgrade for CEM.
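The centroid-and-reset step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions not stated above: each CEM worker is a diagonal Gaussian, the centroid is taken as the weighted average in expectation (moment) parameters (which for an exponential family minimizes the weighted sum of KL divergences from the workers to the centroid), performance weights are a softmax over scores where higher is better, the "least contributing" workers are those with the smallest weights, and the trust region is a box of half-width `radius` centroid standard deviations. All function and parameter names are hypothetical.

```python
import numpy as np


def bregman_centroid_diag_gaussian(means, variances, weights):
    """Weighted centroid of diagonal Gaussians in expectation parameters.

    Averaging E[x] and E[x^2] across workers gives the Gaussian q that
    minimizes sum_i w_i * KL(p_i || q) (moment matching).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    first = w @ means                       # weighted E[x], shape (D,)
    second = w @ (variances + means ** 2)   # weighted E[x^2], shape (D,)
    return first, second - first ** 2


def softmax_weights(scores, temperature=1.0):
    """Performance weights (assumption: higher score = better worker)."""
    s = np.asarray(scores, dtype=float) / temperature
    e = np.exp(s - s.max())                 # shift for numerical stability
    return e / e.sum()


def reset_weakest_workers(means, variances, scores, n_reset=1, radius=1.0, rng=None):
    """Re-sample the lowest-weight workers inside a box around the centroid."""
    rng = np.random.default_rng() if rng is None else rng
    means = np.array(means, dtype=float)        # copies, inputs stay untouched
    variances = np.array(variances, dtype=float)
    w = softmax_weights(scores)
    c_mean, c_var = bregman_centroid_diag_gaussian(means, variances, w)
    weakest = np.argsort(w)[:n_reset]           # assumed "least contributing"
    for k in weakest:
        step = rng.uniform(-1.0, 1.0, size=c_mean.shape)
        means[k] = c_mean + radius * np.sqrt(c_var) * step
        variances[k] = c_var
    return means, variances, (c_mean, c_var)
```

In a full ensemble-CEM loop, a step like `reset_weakest_workers` would presumably run between ordinary per-worker CEM updates, so that well-performing workers keep exploring their own modes while poorly performing ones are pulled back toward the aggregated estimate. How often to reset and how to set `radius` are tuning choices this sketch leaves open.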
