Maximum Entropy Differential Dynamic Programming
This work addresses the problem of local minima in optimal control for robotics and related domains, but the contribution is incremental: it combines the existing Differential Dynamic Programming and maximum entropy frameworks.
The authors tackle local minima in Differential Dynamic Programming by introducing a maximum entropy formulation with unimodal and multimodal value function parameterizations. The resulting policy explores enough to escape local minima, and it outperforms vanilla Differential Dynamic Programming on tasks with multiple local minima.
In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming algorithm and derive two variants using unimodal and multimodal value function parameterizations. By combining the maximum entropy Bellman equations with a particular approximation of the cost function, we obtain a new formulation of Differential Dynamic Programming that can escape from local minima via exploration with a multimodal policy. To demonstrate the efficacy of the proposed algorithm, we provide experimental results using four systems on tasks that are represented by cost functions with multiple local minima and compare them against vanilla Differential Dynamic Programming. Furthermore, we discuss connections with previous work on the linearly solvable stochastic control framework and its extensions in relation to compositionality.
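To make the idea of a maximum entropy policy concrete, the following is a minimal sketch and not the paper's implementation. It relies on the standard maximum entropy result that, for a state-action cost Q(u) and temperature alpha, the optimal policy is proportional to exp(-Q(u)/alpha) and the value is the soft minimum -alpha * log of the integral of exp(-Q(u)/alpha). For a quadratic Q this gives a Gaussian policy. Treating the multimodal case as a log-sum-exp over per-mode values, with mixture weights proportional to exp(-V_i/alpha), is an assumption made here for illustration. The function names `gaussian_policy`, `soft_min`, and `mode_weights` are hypothetical.

```python
import numpy as np


def gaussian_policy(Q_u, Q_uu, alpha):
    """Max-entropy policy for a quadratic Q(u) = c + Q_u^T u + 0.5 u^T Q_uu u.

    The density proportional to exp(-Q(u)/alpha) is Gaussian when Q_uu is
    positive definite.
    """
    mean = -np.linalg.solve(Q_uu, Q_u)  # minimizer of the quadratic
    cov = alpha * np.linalg.inv(Q_uu)   # exploration noise scales with alpha
    return mean, cov


def soft_min(values, alpha):
    """Soft minimum -alpha * log(sum(exp(-v / alpha))), computed stably."""
    v = np.asarray(values, dtype=float)
    m = v.min()
    return m - alpha * np.log(np.sum(np.exp(-(v - m) / alpha)))


def mode_weights(values, alpha):
    """Mixture weights proportional to exp(-V_i / alpha) (assumed composition)."""
    v = np.asarray(values, dtype=float)
    w = np.exp(-(v - v.min()) / alpha)
    return w / w.sum()


if __name__ == "__main__":
    # Unimodal case: one control dimension, illustrative numbers only.
    mean, cov = gaussian_policy(np.array([2.0]), np.array([[4.0]]), alpha=0.5)
    print("policy mean:", mean, "policy covariance:", cov)

    # Multimodal case: two candidate modes with different values.
    V = [1.0, 3.0]
    for alpha in (0.01, 1.0, 10.0):
        print(alpha, soft_min(V, alpha), mode_weights(V, alpha))
```

In this sketch, a small alpha concentrates the weight on the lowest-cost mode and the soft minimum approaches the ordinary minimum, while a larger alpha spreads probability across modes. That spreading is the kind of exploration that can help an optimizer leave a poor local minimum.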