Shape Your Body: Value Gradients for Multi-Embodiment Robot Design
For robot designers, this method reduces the cost of co-design by reusing a single pre-trained value function across many embodiments instead of running a new reinforcement learning co-design loop for each robot. The same value function supports both optimizing and analyzing designs.
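The paper proposes using pre-trained, embodiment-aware value functions as frozen differentiable surrogates, so that robot designs can be optimized via value gradients rather than through a separate co-design loop per robot. The approach is evaluated in settings ranging from perturbed single robots to held-out robots across morphology classes, with single models trained on up to 50 robots and design spaces of over 1100 continuous embodiment parameters. The results indicate that value gradients can optimize complete embodiments and identify performance-limiting design and control parameters.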
We propose to turn generalist multi-embodiment value functions into reusable models for robot design. Instead of running a new reinforcement learning co-design loop for each robot, we first train an embodiment-aware policy and value function across many robot designs. After training, the frozen value function is used as a differentiable surrogate to optimize candidate embodiments through value gradients. We evaluate our approach across different robot design settings, from perturbed single robots to held-out robots across morphology classes, with single models trained on up to 50 robots and design spaces of over 1100 continuous embodiment parameters. Beyond optimizing complete embodiments, we show that value gradients can identify performance-limiting design and control parameters, enabling both the optimization and the analysis of new robot designs.
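The value-gradient step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny random tanh network stands in for a frozen, pre-trained value function V(s, e), and the names `value`, `mean_value_grad`, `optimize_embodiment`, the dimensions, the box bounds, and the step size are hypothetical rather than taken from the paper. The sketch performs projected gradient ascent on the embodiment parameters e while the network weights stay fixed, then ranks parameters by gradient magnitude as a rough proxy for which ones limit performance.

```python
import math
import random

# Hypothetical stand-in for a frozen value function V(state, embodiment).
# The weights are random, not trained; they exist only so the sketch runs.
STATE_DIM, EMB_DIM, HIDDEN = 3, 4, 8
rng = random.Random(0)
W_s = [[rng.gauss(0, 0.5) for _ in range(STATE_DIM)] for _ in range(HIDDEN)]
W_e = [[rng.gauss(0, 0.5) for _ in range(EMB_DIM)] for _ in range(HIDDEN)]
b = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]
w_out = [rng.gauss(0, 0.5) for _ in range(HIDDEN)]


def hidden(state, emb):
    """Hidden activations tanh(W_s s + W_e e + b)."""
    return [
        math.tanh(
            sum(w * s for w, s in zip(W_s[j], state))
            + sum(w * e for w, e in zip(W_e[j], emb))
            + b[j]
        )
        for j in range(HIDDEN)
    ]


def value(state, emb):
    """Scalar value estimate for one state and one embodiment vector."""
    return sum(w * h for w, h in zip(w_out, hidden(state, emb)))


def value_grad(state, emb):
    """Analytic dV/de: sum_j w_out[j] * (1 - h_j^2) * W_e[j][i]."""
    h = hidden(state, emb)
    return [
        sum(w_out[j] * (1.0 - h[j] ** 2) * W_e[j][i] for j in range(HIDDEN))
        for i in range(EMB_DIM)
    ]


def mean_value(states, emb):
    return sum(value(s, emb) for s in states) / len(states)


def mean_value_grad(states, emb):
    """Gradient of the state-averaged value with respect to the embodiment."""
    grad = [0.0] * EMB_DIM
    for s in states:
        for i, g in enumerate(value_grad(s, emb)):
            grad[i] += g / len(states)
    return grad


def optimize_embodiment(states, emb, lower, upper, lr=0.05, steps=200):
    """Projected gradient ascent on e; the value function itself is never updated."""
    emb = list(emb)
    for _ in range(steps):
        grad = mean_value_grad(states, emb)
        emb = [
            min(max(e + lr * g, lo), hi)
            for e, g, lo, hi in zip(emb, grad, lower, upper)
        ]
    return emb


# Hypothetical batch of states and an initial candidate design.
states = [[rng.uniform(-1, 1) for _ in range(STATE_DIM)] for _ in range(16)]
initial = [0.0] * EMB_DIM
lower, upper = [-1.0] * EMB_DIM, [1.0] * EMB_DIM

optimized = optimize_embodiment(states, initial, lower, upper)
print("mean value before:", mean_value(states, initial))
print("mean value after: ", mean_value(states, optimized))

# Rank parameters by gradient magnitude at the initial design (sensitivity proxy).
grad0 = mean_value_grad(states, initial)
ranking = sorted(range(EMB_DIM), key=lambda i: -abs(grad0[i]))
print("parameters ordered by |dV/de|:", ranking)
```

In a real setting the analytic gradient would typically come from automatic differentiation of the trained network, and the bounds would encode the physical limits of each design parameter; both are left abstract here.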