From Noise to Control: Parameterized Diffusion Policies
For roboticists, PDP provides a way to steer diffusion-based policies precisely without retraining, addressing the need for efficient adaptation in dynamic environments.
PDP conditions diffusion policies on parameters in a learned behavior manifold, allowing smooth interpolation between behaviors and adaptation to new constraints without weight updates. Compared to standard diffusion policies, it significantly improves adaptation performance on complex multimodal benchmarks in both simulated and real-robot experiments.
We propose Parameterized Diffusion Policy (PDP), a framework for learning diffusion policies conditioned on low-dimensional, continuous parameters embedded in a learned behavior manifold. By constructing this manifold so that distances between latent representations reflect the semantic similarity between physical trajectories, we transform diffusion from a mechanism for stochastic diversity into a precise and optimizable tool for behavior steering. Our approach enables smooth interpolation between known strategies and efficient adaptation to novel constraints without updating policy weights. Compared to standard diffusion policies, PDP significantly improves adaptation performance on complex multimodal benchmarks in both simulated and real-robot experiments, particularly in scenarios that require synthesizing novel behaviors.
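The steering idea in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the two-dimensional latent `z = (bend, goal)`, the closed-form `decode` that stands in for the learned behavior manifold, the simplified denoising loop in `sample` that stands in for a trained and frozen diffusion policy, and the finite-difference optimizer in `adapt`. It shows only the two operations the abstract names: interpolating between latents, and adapting to a new constraint by optimizing the latent while the sampler itself is left untouched.

```python
import random

HORIZON = 8  # number of waypoints in the toy 1-D trajectory


def decode(z):
    """Hypothetical stand-in for the learned manifold: z = (bend, goal) -> mean trajectory."""
    bend, goal = z
    out = []
    for t in range(HORIZON):
        s = t / (HORIZON - 1)
        out.append(goal * s + bend * s * (1.0 - s))
    return out


def sample(z, steps=20, seed=0):
    """Toy reverse process: start from Gaussian noise and denoise toward decode(z).

    A real policy would call a trained, frozen denoising network here.
    The seed is fixed so the output is a deterministic function of z.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(HORIZON)]
    mean = decode(z)
    for _ in range(steps):
        # Each step halves the remaining distance to the z-conditioned mean.
        x = [xi + 0.5 * (mi - xi) for xi, mi in zip(x, mean)]
    return x


def interpolate(z_a, z_b, alpha):
    """Linear interpolation between two latent parameters."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]


def adapt(z0, cost_fn, lr=0.5, iters=500, eps=1e-4):
    """Minimize cost_fn over z with central finite differences.

    Only the latent z changes; the sampler is never modified.
    """
    z = list(z0)
    for _ in range(iters):
        grad = []
        for i in range(len(z)):
            z_plus, z_minus = z[:], z[:]
            z_plus[i] += eps
            z_minus[i] -= eps
            grad.append((cost_fn(z_plus) - cost_fn(z_minus)) / (2.0 * eps))
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


def constraint_cost(z):
    """Hypothetical new constraint: end at height 1.0 and pass through 0.8 at waypoint 3."""
    traj = sample(z)
    return (traj[-1] - 1.0) ** 2 + (traj[3] - 0.8) ** 2


# Adapt to the new constraint by searching over z only.
z_star = adapt([0.0, 0.0], constraint_cost)
```

In this sketch the search space has two dimensions instead of the full set of policy weights, which is the property that makes latent-space adaptation cheap. A gradient-free or autodiff-based optimizer could replace the finite-difference loop; the choice here is only for self-containment.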