LG, SY, NA, OC | Jul 23, 2025

ZORMS-LfD: Learning from Demonstrations with Zeroth-Order Random Matrix Search

arXiv:2507.17096v1
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in learning from demonstrations for robotics and control systems, offering a more efficient and more broadly applicable method. It appears incremental, however, because it builds on existing optimization approaches.

The paper tackles the problem of learning costs, constraints, and dynamics from expert demonstrations in constrained optimal control without requiring smoothness or gradients. It reports a reduction of over 80% in compute time on unconstrained continuous-time benchmarks relative to first-order methods, and it outperforms the gradient-free Nelder-Mead method on constrained continuous-time benchmarks.

We propose Zeroth-Order Random Matrix Search for Learning from Demonstrations (ZORMS-LfD). ZORMS-LfD enables the costs, constraints, and dynamics of constrained optimal control problems, in both continuous and discrete time, to be learned from expert demonstrations without requiring smoothness of the learning-loss landscape. In contrast, existing state-of-the-art first-order methods require the existence and computation of gradients of the costs, constraints, dynamics, and learning loss with respect to states, controls, and/or parameters. Most existing methods are also tailored to discrete time, with constrained problems in continuous time receiving only cursory attention. We demonstrate that ZORMS-LfD matches or surpasses the performance of state-of-the-art methods in terms of both learning loss and compute time across a variety of benchmark problems. On unconstrained continuous-time benchmark problems, ZORMS-LfD achieves similar loss performance to state-of-the-art first-order methods with a reduction of over 80% in compute time. On constrained continuous-time benchmark problems, where there is no specialized state-of-the-art method, ZORMS-LfD is shown to outperform the commonly used gradient-free Nelder-Mead optimization method.
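To make the "zeroth-order" idea concrete, the following is a minimal sketch of a generic two-point random-search update that uses only loss evaluations and no gradients. It is not the ZORMS-LfD algorithm from the paper: the function names, the Gaussian search direction, the smoothing radius `mu`, the step size `step`, and the L1 stand-in loss with its `target` parameters are all assumptions chosen for illustration. In the paper's setting the loss would instead compare trajectories from a solved optimal control problem against expert demonstrations.

```python
import numpy as np

def zeroth_order_step(loss, theta, rng, mu=1e-2, step=1e-1):
    """One two-point random-search update using only loss evaluations."""
    u = rng.standard_normal(theta.shape)  # random search direction (assumed Gaussian)
    # Finite-difference estimate of the directional derivative along u.
    slope = (loss(theta + mu * u) - loss(theta)) / mu
    return theta - step * slope * u

def demo(num_iters=2000, seed=0):
    """Run the update on a nonsmooth stand-in loss; return (initial, best) loss."""
    rng = np.random.default_rng(seed)
    target = np.array([1.0, -2.0, 0.5])  # hypothetical "expert" parameters
    loss = lambda th: float(np.sum(np.abs(th - target)))  # nonsmooth L1 loss
    theta = np.zeros(3)
    initial = loss(theta)
    best = initial
    for _ in range(num_iters):
        theta = zeroth_order_step(loss, theta, rng)
        best = min(best, loss(theta))
    return initial, best

if __name__ == "__main__":
    initial, best = demo()
    print(f"initial loss: {initial:.3f}, best loss seen: {best:.3f}")
```

The L1 loss is not differentiable at its minimizer, yet the update still makes progress because it never asks for a gradient. This is the property the abstract relies on when it drops the smoothness requirement on the learning-loss landscape.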
