LG RO
Dec 15, 2025

Constrained Policy Optimization via Sampling-Based Weight-Space Projection

arXiv:2512.13788v1
h-index: 1
Originality: Highly original
AI Analysis

This work addresses safety-critical learning for autonomous systems by enabling safe policy optimization without relying on constraint gradients. It is incremental in the sense that it builds on existing constrained optimization methods.

The paper tackles the problem of learning policies that must satisfy unknown safety constraints during training. It proposes SCPO, a sampling-based weight-space projection method that keeps all intermediate policies safe, given a safe initialization and feasible projections, without needing gradient access to the constraints. The reported experiments, on constrained control and on regression with harmful supervision, show consistent rejection of unsafe updates, maintained feasibility, and meaningful objective improvement.

Safety-critical learning requires policies that improve performance without leaving the safe operating regime. We study constrained policy learning where model parameters must satisfy unknown, rollout-based safety constraints. We propose SCPO, a sampling-based weight-space projection method that enforces safety directly in parameter space without requiring gradient access to the constraint functions. Our approach constructs a local safe region by combining trajectory rollouts with smoothness bounds that relate parameter changes to shifts in safety metrics. Each gradient update is then projected via a convex SOCP, producing a safe first-order step. We establish a safe-by-induction guarantee: starting from any safe initialization, all intermediate policies remain safe given feasible projections. In constrained control settings with a stabilizing backup policy, our approach further ensures closed-loop stability and enables safe adaptation beyond the conservative backup. On regression with harmful supervision and a constrained double-integrator task with a malicious expert, our approach consistently rejects unsafe updates, maintains feasibility throughout training, and achieves meaningful primal objective improvement.
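The projection idea in the abstract can be illustrated with a much-simplified sketch. This is not the paper's SOCP. It assumes a single safety metric h(θ) ≥ 0 that is Lipschitz in the parameters with a known constant, so a measured margin h(θ_k) certifies the ball ‖θ − θ_k‖ ≤ h(θ_k)/L as safe. Projecting a gradient step onto one such ball is a single second-order-cone constraint with a closed-form solution. All names here (`project_to_safe_ball`, `safe_step`, `safety_margin`, the toy objective) are hypothetical, and the analytic margin stands in for a rollout-based estimate.

```python
import math

def project_to_safe_ball(theta, candidate, margin, lipschitz):
    """Project `candidate` onto {x : ||x - theta|| <= margin / lipschitz}.

    Assumption: the safety metric h is `lipschitz`-Lipschitz in the weights,
    so every point in this ball keeps h >= 0 when h(theta) = margin >= 0.
    """
    radius = max(margin, 0.0) / lipschitz  # clamp guards against round-off
    delta = [c - t for c, t in zip(candidate, theta)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= radius:
        return list(candidate)  # the raw gradient step is already certified
    scale = radius / norm       # shrink the step onto the ball's boundary
    return [t + scale * d for t, d in zip(theta, delta)]

def safe_step(theta, grad, lr, margin, lipschitz):
    """One first-order step followed by projection onto the certified ball."""
    candidate = [t - lr * g for t, g in zip(theta, grad)]
    return project_to_safe_ball(theta, candidate, margin, lipschitz)

# Toy problem (assumed for illustration): minimise (x - 3)^2 + y^2 while
# keeping h(theta) = 1 - ||theta|| >= 0. This h is 1-Lipschitz.
def objective_grad(theta):
    return [2.0 * (theta[0] - 3.0), 2.0 * theta[1]]

def safety_margin(theta):
    # Stand-in for a safety metric that would be estimated from rollouts.
    return 1.0 - math.hypot(theta[0], theta[1])

theta = [0.0, 0.0]  # safe initialisation
history = [theta]
for _ in range(50):
    theta = safe_step(theta, objective_grad(theta), 0.1,
                      safety_margin(theta), 1.0)
    history.append(theta)
```

Each iterate lies inside the ball certified at the previous iterate, so safety carries forward by induction from the safe starting point. In this toy run the unconstrained minimiser at x = 3 is never reached; the iterates stop at the boundary of the safe set instead.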

