59.7DBMar 15
Wheel Dynamic Load Estimation Method Based on Gas Pressure of Hydro-pneumatic SuspensionQijun Liao, Jue Yang, Subhash Rakheja et al.
This paper proposes a novel method to estimate the wheel dynamic load based on the gas pressure of a hydro-pneumatic suspension. A nonlinear coupled model between suspension chamber pressure and tire-ground contact force is developed, integrating suspension dynamics with its nonlinear stiffness characteristics. An iterative algorithm is developed to estimate wheel dynamic load using data from only one single pressure sensor, thereby eliminating the reliance on traditional tire models and complex multi-sensor fusion frameworks. This method effectively reduces hardware redundancy and minimizes the propagation of measurement errors. The proposed model is experimentally validated on a dedicated suspension test bench, demonstrating satisfactory agreement between the measured and estimated data. Additionally, co-simulation with TruckSim verifies the accuracy of both the calculated damping force and wheel dynamic load, demonstrating the effectiveness of the model on characterizing the mechanical behavior of the hydro-pneumatic suspension system. The proposed method provides a practical, low-cost, and efficient solution with minimal hardware dependencies.
40.5LGMay 5
Constraint-Enhanced Reinforcement Learning Based on Dynamic Decoupled Spherical Radial SquashingQijun Liao, Zhaoxin Yu, Jue Yang
When deploying reinforcement learning policies to physical robots, actuator rate constraints -- hard limits on how fast each joint can move per control step -- are unavoidable. These limits vary substantially across joints due to differences in motor inertia, power bandwidth, and transmission stiffness, creating pronounced heterogeneity that existing methods fail to handle geometrically: the per-joint feasible region forms a high-dimensional box in action-increment space, yet QP projection and spherical parameterization methods impose isotropic ball-shaped constraints, exponentially under-covering the true feasible set as heterogeneity grows. This paper proposes Dynamic Decoupled Spherical Radial Squashing (DD-SRad), which resolves this mismatch by computing a position-adaptive radius independently for each actuator, achieving tight alignment with the true per-joint feasible region. DD-SRad satisfies per-step hard constraints with probability~1, preserves well-conditioned gradients throughout training, and admits exact policy gradient backpropagation with zero runtime solver overhead. MuJoCo benchmark experiments demonstrate the highest task return at zero constraint violation -- matching the unconstrained upper bound -- with 30%--50% improvement in constraint-space coverage over spherical baselines. High-fidelity IsaacLab simulations with Unitree H1 and G1 humanoid robots confirm end-to-end optimality parameterized directly from official joint specifications, validating a systematic pathway from hardware datasheets to safe deployment.
19.2LGMar 12
Hybrid Energy-Aware Reward Shaping: A Unified Lightweight Physics-Guided Methodology for Policy OptimizationQijun Liao, Jue Yang, Yiting Kang et al.
Deep reinforcement learning excels in continuous control but often requires extensive exploration, while physics-based models demand complete equations and suffer cubic complexity. This study proposes Hybrid Energy-Aware Reward Shaping (H-EARS), unifying potential-based reward shaping with energy-aware action regularization. H-EARS constrains action magnitude while balancing task-specific and energy-based potentials via functional decomposition, achieving linear complexity O(n) by capturing dominant energy components without full dynamics. We establish a theoretical foundation including: (1) functional independence for separate task/energy optimization; (2) energy-based convergence acceleration; (3) convergence guarantees under function approximation; and (4) approximate potential error bounds. Lyapunov stability connections are analyzed as heuristic guides. Experiments across baselines show improved convergence, stability, and energy efficiency. Vehicle simulations validate applicability in safety-critical domains under extreme conditions. Results confirm that integrating lightweight physics priors enhances model-free RL without complete system models, enabling transfer from lab research to industrial applications.
8.0OCMar 13
Response-Aware Risk-Constrained Control Barrier Function With Application to VehiclesQijun Liao, Jue Yang
This paper proposes a unified control framework based on Response-Aware Risk-Constrained Control Barrier Function for dynamic safety boundary control of vehicles. Addressing the problem of physical model parameter mismatch, the framework constructs an uncertainty propagation model that fuses nominal dynamics priors with direct vehicle body responses. Utilizing simplified single-track dynamics to provide a baseline direction for control gradients and covering model deviations through statistical analysis of body response signals, the framework eliminates the dependence on accurate online estimation of road surface adhesion coefficients. By introducing Conditional Value at Risk (CVaR) theory, the framework reformulates traditional deterministic safety constraints into probabilistic constraints on the tail risk of barrier function derivatives. Combined with a Bayesian online learning mechanism based on inverse Wishart priors, it identifies environmental noise covariance in real-time, adaptively tuning safety margins to reduce performance loss under prior parameter mismatch. Finally, based on Control Lyapunov Function (CLF), a unified Second-Order Cone Programming (SOCP) controller is constructed. Theoretical analysis establishes convergence of Sequential Convex Programming to local Karush-Kuhn-Tucker points and provides per-step probabilistic safety bounds. High-fidelity dynamics simulations demonstrate that under extreme conditions, the method not only eliminates the output divergence phenomenon of traditional methods but also achieves Pareto improvement in both safety and tracking performance. For the chosen risk level, the per-step safety violation probability is theoretically bounded by approximately 2%, validated through high-fidelity simulations showing zero boundary violations across all tested scenarios.