53.8LGMay 27
SPAR: Support-Preserving Action RectificationJiaxin Zhao, Weihang Pan, Xun Liang et al.
Offline policy improvement faces an inherent conflict between maximizing value and fitting the data distribution. While in-sample weighted regression is stable, it suffers from over-conservatism that suppresses high-value actions in the distribution tail; conversely, gradient-based approaches often exhibit a fitting-optimization conflict of gradients, which drives the policy off the data manifold. To address this, we propose Support-Preserving Action Rectification (SPAR), which reframes global learning as a local residual rectification anchored to a frozen pure behavior cloning policy. This framework performs fine-grained fitting and local policy improvement in the residual space, thereby contracting the search space. We further introduce Latent Self-Imitation, utilizing a latent-sampling weighted-regression mechanism to address fitting-improvement gradient conflict in the residual space. Theoretically, we prove this mechanism eliminates the manifold-normal drift of standard value gradients, while extensive D4RL experiments show SPAR extracts significant gains from suboptimal baselines to achieve state-of-the-art performance.
RODec 24, 2025
Safe Path Planning and Observation Quality Enhancement Strategy for Unmanned Aerial Vehicles in Water Quality Monitoring TasksYuanshuang Fu, Qianyao Wang, Qihao Wang et al.
Unmanned Aerial Vehicle (UAV) spectral remote sensing technology is widely used in water quality monitoring. However, in dynamic environments, varying illumination conditions, such as shadows and specular reflection (sun glint), can cause severe spectral distortion, thereby reducing data availability. To maximize the acquisition of high-quality data while ensuring flight safety, this paper proposes an active path planning method for dynamic light and shadow disturbance avoidance. First, a dynamic prediction model is constructed to transform the time-varying light and shadow disturbance areas into three-dimensional virtual obstacles. Second, an improved Interfered Fluid Dynamical System (IFDS) algorithm is introduced, which generates a smooth initial obstacle avoidance path by building a repulsive force field. Subsequently, a Model Predictive Control (MPC) framework is employed for rolling-horizon path optimization to handle flight dynamics constraints and achieve real-time trajectory tracking. Furthermore, a Dynamic Flight Altitude Adjustment (DFAA) mechanism is designed to actively reduce the flight altitude when the observable area is narrow, thereby enhancing spatial resolution. Simulation results show that, compared with traditional PID and single obstacle avoidance algorithms, the proposed method achieves an obstacle avoidance success rate of 98% in densely disturbed scenarios, significantly improves path smoothness, and increases the volume of effective observation data by approximately 27%. This research provides an effective engineering solution for precise UAV water quality monitoring in complex illumination environments.