88.6 | NA | May 6
Analysis of gradient flow for computing defocusing action ground states of rotating nonlinear Schrödinger equations
Wei Liu, Tingfeng Wang, Yongjun Yuan et al.
This work focuses on the numerical computation of defocusing action ground states for rotating nonlinear Schrödinger equations (RNLS) using a direct gradient flow (DGF) method. We address theoretical gaps in the existing literature concerning the stability and convergence of this DGF scheme. Firstly, we prove the unconditional stability of the DGF scheme, demonstrating that the action functional is monotonically non-increasing along the discrete flow for arbitrary time step sizes. Secondly, we establish a rigorous convergence analysis, proving global convergence under minor assumptions and local exponential convergence to the action ground state under a reasonable non-degeneracy condition. The analysis relies on the uniform boundedness of sublevel sets of the action functional and introduces a tailored $H^1$-distance between phase-shift equivalence classes to handle complex-valued ground states with quantized vortices. A novel analytical framework is also developed to establish the exponential convergence rate. Numerical experiments are presented to validate the theoretical findings, demonstrating both the global migration towards a neighborhood of the ground state and subsequent exponential convergence.
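The idea of measuring distance between phase-shift equivalence classes can be illustrated with a minimal sketch. It assumes the distance takes the form $\min_{\theta} \|u - e^{i\theta} v\|_{H^1}$, which may differ from the tailored distance defined in the paper. The periodic 1D grid, the forward-difference derivative, and the function names `h1_inner` and `phase_invariant_h1_distance` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def h1_inner(u, v, dx):
    """Discrete H^1 inner product <u, v> + <u', v'> on a periodic 1D grid.

    Assumption: forward differences approximate the derivative.
    """
    du = (np.roll(u, -1) - u) / dx
    dv = (np.roll(v, -1) - v) / dx
    # np.vdot conjugates its first argument, so vdot(v, u) = sum(u * conj(v)).
    return dx * (np.vdot(v, u) + np.vdot(dv, du))


def phase_invariant_h1_distance(u, v, dx):
    """Return min over theta of ||u - exp(i*theta) * v|| in the discrete H^1 norm.

    Expanding the square gives ||u||^2 + ||v||^2 - 2*Re(exp(-i*theta) <u, v>),
    which is minimized when the phase aligns with <u, v>, leaving
    ||u||^2 + ||v||^2 - 2*|<u, v>|.
    """
    nu = h1_inner(u, u, dx).real
    nv = h1_inner(v, v, dx).real
    cross = abs(h1_inner(u, v, dx))
    # Clamp at zero to guard against small negative round-off.
    return np.sqrt(max(nu + nv - 2.0 * cross, 0.0))
```

Under this assumed definition, two states that differ only by a constant phase factor are at distance zero, whereas the plain $H^1$ norm of their difference is not.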
CR | Nov 17, 2025
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
Xuankun Rong, Wenke Huang, Tingfeng Wang et al.
Multimodal large language models (MLLMs) have demonstrated impressive reasoning and instruction-following capabilities, yet their expanded modality space introduces new compositional safety risks that emerge from complex text-image interactions. Such cross-modal couplings can produce unsafe semantics even when individual inputs are benign, exposing the fragile safety awareness of current MLLMs. While recent works enhance safety by guiding models to reason about potential risks, unregulated reasoning traces may compromise alignment; although Group Relative Policy Optimization (GRPO) offers self-rewarded refinement without human supervision, it lacks verifiable signals for reasoning safety. To address this, we propose SafeGRPO, a self-rewarded multimodal safety alignment framework that integrates rule-governed reward construction into GRPO, enabling interpretable and verifiable optimization of reasoning safety. Built upon the constructed SafeTag-VL-3K dataset with explicit visual, textual, and combined safety tags, SafeGRPO performs step-guided safety thinking to enforce structured reasoning and behavior alignment, substantially improving multimodal safety awareness, compositional robustness, and reasoning stability across diverse benchmarks without sacrificing general capabilities.
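How a rule-based reward could feed a group-relative update can be sketched as follows. The advantage normalization (reward minus group mean, divided by group standard deviation) is the standard GRPO form. The reward rule `rule_reward`, the tag keys, and the tag values are hypothetical stand-ins and are not the reward actually constructed in SafeGRPO.

```python
import numpy as np

# Hypothetical tag keys, loosely mirroring the visual/textual/combined
# safety tags mentioned in the abstract.
TAG_KEYS = ("visual", "textual", "combined")


def rule_reward(predicted_tags, reference_tags):
    """Hypothetical rule: fraction of safety tags that match the reference."""
    hits = sum(predicted_tags.get(k) == reference_tags.get(k) for k in TAG_KEYS)
    return hits / len(TAG_KEYS)


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantages: normalize rewards within one group
    of responses sampled for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)


# Usage: score a group of sampled responses for one prompt, then normalize.
reference = {"visual": "safe", "textual": "safe", "combined": "unsafe"}
group = [
    {"visual": "safe", "textual": "safe", "combined": "unsafe"},
    {"visual": "safe", "textual": "safe", "combined": "safe"},
    {"visual": "unsafe", "textual": "unsafe", "combined": "safe"},
]
advantages = group_relative_advantages([rule_reward(g, reference) for g in group])
```

Because the reward comes from a checkable rule and not from a learned judge, each advantage can be traced back to which tags a response got right, which is one reading of the "verifiable" signal the abstract describes.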