AI LG LO SY · Jul 31, 2025

Hyperproperty-Constrained Secure Reinforcement Learning

Ernest Bonnah, Luan Viet Nguyen, Khaza Anuarul Hoque

arXiv:2508.00106v13.3 · h-index: 15 · MEMOCODE

Originality: Incremental advance

AI Analysis

This paper addresses security constraints in reinforcement learning for robotics applications. It is an incremental advance: it applies hyperproperties to a known bottleneck in safe RL.

The paper tackles security-aware reinforcement learning by encoding security and opacity properties as HyperTWTL constraints. A robotic case study shows that the proposed dynamic Boltzmann softmax RL approach outperforms two baseline algorithms.

Hyperproperties for Time Window Temporal Logic (HyperTWTL) is a domain-specific formal specification language known for compactly representing security, opacity, and concurrency properties for robotics applications. This paper focuses on HyperTWTL-constrained secure reinforcement learning (SecRL). Although temporal logic-constrained safe reinforcement learning (SRL) is an evolving research problem with a growing body of literature, there is a significant research gap in exploring security-aware reinforcement learning (RL) using hyperproperties. Given the dynamics of an agent as a Markov Decision Process (MDP) and opacity/security constraints formalized as HyperTWTL, we propose an approach for learning security-aware optimal policies using dynamic Boltzmann softmax RL while satisfying the HyperTWTL constraints. The effectiveness and scalability of our proposed approach are demonstrated using a pick-up and delivery robotic mission case study. We also compare our results with two baseline RL algorithms, showing that our proposed method outperforms them.
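To make the "dynamic Boltzmann softmax" ingredient concrete, here is a minimal sketch of a tabular Q-learning update that replaces the usual max over next-state values with a Boltzmann-weighted average whose temperature parameter grows over time. It illustrates the general technique only and is not the authors' implementation: the function names, the schedule beta = step, and the default learning rate and discount are all assumptions, and the HyperTWTL constraint handling described in the abstract is not modeled.

```python
import math


def boltzmann_softmax(values, beta):
    """Boltzmann-weighted average: sum(exp(beta*v) * v) / sum(exp(beta*v)).

    beta = 0 gives the plain mean; as beta grows the result approaches max(values).
    """
    m = max(values)  # shift by the max so the exponentials cannot overflow
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def dbs_q_update(q, state, action, reward, next_state, step,
                 alpha=0.1, gamma=0.9):
    """One tabular Q-learning step with a dynamic Boltzmann softmax target.

    q maps each state to a list of action values.
    ASSUMPTION: beta = step is an illustrative increasing schedule,
    not the schedule used in the paper.
    """
    beta = float(step)
    target = reward + gamma * boltzmann_softmax(q[next_state], beta)
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    # Hypothetical two-state table, for illustration only.
    q = {0: [0.0, 0.0], 1: [1.0, 3.0]}
    print(dbs_q_update(q, state=0, action=1, reward=1.0, next_state=1, step=0))
```

Early in training (small beta) the target averages over next-state actions, which softens the overestimation of a hard max; as beta increases the update approaches standard Q-learning.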
