Deep Reinforcement Learning for Cognitive Time-Division Joint SAR and Secure Communications

Mohamed-Amine Lahmeri, Ata Khalili, Yujiao Liu, Anke Schmeink, Robert Schober

arXiv:2604.09978


AI Analysis

This work addresses secure communication in aerial networks with mobile eavesdroppers, a critical scenario for surveillance and post-disaster communication.

The paper proposes a deep reinforcement learning-based dynamic time-division joint SAR and communication framework for secure aerial communications, achieving higher worst-case secrecy rates than baseline schemes and generalizing to unseen eavesdropper motion patterns.

Synthetic aperture radar (SAR) imaging can be exploited to enhance wireless communication performance through high-precision environmental awareness. However, integrating sensing and communication functionalities in such wideband systems remains challenging, motivating the development of a joint SAR and communication (JSARC) framework. We propose a dynamic time-division JSARC (TD-JSARC) framework for secure aerial communications that is relevant for critical scenarios, such as surveillance or post-disaster communication, where conventional localization of mobile adversaries often fails. In particular, we consider a secure downlink communication scenario where an aerial base station (ABS) serves a ground user (UE) in the presence of a ground-moving eavesdropper. To detect and track the eavesdropper, the ABS uses cognitive SAR along-track interferometry (ATI) to estimate its position and velocity. Based on these estimates, the ABS applies adaptive beamforming and artificial-noise jamming to enhance secrecy. To this end, we jointly optimize the time and power allocation to maximize the worst-case secrecy rate, while satisfying both SAR and communication constraints. Using the estimated eavesdropper trajectory, we formulate the problem as a Markov decision process (MDP) and solve it via deep reinforcement learning (DRL). Simulation results show that the proposed learning-based approach outperforms both learning and non-learning baseline schemes employing equal-aperture and random time allocation. The proposed method also generalizes well to previously unseen eavesdropper motion patterns.
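The objective described in the abstract, a worst-case secrecy rate shaped by time and power allocation, can be illustrated with a small sketch. This is not the authors' formulation. It assumes free-space path loss, artificial noise that is perfectly nulled at the UE, a finite set of candidate eavesdropper positions standing in for the uncertainty of the SAR-ATI estimate, and a single communication time fraction `tau` (the remainder going to sensing). All function names, parameters, and numeric values are hypothetical.

```python
import math

def channel_gain(tx, rx, beta0=1e-4):
    """Free-space power gain beta0 / d^2 (assumed channel model)."""
    d2 = sum((a - b) ** 2 for a, b in zip(tx, rx))
    return beta0 / d2

def secrecy_rate(abs_pos, ue_pos, eve_pos, p_total, rho, noise=1e-10):
    """Secrecy rate in bit/s/Hz for one eavesdropper position.

    rho is the fraction of power spent on the information signal;
    the remaining (1 - rho) is artificial noise, assumed to be nulled
    at the UE and received as interference by the eavesdropper.
    """
    g_u = channel_gain(abs_pos, ue_pos)
    g_e = channel_gain(abs_pos, eve_pos)
    snr_u = rho * p_total * g_u / noise
    sinr_e = rho * p_total * g_e / ((1 - rho) * p_total * g_e + noise)
    return max(0.0, math.log2(1 + snr_u) - math.log2(1 + sinr_e))

def worst_case_secrecy_rate(abs_pos, ue_pos, eve_candidates, p_total, rho, tau):
    """Minimum secrecy rate over candidate eavesdropper positions,
    scaled by the fraction tau of the frame used for communication."""
    worst = min(
        secrecy_rate(abs_pos, ue_pos, e, p_total, rho) for e in eve_candidates
    )
    return tau * worst

# Hypothetical geometry: ABS at 100 m altitude, UE and eavesdropper on the ground.
abs_pos = (0.0, 0.0, 100.0)
ue_pos = (50.0, 0.0, 0.0)
eve_candidates = [(-50.0, 0.0, 0.0), (-40.0, 10.0, 0.0), (-60.0, -10.0, 0.0)]

rate = worst_case_secrecy_rate(abs_pos, ue_pos, eve_candidates,
                               p_total=1.0, rho=0.5, tau=0.7)
print(f"worst-case secrecy rate: {rate:.3f} bit/s/Hz")
```

In an MDP formulation of this kind, a quantity such as `worst_case_secrecy_rate` could serve as the per-step reward, with `tau` and `rho` as the agent's actions. The paper's actual state, action, and reward definitions are not given in the abstract.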

