Optimization of Private Semantic Communication Performance: An Uncooperative Covert Communication Method
This work addresses the problem of secure and efficient semantic communication for users in scenarios with potential attackers, representing an incremental improvement over existing reinforcement learning methods.
The paper tackles the problem of optimizing private semantic communication performance in an uncooperative covert communication scenario. A server transmits semantic information extracted from images to a user while avoiding detection and eavesdropping by an attacker, and a friendly jammer interferes with the attacker. The result is a proposed reinforcement learning algorithm that improves privacy by up to 77.8% and semantic transmission quality by up to 14.3% compared to traditional methods.
In this paper, a novel covert semantic communication framework is investigated. Within this framework, a server extracts and transmits the semantic information, i.e., the meaning of image data, to a user over several time slots. An attacker seeks to detect and eavesdrop on the semantic transmission to acquire details of the original image. To prevent the attacker from eavesdropping on the meaning of the data, a friendly jammer is deployed to transmit jamming signals that interfere with the attacker and thereby hide the transmitted semantic information. Meanwhile, the server strategically selects the time slots used for semantic information transmission. Due to its limited energy, the jammer does not communicate with the server, and hence the server does not know the transmit power of the jammer. Therefore, the server must jointly optimize the semantic information transmitted at each time slot and the corresponding transmit power to maximize the privacy and the semantic information transmission quality of the user. To solve this problem, we propose a prioritised sampling assisted twin delayed deep deterministic policy gradient algorithm that jointly determines the transmitted semantic information and the transmit power per time slot without any communication between the server and the jammer. Compared to standard reinforcement learning methods, the proposed method uses an additional Q network to estimate Q values, so that the agent can select the action with the lower Q value from the two Q networks, thus avoiding locally optimal action selection and estimation bias of Q values. Simulation results show that the proposed algorithm can improve the privacy and the semantic information transmission quality by up to 77.8% and 14.3%, respectively, compared to traditional reinforcement learning methods.
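The two ingredients named above, taking the lower of two Q-value estimates and prioritised sampling, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard twin delayed DDPG target and proportional prioritisation by absolute temporal-difference error, and the function names and the parameters `gamma`, `alpha`, and `eps` are hypothetical choices made for illustration.

```python
import random


def clipped_double_q_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Learning target built from the lower of two Q-value estimates.

    Taking the minimum counters the overestimation bias that a single
    Q network tends to accumulate. gamma is an assumed discount factor.
    """
    min_q = min(q1_next, q2_next)
    # No bootstrapping from the next state once the episode has ended.
    return reward + (0.0 if done else gamma * min_q)


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities proportional to |TD error| ** alpha.

    alpha and eps are assumed hyperparameters: alpha = 0 gives uniform
    sampling, and eps keeps zero-error transitions sampleable.
    """
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, rng, alpha=0.6):
    """Draw replay-buffer indices, favouring high-error transitions."""
    probs = priority_probabilities(td_errors, alpha)
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


if __name__ == "__main__":
    # Two critics disagree about the next state; the lower estimate is used.
    print(clipped_double_q_target(1.0, q1_next=5.0, q2_next=3.0, gamma=0.5))

    # Transitions with larger TD errors are replayed more often.
    errors = [0.1, 1.0, 2.0]
    print(priority_probabilities(errors))
    print(sample_indices(errors, batch_size=4, rng=random.Random(0)))
```

In a full agent, the two estimates would come from separate critic networks evaluated at the next state and a smoothed target action, and the stored priorities would be refreshed after each update. Those parts are omitted here because the abstract does not specify them.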