Contrastive Language-Image Pre-Training Model based Semantic Communication Performance Optimization
This work addresses performance optimization for semantic communication systems in wireless networks. It is an incremental improvement that integrates CLIP with RL for resource allocation.
The paper tackles the problem of optimizing semantic communication performance in noisy wireless networks. It designs a CLIP model-based framework that eliminates joint training requirements and uses a PPO-based RL algorithm to jointly optimize model architecture and resource allocation, achieving up to 40% faster convergence and 4x higher accumulated reward compared to soft actor-critic.
In this paper, a novel contrastive language-image pre-training (CLIP) model based semantic communication framework is designed. Compared to standard neural network (e.g., convolutional neural network) based semantic encoders and decoders, which require joint training over a common dataset, our CLIP model based method does not require any training procedure. This enables the transmitter to extract the meaning of the original data without neural network model training, and the receiver to train a neural network for follow-up task implementation without communicating with the transmitter. Next, we investigate the deployment of the CLIP model based semantic communication framework over a noisy wireless network. Since the semantic information generated by the CLIP model is susceptible to wireless noise and the spectrum available for semantic information transmission is limited, it is necessary to jointly optimize the CLIP model architecture and spectrum resource block (RB) allocation to maximize semantic communication performance while accounting for wireless noise and for the delay and energy consumed by semantic communication. To achieve this goal, we use a proximal policy optimization (PPO) based reinforcement learning (RL) algorithm to learn how wireless noise affects semantic communication performance, and thereby find the optimal CLIP model and RB allocation for each user. Simulation results show that our proposed method improves the convergence rate by up to 40% and the accumulated reward by 4x compared to soft actor-critic.
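To illustrate why semantic information of this kind is sensitive to wireless noise, the following is a minimal sketch and not the system model of this paper. It assumes that random unit vectors can stand in for CLIP text and image embeddings, that the channel is additive white Gaussian noise (AWGN), and that the receiver picks the label whose embedding has the highest cosine similarity to the received vector. All function names and parameters (`awgn`, `classify`, `accuracy`, `dim`, `n_labels`, `spread`) are hypothetical.

```python
import math
import random


def unit(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    """Inner product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def awgn(v, snr_db, rng):
    """Add white Gaussian noise so the per-dimension SNR equals snr_db (assumed channel)."""
    signal_power = sum(x * x for x in v) / len(v)
    sigma = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, sigma) for x in v]


def classify(received, label_embeddings):
    """Return the index of the label embedding most similar to the received vector."""
    r = unit(received)
    return max(range(len(label_embeddings)),
               key=lambda k: dot(r, label_embeddings[k]))


def accuracy(snr_db, dim=64, n_labels=10, trials=200, spread=0.05, seed=0):
    """Fraction of transmissions decoded to the correct label at a given SNR."""
    rng = random.Random(seed)
    # Stand-ins for text embeddings of the candidate labels (assumption).
    labels = [unit([rng.gauss(0.0, 1.0) for _ in range(dim)])
              for _ in range(n_labels)]
    correct = 0
    for _ in range(trials):
        k = rng.randrange(n_labels)
        # Stand-in for an image embedding lying near its label embedding.
        image = unit([x + spread * rng.gauss(0.0, 1.0) for x in labels[k]])
        received = awgn(image, snr_db, rng)
        correct += int(classify(received, labels) == k)
    return correct / trials


if __name__ == "__main__":
    for snr_db in (-20, -10, 0, 10, 20):
        print(f"SNR {snr_db:>4} dB: accuracy {accuracy(snr_db):.2f}")
```

In this toy setting the decoding accuracy falls as the SNR drops, which is the kind of noise dependence that a joint choice of model and RB allocation would have to account for. A real deployment would replace the random vectors with embeddings from an actual CLIP model and the AWGN step with the channel model of interest.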