IV CV SP · Mar 21, 2025

Vision Transformer Based Semantic Communications for Next Generation Wireless Networks

arXiv:2503.17275v1 · 7 citations · h-index: 19 · 2025 IEEE International Conference on Communications Workshops (ICC Workshops)
Originality: Synthesis-oriented
AI Analysis

This work addresses bandwidth efficiency for next-generation wireless networks, representing an incremental improvement by applying an existing method (Vision Transformer) to a new domain (semantic communications).

This paper tackles the problem of high-bandwidth image transmission in 6G networks by proposing a Vision Transformer-based semantic communication framework that prioritizes semantic meaning over raw data accuracy, achieving a Peak Signal-to-Noise Ratio of 38 dB and outperforming CNN and GAN approaches.

In the evolving landscape of 6G networks, semantic communications are poised to revolutionize data transmission by prioritizing the transmission of semantic meaning over raw data accuracy. This paper presents a Vision Transformer (ViT)-based semantic communication framework designed to achieve high semantic similarity during image transmission while minimizing bandwidth demand. Using a ViT as the encoder-decoder, the proposed architecture encodes images into semantically rich representations at the transmitter and reconstructs them at the receiver under real-world fading and noise. Building on the attention mechanisms inherent to ViTs, our model outperforms Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) tailored for generating such images. The proposed ViT-based architecture achieves a Peak Signal-to-Noise Ratio (PSNR) of 38 dB, higher than other Deep Learning (DL) approaches, while maintaining semantic similarity across different communication environments. These findings establish our ViT-based approach as a significant breakthrough in semantic communications.
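To make the 38 dB figure concrete, the following is a minimal sketch of how PSNR is commonly computed from the mean squared error (MSE) between a reference image and its reconstruction, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' evaluation code. The function names `psnr` and `mse_for_psnr`, the flat pixel-list input, and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not state the pixel range used.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB between two equal-length flat pixel sequences.

    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(target_db, max_value=255.0):
    """Invert the PSNR formula: the MSE that corresponds to a given PSNR in dB."""
    return max_value ** 2 / 10 ** (target_db / 10.0)
```

Under the 8-bit assumption, `mse_for_psnr(38)` gives an MSE of about 10.3, that is, a root-mean-square pixel error of a little over 3 intensity levels out of 255.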
