LG AIJul 26, 2024

Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints

Lei Guo, Wei Chen, Yuxuan Sun, Bo Ai, Nikolaos Pappas, Tony Q. S. Quek

arXiv:2407.18468v419.347 citationsh-index: 60Has Code

Originality Incremental advance

AI Analysis

This work addresses bandwidth limitations for diffusion-based generative models in wireless communication, representing an incremental advancement in semantic communication.

This paper tackles the problem of applying diffusion models in wireless communication under bandwidth constraints by introducing a diffusion-driven semantic communication framework with VAE-based compression, resulting in significant improvements in pixel-level and semantic metrics compared to deep joint source-channel coding.

Diffusion models have been extensively utilized in AI-generated content (AIGC) in recent years, thanks to the superior generation capabilities. Combining with semantic communications, diffusion models are used for tasks such as denoising, data reconstruction, and content generation. However, existing diffusion-based generative models do not consider the stringent bandwidth limitation, which limits its application in wireless communication. This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative model. Our designed architecture utilizes the diffusion model, where the signal transmission process through the wireless channel acts as the forward process in diffusion. To reduce bandwidth requirements, we incorporate a downsampling module and a paired upsampling module based on a variational auto-encoder with reparameterization at the receiver to ensure that the recovered features conform to the Gaussian distribution. Furthermore, we derive the loss function for our proposed system and evaluate its performance through comprehensive experiments. Our experimental results demonstrate significant improvements in pixel-level metrics such as peak signal to noise ratio (PSNR) and semantic metrics like learned perceptual image patch similarity (LPIPS). These enhancements are more profound regarding the compression rates and SNR compared to deep joint source-channel coding (DJSCC). We release the code at https://github.com/import-sudo/Diffusion-Driven-Semantic-Communication.

View on arXiv PDF Code

Similar