High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model
This work addresses efficient, controllable image synthesis from WiFi CSI for sensing and monitoring applications, and represents an incremental improvement over prior methods.
The paper introduces LatentCSI for generating high-resolution images from WiFi CSI measurements. A lightweight network maps CSI amplitudes into the latent space of a pretrained latent diffusion model, whose denoiser and decoder then turn the latent into an image. Compared with baselines, the method achieves better computational efficiency and perceptual quality.
We present LatentCSI, a novel method for generating images of the physical environment from WiFi channel state information (CSI) measurements that leverages a pretrained latent diffusion model (LDM). Unlike prior approaches that rely on complex and computationally intensive techniques such as GANs, our method employs a lightweight neural network to map CSI amplitudes directly into the latent space of an LDM. We then apply the LDM's denoising diffusion model to this latent representation under text-based guidance and decode the result with the LDM's pretrained decoder to obtain a high-resolution image. This design bypasses the challenges of pixel-space image generation and avoids the explicit image encoding stage typically required in conventional image-to-image pipelines, enabling efficient and high-quality image synthesis. We validate our approach on two datasets: a wide-band CSI dataset we collected with off-the-shelf WiFi devices and cameras, and a subset of the publicly available MM-Fi dataset. The results demonstrate that LatentCSI outperforms baselines of comparable complexity trained directly on ground-truth images in both computational efficiency and perceptual quality, while additionally providing practical advantages through its unique capacity for text-guided controllability.
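The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Stable-Diffusion-style latent geometry (8x spatial downsampling, 4 latent channels), which the abstract does not specify, and it assumes the usual image-to-image recipe of partially noising the predicted latent before denoising. The names csi_to_latent, add_noise, and generate are hypothetical, the encoder is reduced to a single linear layer, and the LDM's denoiser and decoder are passed in as plain callables.

```python
import math
import random

# Assumed geometry (not stated in the abstract): Stable-Diffusion-style VAE
# with 8x downsampling per spatial side and 4 latent channels.
IMAGE_SIDE = 512
DOWNSAMPLE = 8
LATENT_CHANNELS = 4


def latent_size(image_side=IMAGE_SIDE):
    """Number of values the CSI encoder must predict in latent space."""
    side = image_side // DOWNSAMPLE
    return LATENT_CHANNELS * side * side


def pixel_size(image_side=IMAGE_SIDE):
    """Number of values a pixel-space RGB generator would have to predict."""
    return 3 * image_side * image_side


def csi_to_latent(csi_amplitudes, weights, bias):
    """Hypothetical lightweight encoder: one linear layer, CSI -> flat latent."""
    return [
        sum(w * a for w, a in zip(row, csi_amplitudes)) + b
        for row, b in zip(weights, bias)
    ]


def add_noise(latent, alpha_bar, rng):
    """Standard DDPM forward step: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * z + noise * rng.gauss(0.0, 1.0) for z in latent]


def generate(csi_amplitudes, weights, bias, denoise, decode, prompt,
             alpha_bar=0.5, seed=0):
    """CSI -> predicted latent -> partial noising -> guided denoising -> image.

    `denoise(latent, prompt)` and `decode(latent)` stand in for the pretrained
    LDM components; no image encoder is needed at any point.
    """
    rng = random.Random(seed)
    z0 = csi_to_latent(csi_amplitudes, weights, bias)
    zt = add_noise(z0, alpha_bar, rng)
    return decode(denoise(zt, prompt))


if __name__ == "__main__":
    # Under the assumed geometry the encoder predicts 48x fewer values
    # than a pixel-space generator at the same resolution.
    print(latent_size(), pixel_size(), pixel_size() // latent_size())
```

Under these assumed dimensions the encoder's regression target has 16,384 values instead of the 786,432 of a 512x512 RGB image, which gives a sense of why a lightweight network can suffice when the pretrained decoder supplies the remaining detail.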