M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models
M2BeamLLM provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems, though the contribution is incremental because it builds on existing LLM and multimodal fusion techniques.
The paper tackles beam prediction in mmWave communication systems by integrating multimodal sensor data with large language models, achieving significantly higher accuracy and robustness than traditional deep learning models, with performance improving as more sensing modalities are added.
This paper introduces a novel neural network framework called M2BeamLLM for beam prediction in millimeter-wave (mmWave) massive multiple-input multiple-output (mMIMO) communication systems. M2BeamLLM integrates multimodal sensor data, including images, radar, LiDAR, and GPS, and leverages the reasoning capabilities of large language models (LLMs) such as GPT-2 for beam prediction. By combining sensing data encoding, multimodal alignment and fusion, and supervised fine-tuning (SFT), M2BeamLLM achieves significantly higher beam prediction accuracy and robustness, outperforming traditional deep learning (DL) models in both standard and few-shot scenarios. Furthermore, its prediction performance consistently improves as the diversity of sensing modalities increases. Our study provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems.
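To make the encode, align, fuse, and predict pipeline concrete, the following is a minimal NumPy sketch and not the paper's implementation. Everything in it is an assumption made for illustration: the feature sizes, the shared dimension, the beam count of 64, the random (untrained) weights, and the names `fuse` and `predict_beams`. In particular, a single linear head stands in for the fine-tuned GPT-2 backbone, and a simple L2 normalization followed by self-attention over modality tokens stands in for the alignment and fusion stages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature sizes (outputs of separate sensor encoders).
FEATURE_DIMS = {"image": 64, "radar": 32, "lidar": 48, "gps": 2}
SHARED_DIM = 16   # assumed size of the shared embedding space
NUM_BEAMS = 64    # assumed number of candidate beams


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(feature_dims, shared_dim, num_beams, rng):
    """Random, untrained weights: one projection per modality plus a beam head."""
    proj = {
        name: rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, shared_dim))
        for name, d in feature_dims.items()
    }
    head = rng.normal(0.0, 1.0 / np.sqrt(shared_dim), size=(shared_dim, num_beams))
    return proj, head


def fuse(features, proj):
    """Project each modality into the shared space, then mix with self-attention."""
    tokens = []
    for name, x in features.items():
        z = x @ proj[name]                      # modality-specific projection
        z = z / (np.linalg.norm(z) + 1e-8)      # L2 normalize (alignment stand-in)
        tokens.append(z)
    tokens = np.stack(tokens)                   # shape: (num_modalities, SHARED_DIM)
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    mixed = softmax(scores, axis=-1) @ tokens   # each token attends to all modalities
    return mixed.mean(axis=0)                   # pooled fused representation


def predict_beams(features, proj, head, k=3):
    """Return the top-k beam indices and the full beam probability vector."""
    logits = fuse(features, proj) @ head        # linear head replaces the LLM backbone
    probs = softmax(logits)
    top_k = np.argsort(probs)[::-1][:k]
    return top_k, probs


proj, head = init_params(FEATURE_DIMS, SHARED_DIM, NUM_BEAMS, rng)
features = {name: rng.normal(size=d) for name, d in FEATURE_DIMS.items()}
top_beams, probs = predict_beams(features, proj, head, k=3)
print("top-3 beam indices:", top_beams)
```

Because `fuse` iterates over whichever modalities are present in `features`, the same function accepts any subset, for example only `image` and `gps`. That makes it a convenient starting point for the kind of modality ablation the abstract describes, although with untrained weights the predictions themselves carry no meaning.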