When Large Language Model Agents Meet 6G Networks: Perception, Grounding, and Alignment
This work addresses the limited capacity of mobile devices for running LLM agents in 6G networks, with the aim of making AI assistant services more accessible and efficient. The contribution is incremental, since it builds on existing split learning and caching concepts.
The paper tackles the challenge of deploying large language model (LLM) agents on mobile devices in 6G networks by proposing a split learning system that distributes agent modules across devices and edge servers, together with a novel model caching algorithm intended to improve model utilization and reduce network costs.
AI agents based on multimodal large language models (LLMs) are expected to revolutionize human-computer interaction and offer more personalized assistant services across domains such as healthcare, education, manufacturing, and entertainment. Deploying LLM agents in 6G networks democratizes access to previously expensive AI assistant services by making them available on mobile devices, while reducing interaction latency and better preserving user privacy. Nevertheless, the limited capacity of mobile devices constrains the effectiveness of deploying and executing local LLMs, so complex tasks must be offloaded to global LLMs running on edge servers during long-horizon interactions. In this article, we propose a split learning system for LLM agents in 6G networks that leverages collaboration between mobile devices and edge servers: multiple LLMs with different roles are distributed across mobile devices and edge servers to perform user-agent interactive tasks collaboratively. In the proposed system, LLM agents are split into perception, grounding, and alignment modules, facilitating inter-module communications to meet extended user requirements on 6G network functions, including integrated sensing and communication, digital twins, and task-oriented communications. Furthermore, we introduce a novel model caching algorithm for LLMs within the proposed system to improve model utilization in context, thus reducing network costs of the collaborative mobile and edge LLM agents.
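The abstract does not describe how the caching algorithm works, so the following is only a rough illustration of the general idea of caching models on a capacity-limited edge server. It uses plain least-recently-used (LRU) eviction, which is a stand-in and not the authors' algorithm. The class name `ModelCache`, the 20 GB capacity, the 8 GB model sizes, and the request trace are all hypothetical values chosen for the example.

```python
from collections import OrderedDict


class ModelCache:
    """Capacity-bounded cache of models on an edge server (LRU eviction).

    Hypothetical sketch: LRU is an assumption, not the paper's algorithm.
    """

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.models = OrderedDict()  # model name -> size in GB, oldest first
        self.hits = 0
        self.misses = 0

    def used_gb(self):
        return sum(self.models.values())

    def request(self, name, size_gb):
        """Serve a request; return True on a cache hit, False on a miss."""
        if name in self.models:
            self.models.move_to_end(name)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if size_gb > self.capacity_gb:
            return False  # model can never fit; serve it without caching
        # Evict least recently used models until the new one fits.
        while self.used_gb() + size_gb > self.capacity_gb:
            self.models.popitem(last=False)
        self.models[name] = size_gb
        return False


# Hypothetical request trace over the three module roles named in the abstract.
cache = ModelCache(capacity_gb=20)
trace = [("perception", 8), ("grounding", 8), ("perception", 8),
         ("alignment", 8), ("grounding", 8)]
results = [cache.request(name, size) for name, size in trace]
```

In this toy trace, each miss stands for a model that would have to be loaded or fetched over the network, so a policy that raises the hit rate lowers network cost. A context-aware policy of the kind the abstract alludes to would presumably replace the recency rule with a score derived from the ongoing interaction.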