SPLG | Jul 2, 2025

Token Communication in the Era of Large Models: An Information Bottleneck-Based Approach

arXiv:2507.01728v1 | 6 citations | h-index: 19 | IEEE Wireless Communications Letters
Originality: Incremental advance
AI Analysis

This work addresses scalable and generalizable communication for multimodal understanding and generation in next-generation intelligent systems. It is an incremental advance that integrates existing methods such as the information bottleneck and Transformers.

The paper tackles inefficient token communication in large models by proposing UniToCom, a unified token communication paradigm that uses a generative information bottleneck principle to learn efficient token representations. According to the reported simulations, this improves communication efficiency and reduces computational complexity.

This letter proposes UniToCom, a unified token communication paradigm that treats tokens as the fundamental units for both processing and wireless transmission. To enable efficient token representations, we propose a generative information bottleneck (GenIB) principle, which facilitates the learning of tokens that preserve essential information while supporting reliable generation across multiple modalities. GenIB-based tokenization thereby improves communication efficiency and reduces computational complexity. Additionally, we develop $σ$-GenIB to address variance collapse in autoregressive modeling, maintaining representational diversity and stability. Moreover, we employ a causal Transformer-based multimodal large language model (MLLM) at the receiver to unify the processing of both discrete and continuous tokens under the next-token prediction paradigm. Simulation results validate the effectiveness of the proposed UniToCom and its superiority over baselines under dynamic channel conditions. By integrating token processing with MLLMs, UniToCom enables scalable and generalizable communication for multimodal understanding and generation, providing a potential solution for next-generation intelligent communications.
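
For readers unfamiliar with the information bottleneck idea the abstract builds on, the sketch below shows a generic variational bottleneck loss: a distortion term plus a beta-weighted KL "rate" term on a Gaussian latent. It is an illustration only and not the letter's GenIB or $σ$-GenIB objective, whose exact form the abstract does not give. The function names, the squared-error distortion, the standard-normal prior, and the `beta` value are all assumptions made for this example.

```python
import math

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector.

    Standard closed form; the standard-normal prior is an assumption of this sketch.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def bottleneck_loss(x, x_hat, mu, log_var, beta=1e-3):
    """Hypothetical bottleneck objective: squared-error distortion + beta * rate."""
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return distortion + beta * gaussian_kl(mu, log_var)

# Toy values (made up for illustration, not taken from the paper).
x = [0.5, -1.0, 2.0]          # source features
x_hat = [0.4, -0.9, 2.1]      # reconstruction at the receiver
mu = [0.3, -0.2]              # latent token mean
log_var = [0.0, -0.5]         # latent token log-variance

print(bottleneck_loss(x, x_hat, mu, log_var))
# The rate term grows as the latent variance shrinks toward zero.
print(gaussian_kl([0.0], [-1.0]), gaussian_kl([0.0], [-10.0]))
```

Raising `beta` trades reconstruction fidelity for a more compressed latent, which is the basic tension any bottleneck-style tokenizer has to balance.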

