LG AI | Feb 17, 2025

Connector-S: A Survey of Connectors in Multi-modal Large Language Models

Xun Zhu, Zheng Zhang, Xi Chen, Yiming Shi, Miao Li, Ji Wu

arXiv:2502.11453v1 | 16.97 citations | h-index: 5 | IJCAI

Originality: Synthesis-oriented

AI Analysis

This survey helps researchers understand and optimize connectors in multi-modal large language models (MLLMs). Its contribution is incremental: it synthesizes existing knowledge rather than introducing new methods or data.

The paper tackles the lack of comprehensive analysis of connectors in multi-modal large language models (MLLMs) by conducting a systematic survey that reviews current progress, presents a structured taxonomy, and discusses research frontiers, resulting in a foundational reference and roadmap for researchers.

With the rapid advancements in multi-modal large language models (MLLMs), connectors play a pivotal role in bridging diverse modalities and enhancing model performance. However, the design and evolution of connectors have not been comprehensively analyzed, leaving gaps in understanding how these components function and hindering the development of more powerful connectors. In this survey, we systematically review the current progress of connectors in MLLMs and present a structured taxonomy that categorizes connectors into atomic operations (mapping, compression, mixture of experts) and holistic designs (multi-layer, multi-encoder, multi-modal scenarios), highlighting their technical contributions and advancements. Furthermore, we discuss several promising research frontiers and challenges, including high-resolution input, dynamic compression, guide information selection, combination strategy, and interpretability. This survey is intended to serve as a foundational reference and a clear roadmap for researchers, providing valuable insights into the design and optimization of next-generation connectors to enhance the performance and adaptability of MLLMs.
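To make two of the atomic operations named above concrete, here is a minimal sketch of a connector that first compresses a sequence of visual tokens and then maps them to the language model's embedding width. It is an illustration under stated assumptions, not a design taken from the survey: average pooling stands in for compression, a single linear projection stands in for mapping, and the names `average_pool`, `linear_map`, and `connector`, along with all dimensions, are hypothetical.

```python
import random


def average_pool(tokens, window):
    """Compression (assumed variant): average each run of `window` tokens into one."""
    pooled = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        # Average the group feature by feature.
        pooled.append([sum(vals) / len(group) for vals in zip(*group)])
    return pooled


def linear_map(tokens, weight):
    """Mapping (assumed variant): project each token from d_in to d_out.

    `weight` is a d_in x d_out matrix stored as a list of rows.
    """
    columns = list(zip(*weight))
    return [[sum(x * w for x, w in zip(tok, col)) for col in columns]
            for tok in tokens]


def connector(tokens, weight, window):
    """Hypothetical connector: compress the token sequence, then map it."""
    return linear_map(average_pool(tokens, window), weight)


rng = random.Random(0)
# 8 visual tokens of width 4 (illustrative sizes, not from the survey).
visual_tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(8)]
# Projection from encoder width 4 to an assumed LLM width of 6.
weight = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(4)]

out = connector(visual_tokens, weight, window=2)
print(len(out), len(out[0]))  # 4 6
```

Pooling before projecting keeps the projection cheap, since it runs on half as many tokens; because both steps here are linear, swapping their order would give the same result.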
