CL CV LG
May 31, 2025

EffiVLM-BENCH: A Comprehensive Benchmark for Evaluating Training-Free Acceleration in Large Vision-Language Models

Zekun Wang, Minghua Ma, Zexin Wang, Rongchuan Mu, Liping Shan, Ming Liu, Bing Qin

arXiv:2506.00479v19.64 citations
h-index: 11
Has Code
ACL

Originality: Synthesis-oriented

AI Analysis

This work addresses the computational cost that researchers and practitioners face when deploying large vision-language models. Its contribution is incremental, since it benchmarks existing acceleration methods rather than proposing new ones.

The paper tackles the lack of comprehensive evaluation for training-free acceleration techniques in large vision-language models by introducing EffiVLM-Bench, a unified benchmark that assesses performance, generalization, and loyalty across diverse backbones and metrics, with open-sourced code to support future research.

Large Vision-Language Models (LVLMs) have achieved remarkable success, yet their significant computational demands hinder practical deployment. While efforts to improve LVLM efficiency are growing, existing methods lack comprehensive evaluation across diverse backbones, benchmarks, and metrics. In this work, we systematically evaluate mainstream acceleration techniques for LVLMs, categorized into token and parameter compression. We introduce EffiVLM-Bench, a unified framework for assessing not only absolute performance but also generalization and loyalty, while exploring Pareto-optimal trade-offs. Our extensive experiments and in-depth analyses offer insights into optimal strategies for accelerating LVLMs. We open-source code and recipes for EffiVLM-Bench to foster future research.
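
To make the notion of a Pareto-optimal trade-off concrete, the following is a minimal sketch and not code from EffiVLM-Bench. It assumes each acceleration method is summarized by two numbers, an accuracy score (higher is better) and a latency (lower is better). The method names and all values are invented placeholders for illustration, not results from the paper.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str          # hypothetical method label
    accuracy: float    # higher is better
    latency_ms: float  # lower is better


def dominates(a: Result, b: Result) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
    strictly_better = a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep only the results that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up numbers, for illustration only.
results = [
    Result("method_a", 0.70, 120.0),
    Result("method_b", 0.68, 90.0),
    Result("method_c", 0.65, 95.0),  # worse than method_b on both axes
    Result("method_d", 0.60, 60.0),
]

front = pareto_front(results)
front_names = [r.name for r in front]
print(front_names)
```

In this toy data, `method_c` is excluded because `method_b` is both more accurate and faster; the remaining three methods each represent a different accuracy/latency trade-off that none of the others strictly improves on.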
