CL · AI · LG · Jun 12, 2024

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

arXiv:2406.10290v1 · 12 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for systematic tools for testing mobile-optimized AI models, which matters to researchers and developers working on privacy-enhanced and personalized mobile applications. The contribution is incremental, since it builds on existing benchmarking concepts.

The authors address the problem of benchmarking Large Language Models (LLMs) and Large Multimodal Models (LMMs) for on-device use by introducing MobileAIBench, a framework that evaluates models across sizes, quantization levels, and tasks and measures latency and resource consumption on real devices, with the aim of accelerating mobile AI research and deployment.

The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints of mobile devices necessitate the use of models with fewer parameters and model compression techniques like quantization. Currently, there is limited understanding of quantization's impact on various task performances, including LLM tasks, LMM tasks, and, critically, trust and safety. There is a lack of adequate tools for systematically testing these models on mobile devices. To address these gaps, we introduce MobileAIBench, a comprehensive benchmarking framework for evaluating mobile-optimized LLMs and LMMs. MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices. Our two-part open-source framework includes a library for running evaluations on desktops and an iOS app for on-device latency and hardware utilization measurements. Our thorough analysis aims to accelerate mobile AI research and deployment by providing insights into the performance and feasibility of deploying LLMs and LMMs on mobile platforms.
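As a rough illustration of why quantization level can affect task performance, the sketch below applies symmetric per-tensor quantization to a synthetic weight vector and reports the reconstruction error at two bit widths. It is not MobileAIBench code and does not reproduce the quantization schemes the paper evaluates; the function names, the bit widths, and the random weights are all assumptions chosen for illustration.

```python
import random

def quantize_symmetric(weights, bits=8):
    """Map floats to signed integers using one shared scale (per-tensor).

    Hypothetical helper for illustration; real toolchains typically use
    finer-grained (block-wise or per-channel) schemes.
    """
    qmax = 2 ** (bits - 1) - 1          # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer levels."""
    return [v * scale for v in q]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
# Synthetic weights, not taken from any real model.
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]

errors = {}
for bits in (8, 4):
    q, scale = quantize_symmetric(weights, bits)
    errors[bits] = mean_abs_error(weights, dequantize(q, scale))
    print(f"{bits}-bit: mean abs reconstruction error = {errors[bits]:.6f}")
```

Fewer bits mean coarser levels and a larger reconstruction error. How much of that error shows up as degraded task accuracy or trust and safety behavior is the kind of question the benchmark is meant to answer empirically.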
