CV CL · Oct 21, 2025

The Impact of Image Resolution on Biomedical Multimodal Large Language Models

Liangyu Chen, James Burgess, Jeffrey J Nirschl, Orr Zohar, Serena Yeung-Levy

arXiv:2510.18304v16.21 citations · h-index: 19

Originality: Incremental advance

AI Analysis

This work addresses the information loss that occurs when biomedical images are analyzed at low resolution, a problem relevant to both researchers and clinicians. It offers incremental improvements for optimizing MLLMs for high-resolution medical imaging.

The study investigates how image resolution affects biomedical multimodal large language models (MLLMs). It finds that native-resolution training and inference significantly improve performance across multiple tasks, that misalignment between training and inference resolutions severely degrades performance, and that mixed-resolution training mitigates this misalignment while balancing computational constraints against performance requirements.

Imaging technologies are fundamental to biomedical research and modern medicine, requiring analysis of high-resolution images across various modalities. While multimodal large language models (MLLMs) show promise for biomedical image analysis, most are designed for low-resolution images from general-purpose datasets, risking critical information loss. We investigate how image resolution affects MLLM performance in biomedical applications and demonstrate that: (1) native-resolution training and inference significantly improve performance across multiple tasks, (2) misalignment between training and inference resolutions severely degrades performance, and (3) mixed-resolution training effectively mitigates misalignment and balances computational constraints with performance requirements. Based on these findings, we recommend prioritizing native-resolution inference and mixed-resolution datasets to optimize biomedical MLLMs for transformative impact in scientific research and clinical applications.
