CV, IV | Dec 12, 2025

Embodied Image Compression

arXiv:2512.11612v1
h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses communication constraints for embodied AI in multi-agent systems, with the aim of enabling real-time task execution. The advance is incremental, since it builds on prior Image Compression for Machines (ICM) work.

The paper tackles image compression for embodied AI agents operating in real-world environments. It introduces the EmbodiedComp benchmark and shows that existing vision-language-action models fail at simple manipulation tasks when images are compressed below a critical bitrate threshold.

Image Compression for Machines (ICM) has emerged as a pivotal research direction in the field of visual data compression. However, with the rapid evolution of machine intelligence, the target of compression has shifted from task-specific virtual models to Embodied agents operating in real-world environments. To address the communication constraints of Embodied AI in multi-agent systems and ensure real-time task execution, this paper introduces, for the first time, the scientific problem of Embodied Image Compression. We establish a standardized benchmark, EmbodiedComp, to facilitate systematic evaluation under ultra-low bitrate conditions in a closed-loop setting. Through extensive empirical studies in both simulated and real-world settings, we demonstrate that existing Vision-Language-Action models (VLAs) fail to reliably perform even simple manipulation tasks when images are compressed below the Embodied bitrate threshold. We anticipate that EmbodiedComp will catalyze the development of domain-specific compression tailored for Embodied agents, thereby accelerating the deployment of Embodied AI in the real world.
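
The abstract refers to ultra-low bitrates and to a bitrate threshold below which task success breaks down. As a rough illustration only, the sketch below shows how a frame's bitrate in bits per pixel (bpp) is computed, and one possible way to read a threshold off a set of measured (bitrate, success rate) pairs. The function names, the `min_success` parameter, and the threshold rule itself are assumptions made for this example; they are not the paper's definition or the EmbodiedComp protocol.

```python
from typing import Optional, Sequence, Tuple


def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate of one compressed frame in bits per pixel (bpp)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return 8.0 * num_bytes / (width * height)


def bitrate_threshold(
    results: Sequence[Tuple[float, float]],
    min_success: float,
) -> Optional[float]:
    """Hypothetical threshold rule (an assumption, not the paper's).

    `results` holds (bpp, success_rate) pairs from some evaluation.
    Returns the lowest measured bpp such that the success rate at that
    bitrate and at every higher measured bitrate is >= min_success,
    or None if even the highest bitrate falls short.
    """
    threshold = None
    # Walk from the highest bitrate downward and stop at the first failure.
    for bpp, success in sorted(results, reverse=True):
        if success < min_success:
            break
        threshold = bpp
    return threshold
```

For instance, a 256 x 256 frame compressed to 1024 bytes comes out at 8 * 1024 / 65536 = 0.125 bpp under this formula.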

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
