DC, AR, CV, LG, NI
Nov 22, 2025

AVERY: Adaptive VLM Split Computing through Embodied Self-Awareness for Efficient Disaster Response Systems

arXiv:2511.18151v2
3 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of enabling real-time, queryable intelligence on UAVs in low-bandwidth disaster zones. It is a novel method for a known bottleneck rather than a foundational advancement.

The paper addresses the problem of deploying Vision-Language Models (VLMs) on resource-constrained UAVs for disaster response. It introduces AVERY, an adaptive split computing framework that separates the VLM into two streams, one for real-time awareness and one for deep analysis. The reported results are 11.2% higher accuracy than raw image compression and 93.98% lower energy consumption than full-edge execution.

Unmanned Aerial Vehicles (UAVs) in disaster response require complex, queryable intelligence that on-board CNNs cannot provide. While Vision-Language Models (VLMs) offer this semantic reasoning, their high resource demands make on-device deployment infeasible, and naive cloud offloading fails under the low-bandwidth networks common in disaster zones. We present AVERY, a framework that enables VLM deployment through adaptive split computing. We advance the split computing paradigm beyond traditional depth-wise partitioning by introducing a functional, cognitive-inspired dual-stream split that separates the VLM into a high-frequency, low-resolution "context stream" for real-time awareness and a low-frequency, high-fidelity "insight stream" for deep analysis. A lightweight, self-aware on-board controller manages this architecture, monitoring network conditions and operator intent to dynamically select from pre-trained compression models, navigating the fundamental accuracy-throughput trade-off. Evaluated using the VLM LISA-7B across an edge-cloud scenario under fluctuating network conditions, AVERY consistently outperforms static configurations, achieving 11.2% higher accuracy than raw image compression and 93.98% lower energy consumption compared to full-edge execution, thereby enhancing mission efficiency and enabling real-time, queryable intelligence on resource-constrained platforms in dynamic environments.
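The controller behaviour described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the profile names, payload sizes, accuracy scores, frame-rate thresholds, and the `select_profile` function are all hypothetical, chosen only to show how a controller might trade accuracy against throughput given a bandwidth estimate and the operator's intent (context stream versus insight stream).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionProfile:
    """One pre-trained compression model (all values are made up for illustration)."""
    name: str
    payload_kbit: float   # size of one transmitted frame, in kilobits
    est_accuracy: float   # relative accuracy score in [0, 1]


# Hypothetical profiles, ordered from highest fidelity to strongest compression.
PROFILES = (
    CompressionProfile("high_fidelity", 800.0, 0.90),
    CompressionProfile("balanced", 200.0, 0.80),
    CompressionProfile("aggressive", 50.0, 0.65),
)

# Assumed minimum frame rates: the context stream is high-frequency,
# the insight stream is low-frequency.
MIN_FPS = {"context": 2.0, "insight": 0.2}


def select_profile(bandwidth_kbps, intent, profiles=PROFILES):
    """Pick the most accurate profile that still meets the stream's frame rate.

    Achievable frame rate is approximated as bandwidth / payload size.
    If no profile is fast enough, fall back to the smallest payload.
    """
    required_fps = MIN_FPS[intent]
    feasible = [
        p for p in profiles
        if bandwidth_kbps / p.payload_kbit >= required_fps
    ]
    if feasible:
        return max(feasible, key=lambda p: p.est_accuracy)
    return min(profiles, key=lambda p: p.payload_kbit)
```

Under these assumed numbers, a 1000 kbps link yields `high_fidelity` for an insight request and `balanced` for a context request, while a 50 kbps link forces the context stream down to `aggressive`. A real controller would also need a bandwidth estimator and measured accuracy figures for each compression model, neither of which is shown here.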

