RO · Apr 3

QuadAgent: A Responsive Agent System for Vision-Language Guided Quadrotor Agile Flight

Ao Zhuang, Feng Yu, Tianbao Zhang, Linzuo Zhang, Danping Zou

arXiv:2604.02786 · 86.9 · h-index: 1

AI Analysis

The work targets responsive, safe autonomous drone navigation, a problem relevant to both robotics and AI, and it proposes a new method rather than an incremental improvement.

The paper tackles the problem of agile quadrotor flight guided by vision-language inputs by introducing QuadAgent, a training-free agent system that decouples high-level reasoning from low-level control, achieving navigation in cluttered indoor spaces at speeds up to 5 m/s.

We present QuadAgent, a training-free agent system for agile quadrotor flight guided by vision-language inputs. Unlike prior end-to-end or serial agent approaches, QuadAgent decouples high-level reasoning from low-level control using an asynchronous multi-agent architecture: Foreground Workflow Agents handle active tasks and user commands, while Background Agents perform look-ahead reasoning. The system maintains scene memory via the Impression Graph, a lightweight topological map built from sparse keyframes, and ensures safe flight with a vision-based obstacle avoidance network. Simulation results show that QuadAgent outperforms baseline methods in efficiency and responsiveness. Real-world experiments demonstrate that it can interpret complex instructions, reason about its surroundings, and navigate cluttered indoor spaces at speeds up to 5 m/s.
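To make the idea of a lightweight topological map built from sparse keyframes more concrete, here is a minimal sketch in Python. It is not the paper's Impression Graph implementation: the class names `Keyframe` and `TopoMap`, the stored fields, and the use of breadth-first search for route queries are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Keyframe:
    """One sparse keyframe node (hypothetical fields, not from the paper)."""
    node_id: int
    position: Tuple[float, float, float]  # assumed metric position of the keyframe
    description: str = ""                 # assumed text summary of what was seen


@dataclass
class TopoMap:
    """Undirected graph of keyframes; edges mean 'traversable between'."""
    nodes: Dict[int, Keyframe] = field(default_factory=dict)
    edges: Dict[int, Set[int]] = field(default_factory=dict)

    def add_keyframe(self, kf: Keyframe) -> None:
        self.nodes[kf.node_id] = kf
        self.edges.setdefault(kf.node_id, set())

    def connect(self, a: int, b: int) -> None:
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both keyframes must be added before connecting")
        self.edges[a].add(b)
        self.edges[b].add(a)

    def shortest_route(self, start: int, goal: int) -> Optional[List[int]]:
        """Breadth-first search; returns node ids from start to goal, or None."""
        if start not in self.nodes or goal not in self.nodes:
            return None
        parent: Dict[int, Optional[int]] = {start: None}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            if cur == goal:
                route: List[int] = []
                node: Optional[int] = cur
                while node is not None:
                    route.append(node)
                    node = parent[node]
                return route[::-1]
            for nxt in sorted(self.edges[cur]):
                if nxt not in parent:
                    parent[nxt] = cur
                    queue.append(nxt)
        return None


def build_demo_map() -> TopoMap:
    """Tiny example: a corridor of three keyframes plus one unconnected room."""
    m = TopoMap()
    m.add_keyframe(Keyframe(0, (0.0, 0.0, 1.0), "doorway"))
    m.add_keyframe(Keyframe(1, (2.0, 0.0, 1.0), "corridor"))
    m.add_keyframe(Keyframe(2, (4.0, 1.0, 1.0), "desk with monitor"))
    m.add_keyframe(Keyframe(3, (9.0, 9.0, 1.0), "unvisited room"))
    m.connect(0, 1)
    m.connect(1, 2)
    return m
```

In a sketch like this, a reasoning agent could query `shortest_route` for a sequence of keyframes to visit, while a separate low-level controller handles the actual flight and obstacle avoidance between them. How QuadAgent itself stores and queries its map is not specified in the abstract.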
