QUEST: Query Stream for Practical Cooperative Perception
This work addresses the need for improved cooperative perception in practical applications like vehicle-infrastructure systems, representing an incremental advancement over existing paradigms.
The paper tackles the problem of cooperative perception by introducing a query cooperation paradigm that enables interpretable instance-level flexible feature interaction, and demonstrates its effectiveness on the DAIR-V2X-Seq dataset with advantages in transmission flexibility and robustness to packet dropout.
Cooperative perception can effectively enhance individual perception performance by providing additional viewpoints and expanding the sensing field. Existing cooperation paradigms are either interpretable (result cooperation) or flexible (feature cooperation). In this paper, we propose the concept of query cooperation to enable interpretable, instance-level, flexible feature interaction. To illustrate the concept concretely, we propose a cooperative perception framework, termed QUEST, which lets query streams flow among agents. Cross-agent queries interact via fusion for co-aware instances and via complementation for instances that an individual agent is unaware of. Taking camera-based vehicle-infrastructure perception as a typical practical application scenario, experimental results on the real-world dataset DAIR-V2X-Seq demonstrate the effectiveness of QUEST and further reveal the advantages of the query cooperation paradigm in transmission flexibility and robustness to packet dropout. We hope our work can further facilitate cross-agent representation interaction for better cooperative perception in practice.
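The two interaction modes named in the abstract, fusion for co-aware instances and complementation for instances seen by only one agent, can be sketched in a few lines. This is an illustrative toy, not the QUEST implementation: the `Query` container, the `cooperate` function, the distance-based matching with `match_radius`, and the feature averaging used as "fusion" are all assumptions made for this example. The framework itself presumably uses learned modules for these steps.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass
class Query:
    """Hypothetical instance-level query: a reference point plus a feature."""
    ref: Tuple[float, float]   # reference point in a shared coordinate frame
    feat: List[float]          # instance feature vector


def cooperate(ego: List[Query], remote: List[Query],
              match_radius: float = 1.0) -> List[Query]:
    """Merge remote queries into the ego query set.

    Assumption: two queries describe the same instance (co-aware) when their
    reference points lie within `match_radius` of each other.
    """
    merged = [Query(q.ref, list(q.feat)) for q in ego]
    used = set()  # indices of ego queries already matched
    for r in remote:
        best, best_d = None, match_radius
        for i, e in enumerate(ego):
            if i in used:
                continue
            d = dist(e.ref, r.ref)
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            # Complementation: the ego agent has no query for this instance.
            merged.append(Query(r.ref, list(r.feat)))
        else:
            # Fusion: placeholder element-wise mean of the two features.
            used.add(best)
            merged[best].feat = [(a + b) / 2
                                 for a, b in zip(ego[best].feat, r.feat)]
    return merged
```

Because the exchanged unit is an individual query, a lost packet only removes some entries from `remote`. Calling `cooperate(ego, [])` returns the ego queries unchanged, which is one intuition for the robustness to packet dropout claimed above.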