TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception
TraF-Align addresses inter-agent latency in multi-agent perception for autonomous vehicles, and it presents a new method rather than an incremental improvement.
The paper tackles inter-agent latency in cooperative perception for autonomous vehicles, which causes spatial and semantic misalignment when real-time data is fused with delayed data. The proposed TraF-Align framework predicts feature-level trajectories to reconstruct current-time features, and the authors report state-of-the-art performance on the V2V4Real and DAIR-V2X-Seq datasets.
Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latencies cause misalignments in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from others. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle's current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
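The idea of attending from a current-time query to features sampled along a predicted trajectory can be illustrated with a small NumPy sketch. This is not the TraF-Align implementation: the function names `bilinear_sample` and `trajectory_attention`, the tensor shapes, the use of bilinear sampling, and the single-head scaled dot-product attention are all assumptions chosen for illustration, and the trajectory points are taken as given rather than predicted by a learned network.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly sample a (C, H, W) feature map at a continuous point (x, y)."""
    _, H, W = feat.shape
    # Clamp the point to the map so out-of-range trajectories stay valid.
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * feat[:, y0, x0] + wx * feat[:, y0, x1]
    bottom = (1 - wx) * feat[:, y1, x0] + wx * feat[:, y1, x1]
    return (1 - wy) * top + wy * bottom

def trajectory_attention(query, history, points):
    """
    Hypothetical sketch: reconstruct a current-time feature from past frames.

    query:   (C,) feature of the current-time query location
    history: (T, C, H, W) past feature maps, ordered oldest to newest
    points:  (T, 2) sampling points (x, y), one per past frame, assumed to
             lie on the object's trajectory
    Returns the attended feature (C,) and the attention weights (T,).
    """
    # Gather one feature per frame along the trajectory: shape (T, C).
    sampled = np.stack(
        [bilinear_sample(f, x, y) for f, (x, y) in zip(history, points)]
    )
    # Scaled dot-product scores between the query and each sampled feature.
    scores = sampled @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()  # numerical stability for the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ sampled, weights

# Toy example: an object carrying feature [1, 0] moves one cell per frame.
history = np.zeros((3, 2, 5, 5))
for t, x in enumerate([1, 2, 3]):
    history[t, :, 2, x] = [1.0, 0.0]
points = np.array([[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]])
query = np.array([1.0, 0.0])
fused, weights = trajectory_attention(query, history, points)
```

In this toy setup the sampling points follow the object exactly, so every past frame contributes the same feature and the attention weights are uniform. Sampling at a fixed location instead would mostly hit empty cells, which is the spatial misalignment that trajectory-following sampling is meant to avoid.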