ROAD-Waymo: Action Awareness at Scale for Autonomous Driving
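This dataset enables the development and benchmarking of action-aware perception systems for autonomous vehicles, though its contribution is incremental: it adds an annotation layer to the existing Waymo Open dataset and is designed to be compatible with the original ROAD dataset.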
The paper introduces ROAD-Waymo, a large-scale dataset for autonomous driving that focuses on understanding agent actions, locations, and events in road scenes. It is built upon the Waymo Open dataset and provides 198k annotated video frames and 12.4M labels. It addresses the scarcity of datasets for training algorithms to comprehend road user actions, and its integrity is checked by a novel annotation pipeline that automatically identifies violations of dataset-specific requirements.
Autonomous Vehicle (AV) perception systems require more than simply seeing, e.g., via object detection or scene segmentation. They need a holistic understanding of what is happening within the scene for safe interaction with other road users. Few datasets exist for the purpose of developing and training algorithms to comprehend the actions of other road users. This paper presents ROAD-Waymo, an extensive dataset for the development and benchmarking of techniques for agent, action, location and event detection in road scenes, provided as a layer upon the (US) Waymo Open dataset. Considerably larger and more challenging than any existing dataset (and encompassing multiple cities), it comes with 198k annotated video frames, 54k agent tubes, 3.9M bounding boxes and a total of 12.4M labels. The integrity of the dataset has been confirmed and enhanced via a novel annotation pipeline that automatically identifies violations of requirements defined specifically for this dataset. As ROAD-Waymo is compatible with the original (UK) ROAD dataset, it provides the opportunity to tackle domain adaptation between real-world road scenarios in different countries within a novel benchmark: ROAD++.