Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge
This work addresses the problem of real-time perception for autonomous driving systems, presenting an incremental improvement in detection performance and speed.
The authors tackle real-time 2D object detection for autonomous driving with a system built on YOLOX. It achieves 41.0 streaming AP on the Argoverse-HD dataset, surpassing second place by 7.8 and 6.1 points on the detection-only and full-stack tracks, respectively, and runs at 30 FPS with TensorRT.
In this report, we introduce our real-time 2D object detection system for realistic autonomous driving scenarios. Our detector is built on a newly designed YOLO model, called YOLOX. On the Argoverse-HD dataset, our system achieves 41.0 streaming AP, surpassing second place by 7.8 and 6.1 points on the detection-only and full-stack tracks, respectively. Moreover, equipped with TensorRT, our model reaches an inference speed of 30 FPS at a high-resolution input size (e.g., 1440×2304). Code and models will be available at https://github.com/Megvii-BaseDetection/YOLOX
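To illustrate why inference speed matters for streaming AP, the following is a minimal sketch of the pairing step in streaming evaluation, where each ground-truth frame is scored against the latest prediction that has already finished by that frame's timestamp. It is not the challenge's official evaluation code: the function name `pair_streaming`, the millisecond timestamps, and the assumption that every frame starts processing the moment it arrives are all hypothetical choices made for this example.

```python
from bisect import bisect_right

def pair_streaming(gt_times_ms, pred_done_ms):
    """For each ground-truth timestamp, return the index of the latest
    prediction that finished at or before it, or None if none is ready.

    Hypothetical helper: pred_done_ms must be sorted in ascending order.
    """
    pairs = []
    for t in gt_times_ms:
        i = bisect_right(pred_done_ms, t) - 1  # last finish time <= t
        pairs.append(i if i >= 0 else None)
    return pairs

# Frame arrival times for a 30 FPS stream, rounded to whole milliseconds.
frame_times_ms = [0, 33, 67, 100, 133]

def finish_times(latency_ms):
    # Assumption: each frame starts processing on arrival (no queueing).
    return [t + latency_ms for t in frame_times_ms]

# Latency below the ~33 ms frame interval: predictions lag by one frame.
fast = pair_streaming(frame_times_ms, finish_times(30))

# Latency above the frame interval: predictions lag by two frames.
slow = pair_streaming(frame_times_ms, finish_times(40))

print(fast)
print(slow)
```

Under these assumptions, a detector that finishes within one frame interval is always scored with a prediction only one frame old, whereas a slower detector is scored with older predictions, which is the effect that streaming AP is designed to penalize.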