CV, Oct 7, 2015

Diverse Large-Scale ITS Dataset Created from Continuous Learning for Real-Time Vehicle Detection

Justin A. Eichel, Akshaya Mishra, Nicholas Miller, Nicholas Jankovic, Mohan A. Thomas, Tyler Abbott, Douglas Swanson, Joel Keller

arXiv:1510.02055v1, 1.33 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the need for high-quality, diverse datasets in traffic engineering to improve real-time vehicle detection systems. The contribution is incremental, since it builds on existing methods such as AdaBoost and Haar-like features.

The paper tackles the problem of poor real-world accuracy in vehicle detection caused by limited, non-diverse datasets. It creates a large-scale dataset using a cloud-based positive and negative mining process and a distributed learning system. The resulting detector achieves an accuracy of at least 95% for half of the time and about 78% for 19/20 of the time on approximately 7.5 million test frames.

In traffic engineering, vehicle detectors are trained on limited datasets, resulting in poor accuracy when deployed in real-world applications. Annotating large-scale, high-quality datasets is challenging. Typically, these datasets have limited diversity; they do not reflect the real-world operating environment. There is a need for a large-scale, cloud-based positive and negative mining (PNM) process and a large-scale learning and evaluation system for the application of traffic event detection. The proposed positive and negative mining process addresses the quality of crowd-sourced ground truth data through machine learning review and human feedback mechanisms. The proposed learning and evaluation system uses a distributed cloud computing framework to handle data-scaling issues associated with large numbers of samples and a high-dimensional feature space. The system is trained using AdaBoost on $1,000,000$ Haar-like features extracted from $70,000$ annotated video frames. The trained real-time vehicle detector achieves an accuracy of at least $95\%$ for $1/2$ and about $78\%$ for $19/20$ of the time when tested on approximately $7,500,000$ video frames. At the end of 2015, the dataset is expected to have over one billion annotated video frames.
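To make the training setup in the abstract concrete, the following is a minimal single-machine sketch of the two named ingredients: a Haar-like feature computed from an integral image, and AdaBoost over threshold stumps on such feature values. It is not the authors' implementation. The function names (`integral_image`, `box_sum`, `haar_two_rect`, `train_adaboost`, `predict`), the choice of a horizontal two-rectangle feature, and the exhaustive stump search are all assumptions made for illustration; the paper's system distributes this work across a cloud framework and uses far more features and samples.

```python
import numpy as np

def integral_image(img):
    # Summed-area table with a zero row and column prepended,
    # so box sums need no special handling at the image border.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r, c, h, w):
    # Sum of img[r:r+h, c:c+w] using four table lookups.
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def haar_two_rect(ii, r, c, h, w):
    # Hypothetical horizontal two-rectangle Haar-like feature:
    # sum over the left half minus sum over the right half.
    half = w // 2
    return box_sum(ii, r, c, h, half) - box_sum(ii, r, c + half, h, half)

def train_adaboost(X, y, rounds):
    # X: (n_samples, n_features) feature values; y: labels in {-1, +1}.
    n, d = X.shape
    weights = np.full(n, 1.0 / n)
    stumps = []
    for _ in range(rounds):
        best = None
        # Exhaustive search for the stump with the lowest weighted error.
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for polarity in (1, -1):
                    pred = np.where(polarity * (X[:, j] - thr) >= 0, 1, -1)
                    err = weights[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, polarity, pred)
        err, j, thr, polarity, pred = best
        # Clamp the error so alpha stays finite for a perfect stump.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        # Increase the weight of misclassified samples, then renormalize.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()
        stumps.append((alpha, j, thr, polarity))
    return stumps

def predict(stumps, X):
    # Sign of the alpha-weighted vote over all stumps.
    score = np.zeros(X.shape[0])
    for alpha, j, thr, polarity in stumps:
        score += alpha * np.where(polarity * (X[:, j] - thr) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

In a detector of this kind, each column of `X` would hold one Haar-like feature evaluated on every training window, and each boosting round effectively selects one feature. The exhaustive search above scales poorly with the number of features and samples, which is the data-scaling issue the abstract says the distributed framework is meant to handle.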
