LG, Dec 17, 2021

A data-centric weak supervised learning for highway traffic incident detection

Yixuan Sun, Tanwi Mallick, Prasanna Balaprakash, Jane Macfarlane

arXiv:2112.09792v23.116 citations

Originality: Incremental advance

AI Analysis

This work supports highway traffic management by reducing false alarms in incident detection. It is rated incremental because it builds on existing weak supervision methods.

The paper tackles the problem of high false alarm rates in highway traffic incident detection by developing a data-centric weak supervised learning workflow that generates high-quality training labels without ground truth, resulting in a high detection rate of 0.90 and a low false alarm rate of 0.08.
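To make the two reported figures concrete, here is a minimal sketch of how a detection rate and a false alarm rate could be computed from binary labels. It assumes the common confusion-matrix definitions (detection rate as TP / (TP + FN), false alarm rate as FP / (FP + TN)); the paper may define these differently, and the function name is hypothetical.

```python
def detection_and_false_alarm_rates(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate) for binary labels.

    Assumed definitions (not confirmed by the source):
      detection rate   = TP / (TP + FN)
      false alarm rate = FP / (FP + TN)
    Labels: 1 = incident, 0 = no incident.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Toy example: two true incidents (one detected), three normal
# intervals (one flagged by mistake).
dr, far = detection_and_false_alarm_rates([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
```

Under these definitions, a detector can raise its detection rate simply by flagging more intervals, which is why the two numbers are reported together.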

Using the data from loop detector sensors for near-real-time detection of traffic incidents in highways is crucial to averting major traffic congestion. While recent supervised machine learning methods offer solutions to incident detection by leveraging human-labeled incident data, the false alarm rate is often too high to be used in practice. Specifically, the inconsistency in the human labeling of the incidents significantly affects the performance of supervised learning models. To that end, we focus on a data-centric approach to improve the accuracy and reduce the false alarm rate of traffic incident detection on highways. We develop a weak supervised learning workflow to generate high-quality training labels for the incident data without the ground truth labels, and we use those generated labels in the supervised learning setup for final detection. This approach comprises three stages. First, we introduce a data preprocessing and curation pipeline that processes traffic sensor data to generate high-quality training data through leveraging labeling functions, which can be domain knowledge-related or simple heuristic rules. Second, we evaluate the training data generated by weak supervision using three supervised learning models -- random forest, k-nearest neighbors, and a support vector machine ensemble -- and long short-term memory classifiers. The results show that the accuracy of all of the models improves significantly after using the training data generated by weak supervision. Third, we develop an online real-time incident detection approach that leverages the model ensemble and the uncertainty quantification while detecting incidents. Overall, we show that our proposed weak supervised learning workflow achieves a high incident detection rate (0.90) and low false alarm rate (0.08).
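The labeling-function idea in the first stage can be illustrated with a small sketch. Everything here is an assumption for illustration: the field names (`speed_mph`, `baseline_speed_mph`, `occupancy`), the thresholds, and the three rules are hypothetical, and the votes are combined with a plain majority vote rather than whatever label model the authors use.

```python
ABSTAIN, NORMAL, INCIDENT = -1, 0, 1


# Hypothetical heuristic rules; thresholds are illustrative only.
def lf_speed_drop(sample):
    """Vote INCIDENT when speed falls well below the usual baseline."""
    if sample["speed_mph"] < 0.6 * sample["baseline_speed_mph"]:
        return INCIDENT
    return ABSTAIN


def lf_high_occupancy(sample):
    """Vote INCIDENT when detector occupancy is unusually high."""
    return INCIDENT if sample["occupancy"] > 0.30 else ABSTAIN


def lf_free_flow(sample):
    """Vote NORMAL when speed is close to the usual baseline."""
    if sample["speed_mph"] >= 0.9 * sample["baseline_speed_mph"]:
        return NORMAL
    return ABSTAIN


LABELING_FUNCTIONS = [lf_speed_drop, lf_high_occupancy, lf_free_flow]


def weak_label(sample):
    """Combine labeling-function votes by simple majority.

    Returns ABSTAIN when no function votes or when votes are tied.
    """
    votes = [lf(sample) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    incident_votes = sum(1 for v in votes if v == INCIDENT)
    normal_votes = len(votes) - incident_votes
    if incident_votes == normal_votes:
        return ABSTAIN
    return INCIDENT if incident_votes > normal_votes else NORMAL


congested = {"speed_mph": 30, "baseline_speed_mph": 65, "occupancy": 0.40}
free_flow = {"speed_mph": 64, "baseline_speed_mph": 65, "occupancy": 0.05}
labels = [weak_label(congested), weak_label(free_flow)]
```

Samples on which every rule abstains would typically be left out of the generated training set, so the downstream classifiers only see records with at least one heuristic vote.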
