Context-Aware Automated Passenger Counting Data Denoising
This work addresses data quality issues for public transport operators and authorities, enabling better ridership estimation and network optimization. It is incremental, however, since it builds on existing denoising techniques and adds further constraints.
The paper tackles the problem of noisy Automatic Passenger Counting (APC) data in public transportation by proposing a denoising algorithm based on constrained integer linear optimization, which improves robustness and accuracy compared to other methods as assessed on real and simulated data from French networks.
Reliable and accurate knowledge of ridership in public transportation networks is crucial for public transport operators and public authorities to understand how their network is used and to optimize the transport offering. Several techniques for estimating ridership exist today, some of them automated. Among them, Automatic Passenger Counting (APC) systems detect passengers entering and leaving the vehicle at each station along its course. However, the data produced by these systems are often noisy or even biased, resulting in under- or overestimation of onboard occupancy. In this work, we propose a denoising algorithm for APC data to improve their robustness and ease their analysis. The proposed approach consists of a constrained integer linear optimization that takes advantage of ticketing data and historical ridership data to further constrain and guide the optimization. Its performance is assessed and compared to other denoising methods on several public transportation networks in France, against manual counts available on one of these networks, and on simulated data.
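To make the idea of constrained integer denoising concrete, here is a minimal sketch for a single vehicle trip. It is not the authors' formulation: the objective (total absolute deviation from the raw counts), the constraints (non-negative onboard load, an empty vehicle at the end of the trip, corrected boardings at least equal to the ticket validations, a capacity cap) and the names `denoise_apc`, `min_boardings`, and `capacity` are all assumptions made for illustration. Historical ridership data is not modeled. Instead of a generic ILP solver, this toy version solves the small integer program exactly by dynamic programming over the onboard load, which is only practical for small capacities.

```python
from typing import List, Tuple


def denoise_apc(
    boardings: List[int],
    alightings: List[int],
    min_boardings: List[int],
    capacity: int,
) -> Tuple[int, List[int], List[int]]:
    """Correct raw APC counts for one trip (illustrative assumptions only).

    Minimizes sum(|x_i - boardings_i| + |y_i - alightings_i|) over integer
    corrected boardings x_i and alightings y_i, subject to:
      * x_i >= min_boardings_i (e.g. ticket validations at stop i),
      * y_i <= load before stop i (nobody alights from an empty vehicle),
      * 0 <= load <= capacity after every stop,
      * load == 0 after the last stop.
    Returns (cost, corrected_boardings, corrected_alightings).
    """
    # best[load] = (cost, [(x_0, y_0), ...]) for the cheapest way to have
    # `load` passengers on board after the stops processed so far.
    best = {0: (0, [])}
    for i in range(len(boardings)):
        nxt = {}
        for load, (cost, path) in best.items():
            for y in range(load + 1):
                remaining = load - y
                for x in range(min_boardings[i], capacity - remaining + 1):
                    new_load = remaining + x
                    c = cost + abs(x - boardings[i]) + abs(y - alightings[i])
                    if new_load not in nxt or c < nxt[new_load][0]:
                        nxt[new_load] = (c, path + [(x, y)])
        best = nxt
    if 0 not in best:
        raise ValueError("no feasible correction under the given constraints")
    cost, path = best[0]
    return cost, [x for x, _ in path], [y for _, y in path]


if __name__ == "__main__":
    # Raw counts leave -2 passengers on board at the end of the trip, and
    # stop 2 reports no boardings although one ticket was validated there.
    cost, x, y = denoise_apc(
        boardings=[5, 3, 0, 0],
        alightings=[0, 2, 4, 4],
        min_boardings=[4, 3, 1, 0],
        capacity=10,
    )
    print(cost)  # 2
    print(x, y)  # one of several equally cheap corrections
```

Several corrections can share the minimal cost, so the sketch returns whichever one it finds first. A real formulation would need additional terms, such as guidance from historical ridership, to choose among them.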