CY LG SOC-PH

Jun 12, 2020

Modeling bike availability in a bike-sharing system using machine learning

Huthaifa I. Ashqar, Mohammed Elhenawy, Mohammed H. Almannaa, Ahmed Ghanem, Hesham A. Rakha, Leanna House

arXiv:2006.08352v1

60 citations

AI Analysis

This work is an incremental improvement that can help bike-sharing system operators optimize resource allocation.

This paper predicts bike availability in the San Francisco Bay Area bike-sharing system using machine learning. Univariate models (Random Forest and LSBoost) had lower prediction errors than a multivariate PLSR model, and a 15-minute prediction horizon yielded the least error.

This paper models the availability of bikes at San Francisco Bay Area Bike Share stations using machine learning algorithms. Random Forest (RF) and Least-Squares Boosting (LSBoost) were used as univariate regression algorithms, and Partial Least-Squares Regression (PLSR) was applied as a multivariate regression algorithm. The univariate models were used to model the number of available bikes at each station. PLSR was applied to reduce the number of required prediction models and to reflect the spatial correlation between stations in the network. Results clearly show that the univariate models have lower prediction errors than the multivariate model. However, the multivariate model's results are reasonable for networks with a relatively large number of spatially correlated stations. Results also show that station neighbors and the prediction horizon are significant predictors. The prediction horizon that produced the least prediction error was 15 minutes.
