LG Sep 25, 2025

Downscaling human mobility data based on demographic socioeconomic and commuting characteristics using interpretable machine learning methods

Yuqin Jiang, Andrey A. Popov, Tianle Duan, Qingchun Li

arXiv:2509.21703v14.1h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work supports urban planning and transportation services by providing a method for analyzing mobility patterns at finer spatial scales. The contribution is incremental: it applies existing machine learning methods to a specific dataset.

The study tackles the problem of downscaling origin-destination taxi trip flows in New York City from larger to smaller spatial units by modeling how trips relate to demographic, socioeconomic, and commuting characteristics. Neural networks performed best on the training and testing datasets, while support vector machines generalized best when downscaling.
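The downscaling workflow summarized above can be sketched roughly as follows: fit a model on OD pairs at the coarse spatial unit, then apply it unchanged to OD pairs at the fine unit and score it against observed fine-level flows. This is only an illustration under loud assumptions. The data are synthetic, the two features (`origin population`, `destination jobs`) are hypothetical stand-ins for the study's covariates, and a plain least-squares fit is swapped in for the four models the paper compares.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_pairs(n, scale):
    """Synthetic OD pairs (assumption, not the paper's data).

    Columns are hypothetical features: origin population, destination jobs.
    Flows follow a noisy linear rule so the example stays easy to check.
    """
    X = rng.uniform(0.5, 1.5, size=(n, 2)) * scale
    y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(0.0, 0.1, size=n)
    return X, y

# Coarse units have larger feature magnitudes than fine units.
X_coarse, y_coarse = make_pairs(50, 10.0)
X_fine, y_fine = make_pairs(200, 1.0)

def add_intercept(X):
    """Append a column of ones so the fit includes an intercept."""
    return np.column_stack([X, np.ones(len(X))])

# Step 1: fit on the coarse spatial unit.
coef, *_ = np.linalg.lstsq(add_intercept(X_coarse), y_coarse, rcond=None)

# Step 2: apply the coarse-level model to the fine spatial unit.
pred_fine = add_intercept(X_fine) @ coef

# Step 3: score the downscaled flows against fine-level observations.
rmse = float(np.sqrt(np.mean((pred_fine - y_fine) ** 2)))
print(f"downscaling RMSE: {rmse:.3f}")
```

The transfer works well here only because the synthetic flows follow the same linear rule at both scales. With real data, the fine-level feature ranges fall outside what the model saw during training, which is why generalization at the fine unit is scored separately from train/test accuracy.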

Understanding urban human mobility patterns at various spatial levels is essential for social science. This study presents a machine learning framework to downscale origin-destination (OD) taxi trip flows in New York City from a larger spatial unit to a smaller spatial unit. First, correlations between OD trips and demographic, socioeconomic, and commuting characteristics are developed using four models: Linear Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Neural Networks (NN). Second, a perturbation-based sensitivity analysis is applied to interpret variable importance for nonlinear models. The results show that the linear regression model failed to capture the complex variable interactions. While NN performs best with the training and testing datasets, SVM shows the best generalization ability in downscaling performance. The methodology presented in this study provides both analytical advancement and practical applications to improve transportation services and urban development.
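A perturbation-based sensitivity analysis of the kind mentioned in the abstract can be sketched as below. The abstract does not specify the perturbation scheme, so this is one common variant and an assumption on our part: scale each input feature by a fixed fraction, one at a time, and record the mean absolute change in the model's predictions. The function name `perturbation_sensitivity`, the `delta` parameter, and the toy model are all hypothetical.

```python
import numpy as np

def perturbation_sensitivity(predict, X, delta=0.1):
    """Score each feature by how much scaling it by (1 + delta) moves predictions.

    `predict` is any callable mapping an (n, d) array to n predictions,
    so the same routine applies to RF, SVM, or NN models alike.
    """
    X = np.asarray(X, dtype=float)
    baseline = predict(X)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_pert = X.copy()
        X_pert[:, j] *= 1.0 + delta  # perturb one feature, hold the rest fixed
        scores[j] = np.mean(np.abs(predict(X_pert) - baseline))
    return scores

# Synthetic inputs (assumption): three features drawn from [1, 2).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(200, 3))

def toy_model(X):
    """Hypothetical nonlinear model that ignores the third feature."""
    return 5.0 * X[:, 0] + X[:, 1] ** 2

scores = perturbation_sensitivity(toy_model, X)
ranking = np.argsort(scores)[::-1]  # most influential feature first
print(scores, ranking)
```

Because the routine only calls `predict`, it treats the model as a black box, which is what makes this style of analysis usable for nonlinear models that have no directly readable coefficients.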
