gtfs2vec -- Learning GTFS Embeddings for comparing Public Transport Offer in Microregions
This work provides a method for urban planners and policymakers to analyze and compare public transport schedules in microregions, though it is incremental as it applies existing techniques to a new domain.
The researchers tackled the problem of comparing public transport availability across microregions by developing GTFS embeddings from timetables in 48 European cities, using an auto-associative neural network and hierarchical clustering to identify similar regions based on transport characteristics.
We selected 48 European cities and gathered their public transport timetables in the GTFS format. We used Uber's H3 spatial index to divide each city into hexagonal micro-regions. From the timetable data, we derived features describing the quantity and variety of public transport available in each region. Next, we trained an auto-associative deep neural network to embed each region. With these representations, we applied hierarchical clustering to identify similar regions: an agglomerative algorithm using Euclidean distance between regions and Ward's method to minimize within-cluster variance. Finally, we analyzed the resulting clusters at different levels to identify a set of clusters that qualitatively describes public transport availability. We showed that our typology matches the characteristics of the analyzed cities and allows successful searching for areas with similar public transport schedule characteristics.
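
The clustering step can be illustrated with a minimal sketch. It is not the authors' implementation: it is a naive pure-Python version of agglomerative clustering with Ward's criterion, in which each step merges the pair of clusters whose union increases the within-cluster sum of squared Euclidean distances the least. The function names, region identifiers, and two-dimensional embeddings are all hypothetical and chosen only for illustration; real embeddings would come from the trained network.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(embeddings, n_clusters):
    """Greedy agglomerative clustering; returns clusters as lists of region ids."""
    clusters = [[region] for region in embeddings]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters that is cheapest to merge (i < j always holds).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: ward_cost(
                [embeddings[r] for r in clusters[p[0]]],
                [embeddings[r] for r in clusters[p[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical 2-D embeddings keyed by made-up region ids (assumption, not paper data).
toy_embeddings = {
    "region_a": (0.0, 0.1),
    "region_b": (0.1, 0.0),
    "region_c": (5.0, 5.1),
    "region_d": (5.1, 5.0),
    "region_e": (9.0, 0.0),
}
result = ward_clustering(toy_embeddings, n_clusters=3)
print(result)
```

Stopping the merge loop at different values of `n_clusters` corresponds to cutting the hierarchy at different levels. This naive version recomputes every pairwise cost at each step, so it suits only small examples; a library routine for Ward linkage would be the practical choice for thousands of regions.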