GN · May 3, 2022
Open vs Closed-ended questions in attitudinal surveys -- comparing, combining, and interpreting using natural language processing
Vishnu Baburajan, João de Abreu e Silva, Francisco Camara Pereira · mit
To improve the traveling experience, researchers have been analyzing the role of attitudes in travel behavior modeling. Although most researchers use closed-ended surveys, the appropriate method to measure attitudes is debatable. Topic Modeling could significantly reduce the time to extract information from open-ended responses and eliminate subjective bias, thereby alleviating analyst concerns. Our research uses Topic Modeling to extract information from open-ended questions and compares its performance with closed-ended responses. Furthermore, some respondents might prefer one questionnaire type over the other. So, we propose a modeling framework that allows respondents to use their preferred questionnaire type to answer the survey and enables analysts to use the modeling frameworks of their choice to predict behavior. We demonstrate this using a dataset collected from the USA that measures the intention to use Autonomous Vehicles for commute trips. Respondents were presented with alternative questionnaire versions (open- and closed-ended). Since our objective was also to compare the performance of alternative questionnaire versions, the survey was designed to eliminate influences resulting from statements, behavioral framework, and the choice experiment. Results indicate the suitability of using Topic Modeling to extract information from open-ended responses; however, the models estimated using the closed-ended questions perform better than those estimated using the open-ended responses. In addition, the proposed model performs better than the models currently in use. Furthermore, our proposed framework will allow respondents to choose the questionnaire type to answer, which could be particularly beneficial to them when using voice-based surveys.
EM · Feb 20, 2023
Attitudes and Latent Class Choice Models using Machine learning
Lorena Torres Lahoz, Francisco Camara Pereira, Georges Sfeir et al.
Latent Class Choice Models (LCCM) are extensions of discrete choice models (DCMs) that capture unobserved heterogeneity in the choice process by segmenting the population based on the assumption of preference similarities. We present a method of efficiently incorporating attitudinal indicators in the specification of LCCM, by introducing Artificial Neural Networks (ANN) to formulate latent variable constructs. This formulation improves on structural equations in its ability to explore the relationship between the attitudinal indicators and the decision choice, given the flexibility and power of Machine Learning (ML) in capturing unobserved and complex behavioural features, such as attitudes and beliefs. It does so while still maintaining the consistency of the theoretical assumptions presented in the Generalized Random Utility model and the interpretability of the estimated parameters. We test our proposed framework for estimating a Car-Sharing (CS) service subscription choice with stated preference data from Copenhagen, Denmark. The results show that our proposed approach provides a complete and realistic segmentation, which helps design better policies.
LG · Mar 17, 2022
Transfer learning for cross-modal demand prediction of bike-share and public transit
Mingzhuang Hua, Francisco Camara Pereira, Yu Jiang et al.
The urban transportation system is a combination of multiple transport modes, and interdependencies exist across those modes. This means that the travel demand across different travel modes could be correlated, as one mode may receive demand from or create demand for another mode, not to mention natural correlations between different demand time series due to general demand flow patterns across the network. Cross-modal ripple effects can be expected to become more prevalent with Mobility as a Service. Therefore, by propagating demand data across modes, a better demand prediction could be obtained. To this end, this study explores various machine learning models and transfer learning strategies for cross-modal demand prediction. The trip data of bike-share, metro, and taxi are processed as station-level passenger flows, and then the proposed prediction method is tested in large-scale case studies of Nanjing and Chicago. The results suggest that prediction models with transfer learning perform better than unimodal prediction models. Furthermore, the stacked Long Short-Term Memory model performs particularly well in cross-modal demand prediction. These results verify our combined method's forecasting improvement over existing benchmarks and demonstrate good transferability for cross-modal demand prediction in multiple cities.
SY · Feb 28, 2023
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning
Carolin Schmidt, Daniele Gammelli, Francisco Camara Pereira et al.
Autonomous Mobility-on-Demand (AMoD) systems are an evolving mode of transportation in which a centrally coordinated fleet of self-driving vehicles dynamically serves travel requests. The control of these systems is typically formulated as a large network optimization problem, and reinforcement learning (RL) has recently emerged as a promising approach to solve the open challenges in this space. Recent centralized RL approaches focus on learning from online data, ignoring the per-sample cost of interactions within real-world transportation systems. To address these limitations, we propose to formalize the control of AMoD systems through the lens of offline reinforcement learning and learn effective control strategies using solely offline data, which is readily available to current mobility operators. We further investigate design decisions and provide empirical evidence based on data from real-world mobility systems showing how offline learning makes it possible to recover AMoD control policies that (i) exhibit performance on par with online methods, (ii) allow for sample-efficient online fine-tuning and (iii) eliminate the need for complex simulation environments. Crucially, this paper demonstrates that offline RL is a promising paradigm for the application of RL-based solutions within economically critical systems, such as mobility systems.
CL · Mar 19, 2025
A Foundational individual Mobility Prediction Model based on Open-Source Large Language Models
Zhenlin Qin, Leizhen Wang, Francisco Camara Pereira et al.
Large Language Models (LLMs) are widely applied to domain-specific tasks due to their massive general knowledge and remarkable inference capacities. Current studies have shown immense potential in applying LLMs to individual mobility prediction problems. However, most LLM-based mobility prediction models only train on specific datasets or use single well-designed prompts, leading to difficulty in adapting to different cities and users with diverse contexts. To fill these gaps, this paper proposes a unified fine-tuning framework to train a foundational open-source LLM-based mobility prediction model. We conducted extensive experiments on six real-world mobility datasets to validate the proposed model. The results showed that the proposed model achieved the best performance in prediction accuracy and transferability over state-of-the-art models based on deep learning and LLMs.
CL · Oct 10, 2025
Domain-Adapted Pre-trained Language Models for Implicit Information Extraction in Crash Narratives
Xixi Wang, Jordanka Kovaceva, Miguel Costa et al.
Free-text crash narratives recorded in real-world crash databases have been shown to play a significant role in improving traffic safety. However, large-scale analyses remain difficult to implement as there are no documented tools that can batch process the unstructured, non-standardized text content written by various authors with diverse experience and attention to detail. In recent years, Transformer-based pre-trained language models (PLMs), such as Bidirectional Encoder Representations from Transformers (BERT) and large language models (LLMs), have demonstrated strong capabilities across various natural language processing tasks. These models can extract explicit facts from crash narratives, but their performance declines on inference-heavy tasks such as Crash Type identification, which can involve nearly 100 categories. Moreover, relying on closed LLMs through external APIs raises privacy concerns for sensitive crash data. Additionally, these black-box tools often underperform due to limited domain knowledge. Motivated by these challenges, we study whether compact open-source PLMs can support reasoning-intensive extraction from crash narratives. We target two challenging objectives: 1) identifying the Manner of Collision for a crash, and 2) identifying the Crash Type for each vehicle involved in the crash event from real-world crash narratives. To bridge domain gaps, we apply fine-tuning techniques to inject task-specific knowledge into LLMs with Low-Rank Adaptation (LoRA) and BERT. Experiments on the authoritative real-world dataset Crash Investigation Sampling System (CISS) demonstrate that our fine-tuned compact models outperform strong closed LLMs, such as GPT-4o, while requiring only minimal training resources. Further analysis reveals that the fine-tuned PLMs can capture richer narrative details and even correct some mislabeled annotations in the dataset.
LG · Oct 9, 2025
Climate Surrogates for Scalable Multi-Agent Reinforcement Learning: A Case Study with CICERO-SCM
Oskar Bohn Lassen, Serio Angelo Maria Agriesti, Filipe Rodrigues et al.
Climate policy studies require models that capture the combined effects of multiple greenhouse gases on global temperature, but these models are computationally expensive and difficult to embed in reinforcement learning. We present a multi-agent reinforcement learning (MARL) framework that integrates a high-fidelity, highly efficient climate surrogate directly in the environment loop, enabling regional agents to learn climate policies under multi-gas dynamics. As a proof of concept, we introduce a recurrent neural network architecture pretrained on $20{,}000$ multi-gas emission pathways to surrogate the climate model CICERO-SCM. The surrogate model attains near-simulator accuracy with global-mean temperature RMSE $\approx 0.0004\,\mathrm{K}$ and approximately $1000\times$ faster one-step inference. When substituted for the original simulator in a climate-policy MARL setting, it accelerates end-to-end training by $>100\times$. We show that the surrogate and simulator converge to the same optimal policies and propose a methodology to assess this property in cases where using the simulator is intractable. Our work makes it possible to bypass the core computational bottleneck without sacrificing policy fidelity, enabling large-scale multi-agent experiments across alternative climate-policy regimes with multi-gas dynamics and high-fidelity climate response.
LG · Aug 19, 2025
Learning to Learn the Macroscopic Fundamental Diagram using Physics-Informed and meta Machine Learning techniques
Amalie Roark, Serio Agriesti, Francisco Camara Pereira et al.
The Macroscopic Fundamental Diagram (MFD) is a popular tool used to describe traffic dynamics in an aggregated way, with applications ranging from traffic control to incident analysis. However, estimating the MFD for a given network requires large numbers of loop detectors, which are not always available in practice. This article proposes a framework harnessing meta-learning, a subcategory of machine learning that trains models to understand and adapt to new tasks on their own, to alleviate the data scarcity challenge. The developed model is trained and tested by leveraging data from multiple cities and exploiting it to model the MFD of other cities with different shares of detectors and topological structures. The proposed meta-learning framework is applied to an ad-hoc Multi-Task Physics-Informed Neural Network, specifically designed to estimate the MFD. Results show an average MSE improvement in flow prediction ranging between ~17500 and 36000 (depending on the subset of loop detectors tested). The meta-learning framework thus successfully generalizes across diverse urban settings and improves performance on cities with limited data, demonstrating the potential of using meta-learning when a limited number of detectors is available. Finally, the proposed framework is validated against traditional transfer learning approaches and tested with FitFun, a non-parametric model from the literature, to prove its transferability.
CY · Apr 3, 2025
Scenario Discovery for Urban Planning: The Case of Green Urbanism and the Impact on Stress
Lorena Torres Lahoz, Carlos Lima Azevedo, Leonardo Ancora et al.
Urban environments significantly influence mental health outcomes, yet the role of an effective framework for decision-making under deep uncertainty (DMDU) for optimizing urban policies for stress reduction remains underexplored. While existing research has demonstrated the effects of urban design on mental health, there is a lack of systematic scenario-based analysis to guide urban planning decisions. This study addresses this gap by applying Scenario Discovery (SD) in urban planning to evaluate the effectiveness of urban vegetation interventions in stress reduction across different urban environments using a predictive model based on emotional responses collected from a neuroscience-based outdoor experiment in Lisbon. Combining these insights with detailed urban data from Copenhagen, we identify key intervention thresholds where vegetation-based solutions succeed or fail in mitigating stress responses. Our findings reveal that while increased vegetation generally correlates with lower stress levels, high-density urban environments, crowding, and individual psychological traits (e.g., extraversion) can reduce its effectiveness. This work showcases our Scenario Discovery framework as a systematic approach for identifying robust policy pathways in urban planning, opening the door for its exploration in other urban decision-making contexts where uncertainty and design resiliency are critical.
OC · Jul 28, 2021
Predictive and Prescriptive Performance of Bike-Sharing Demand Forecasts for Inventory Management
Daniele Gammelli, Yihua Wang, Dennis Prak et al.
Bike-sharing systems are a rapidly developing mode of transportation and provide an efficient alternative to passive, motorized personal mobility. The asymmetric nature of bike demand creates the need to rebalance bike stations, which is typically done at night. To determine the optimal starting inventory level of a station for a given day, a User Dissatisfaction Function (UDF) models user pickups and returns as non-homogeneous Poisson processes with piece-wise linear rates. In this paper, we devise a deep generative model directly applicable in the UDF by introducing a variational Poisson recurrent neural network model (VP-RNN) to forecast future pickup and return rates. We empirically evaluate our approach against both traditional and learning-based forecasting methods on real trip data from the city of New York, USA, and show how our model outperforms benchmarks in terms of system efficiency and demand satisfaction. By explicitly focusing on the combination of decision-making algorithms with learning-based forecasting methods, we highlight a number of shortcomings in the literature. Crucially, we show how more accurate predictions do not necessarily translate into better inventory decisions. By providing insights into the interplay between forecasts, model assumptions, and decisions, we point out that forecasts and decision models should be carefully evaluated and harmonized to optimally control shared mobility systems.
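The abstract above models pickups and returns as non-homogeneous Poisson processes with piece-wise linear rates. As a rough illustration only (this is not the paper's UDF or the VP-RNN), the sketch below computes the expected number of events in a time window, which for a Poisson process is the integral of the rate. The function names `rate_at` and `expected_events`, and the knot/rate values, are hypothetical and chosen for this example.

```python
from bisect import bisect_right


def rate_at(knots, rates, t):
    """Linearly interpolate the rate at time t (clamped outside the knot range)."""
    if t <= knots[0]:
        return rates[0]
    if t >= knots[-1]:
        return rates[-1]
    i = bisect_right(knots, t) - 1
    w = (t - knots[i]) / (knots[i + 1] - knots[i])
    return rates[i] + w * (rates[i + 1] - rates[i])


def expected_events(knots, rates, t0, t1):
    """Expected event count on [t0, t1]: the integral of the rate function.

    The rate is linear between knots, so the trapezoid rule applied on each
    sub-interval is exact for windows inside the knot range.
    """
    if t1 < t0:
        raise ValueError("t1 must not be smaller than t0")
    # Split the window at every knot strictly inside it.
    points = [t0] + [k for k in knots if t0 < k < t1] + [t1]
    total = 0.0
    for a, b in zip(points, points[1:]):
        total += 0.5 * (rate_at(knots, rates, a) + rate_at(knots, rates, b)) * (b - a)
    return total


# Hypothetical pickup rate (events per hour) that ramps up and back down.
knots = [0.0, 1.0, 2.0]
rates = [0.0, 2.0, 0.0]
print(expected_events(knots, rates, 0.0, 2.0))
```

The expected count is also the Poisson mean for that window, so it can feed directly into a probability of running out of bikes or docks under whatever inventory rule one assumes.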
EM · Jul 6, 2020
Semi-nonparametric Latent Class Choice Model with a Flexible Class Membership Component: A Mixture Model Approach
Georges Sfeir, Maya Abou-Zeid, Filipe Rodrigues et al.
This study presents a semi-nonparametric Latent Class Choice Model (LCCM) with a flexible class membership component. The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification, with the aim of comparing the two approaches on various measures including prediction accuracy and representation of heterogeneity in the choice process. Mixture models are parametric model-based clustering techniques that have been widely used in areas such as machine learning, data mining and pattern recognition for clustering and classification problems. An Expectation-Maximization (EM) algorithm is derived for the estimation of the proposed model. Using two different case studies on travel mode choice behavior, the proposed model is compared to traditional discrete choice models on the basis of parameter estimates' signs, value of time, statistical goodness-of-fit measures, and cross-validation tests. Results show that mixture models improve the overall performance of latent class choice models by providing better out-of-sample prediction accuracy in addition to better representations of heterogeneity without weakening the behavioral and economic interpretability of the choice models.
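The abstract above mentions mixture models estimated with an Expectation-Maximization (EM) algorithm. The sketch below shows the generic EM recipe on a one-dimensional Gaussian mixture. It is a textbook illustration, not the estimator derived in the paper, which couples the mixture with class-specific choice models. The function names, the toy data, and the initial means are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_floor, spread)  # floor avoids a collapsed component
    return weights, means, variances


# Toy data: two well-separated clusters around 0 and 5 (made up for illustration).
data = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
weights, means, variances = em_gmm_1d(data, init_means=[1.0, 4.0])
print(weights, means, variances)
```

In a latent class setting, the responsibilities computed in the E-step play the role of posterior class membership probabilities, which is why EM is a natural fit for this family of models.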
EM · Feb 3, 2020
A Neural-embedded Choice Model: TasteNet-MNL Modeling Taste Heterogeneity with Flexibility and Interpretability
Yafei Han, Francisco Camara Pereira, Moshe Ben-Akiva et al.
Discrete choice models (DCMs) require a priori knowledge of the utility functions, especially how tastes vary across individuals. Utility misspecification may lead to biased estimates, inaccurate interpretations and limited predictability. In this paper, we utilize a neural network to learn taste representation. Our formulation consists of two modules: a neural network (TasteNet) that learns taste parameters (e.g., time coefficient) as flexible functions of individual characteristics; and a multinomial logit (MNL) model with utility functions defined with expert knowledge. Taste parameters learned by the neural network are fed into the choice model and link the two modules. Our approach extends the L-MNL model (Sifringer et al., 2020) by allowing the neural network to learn the interactions between individual characteristics and alternative attributes. Moreover, we formalize and strengthen the interpretability condition, requiring realistic estimates of behavior indicators (e.g., value-of-time, elasticity) at the disaggregated level, which is crucial for a model to be suitable for scenario analysis and policy decisions. Through a unique network architecture and parameter transformation, we incorporate prior knowledge and guide the neural network to output realistic behavior indicators at the disaggregated level. We show that TasteNet-MNL reaches the ground-truth model's predictability and recovers the nonlinear taste functions on synthetic data. Its estimated value-of-time and choice elasticities at the individual level are close to the ground truth. On a publicly available Swissmetro dataset, TasteNet-MNL outperforms benchmark MNL and Mixed Logit models in predictability. It learns a broader spectrum of taste variations within the population and suggests a higher average value-of-time.
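The two-module idea in the abstract above (a taste function feeding an MNL) can be sketched in a few lines. This is a toy illustration under strong assumptions, not the TasteNet-MNL architecture: `taste_time_coefficient` is a hand-written stand-in for a learned network, the negative-softplus transformation is one assumed way to keep the time coefficient negative, and the cost coefficient, attribute names and numbers are hypothetical.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities (softmax of utilities)."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def taste_time_coefficient(income):
    """Hypothetical stand-in for a learned taste function of one characteristic.

    Negative softplus keeps the time coefficient strictly negative, so more
    travel time always lowers utility (an assumed sign constraint).
    """
    return -math.log1p(math.exp(0.5 * income))


def choice_probabilities(alternatives, income, beta_cost=-1.0):
    """Plug the individual-specific time coefficient into linear MNL utilities."""
    beta_time = taste_time_coefficient(income)
    utilities = [beta_time * a["time"] + beta_cost * a["cost"] for a in alternatives]
    return mnl_probabilities(utilities)


def value_of_time(income, beta_cost=-1.0):
    """Value of time as the ratio of the time and cost coefficients."""
    return taste_time_coefficient(income) / beta_cost


# Two hypothetical alternatives: faster but pricier vs. slower but cheaper.
alternatives = [{"time": 0.5, "cost": 3.0}, {"time": 1.0, "cost": 2.0}]
print(choice_probabilities(alternatives, income=1.0))
print(value_of_time(1.0), value_of_time(2.0))
```

Keeping the utility linear in the attributes is what lets indicators such as value-of-time be read off as coefficient ratios at the individual level, while the taste function carries the heterogeneity.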
ML · Mar 7, 2019
Multi-output Bus Travel Time Prediction with Convolutional LSTM Neural Network
Niklas Christoffer Petersen, Filipe Rodrigues, Francisco Camara Pereira
Accurate and reliable travel time predictions in public transport networks are essential for delivering an attractive service that is able to compete with other modes of transport in urban areas. The traditional application of this information, where arrival and departure predictions are displayed on digital boards, is highly visible in the city landscape of most modern metropolises. More recently, the same information has become critical as input for smart-phone trip planners in order to alert passengers about unreachable connections, alternative route choices and prolonged travel times. More sophisticated Intelligent Transport Systems (ITS) include the predictions of connection assurance, i.e. to hold back services in case a connecting service is delayed. In order to operate such systems, and to ensure the confidence of passengers in the systems, the information provided must be accurate and reliable. Traditional methods have trouble with this as congestion, and thus travel time variability, increases in cities, consequently making travel time predictions in urban areas a non-trivial task. This paper presents a system for bus travel time prediction that leverages the non-static spatio-temporal correlations present in urban bus networks, allowing the discovery of complex patterns not captured by traditional methods. The underlying model is a multi-output, multi-time-step, deep neural network that uses a combination of convolutional and long short-term memory (LSTM) layers. The method is empirically evaluated and compared to other popular approaches for link travel time prediction and currently available services, including the currently deployed model in Copenhagen, Denmark. We find that the proposed model significantly outperforms all the other methods we compare with, and is able to detect small irregular peaks in bus travel times very quickly.