Deep Reinforcement Learning for Day-to-day Dynamic Tolling in Tradable Credit Schemes
This paper addresses the control-mechanism challenge in designing tradable credit schemes for transportation management, though its contribution appears incremental, since it applies existing RL methods to this specific problem.
The paper tackles the day-to-day dynamic tolling problem in tradable credit schemes by formulating it as a Markov Decision Process and solving it with reinforcement learning algorithms. The resulting policies achieve travel times and social welfare comparable to a Bayesian optimization benchmark and generalize across capacities and demand levels.
Tradable credit schemes (TCS) are an increasingly studied alternative to congestion pricing, given their revenue neutrality and ability to address issues of equity through the initial credit allocation. Modeling TCS to aid future design and implementation poses challenges involving user and market behaviors, demand-supply dynamics, and control mechanisms. In this paper, we focus on the last of these and address the day-to-day dynamic tolling problem under TCS, which is formulated as a discrete-time Markov Decision Process and solved using reinforcement learning (RL) algorithms. Our results indicate that RL algorithms achieve travel times and social welfare comparable to the Bayesian optimization benchmark, with generalization across varying capacities and demand levels. We further assess the robustness of RL under different hyperparameters and apply regularization techniques to mitigate action oscillation, yielding practical tolling strategies that are transferable under day-to-day demand and supply variability. Finally, we discuss potential challenges such as scaling to large networks, and show how transfer learning can be leveraged to improve computational efficiency and facilitate the practical deployment of RL-based TCS solutions.
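To make the MDP framing concrete, the following is a minimal toy sketch, not the paper's model. It assumes a single congested link, a logit split between driving and a fixed-cost outside option, exponential smoothing of perceived travel time from day to day, and a reward equal to negative total system cost minus a penalty on day-to-day toll changes (one simple form of action regularization). Every name and parameter value here (`ToyTollingEnv`, `smooth_penalty`, the BPR-style coefficients, the logit scale of 0.2) is a hypothetical choice made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyTollingEnv:
    """Hypothetical single-link day-to-day tolling MDP (illustrative only)."""
    demand: float = 1000.0         # travellers per day (assumed)
    capacity: float = 600.0        # link capacity (assumed)
    free_flow_time: float = 10.0   # free-flow travel time in minutes (assumed)
    outside_cost: float = 25.0     # fixed cost of not driving (assumed)
    learning_rate: float = 0.3     # weight on the newest experienced travel time
    smooth_penalty: float = 0.0    # weight on |toll_t - toll_{t-1}|
    perceived_time: float = 10.0   # travellers' current travel-time belief
    prev_toll: float = 0.0         # toll charged on the previous day

    def travel_time(self, flow: float) -> float:
        # BPR-style congestion function
        return self.free_flow_time * (1.0 + 0.15 * (flow / self.capacity) ** 4)

    def step(self, toll: float):
        """Advance one day under the given toll; return (state, reward)."""
        # Logit choice between driving (perceived time + toll) and the outside option
        drive_cost = self.perceived_time + toll
        share = 1.0 / (1.0 + math.exp(0.2 * (drive_cost - self.outside_cost)))
        flow = self.demand * share
        experienced = self.travel_time(flow)

        # Day-to-day learning: blend the new experience into the belief
        self.perceived_time += self.learning_rate * (experienced - self.perceived_time)

        # Tolls are transfers, so they are excluded from the system cost
        system_cost = flow * experienced + (self.demand - flow) * self.outside_cost
        reward = -system_cost - self.smooth_penalty * abs(toll - self.prev_toll)
        self.prev_toll = toll
        return (self.perceived_time, flow), reward


def rollout(env: ToyTollingEnv, tolls) -> float:
    """Apply a fixed sequence of daily tolls and return the total reward."""
    return sum(env.step(toll)[1] for toll in tolls)


oscillating = [0.0, 10.0] * 5
plain = rollout(ToyTollingEnv(smooth_penalty=0.0), oscillating)
regularized = rollout(ToyTollingEnv(smooth_penalty=50.0), oscillating)
```

Because the penalty enters only the reward and not the traffic dynamics, `regularized` is lower than `plain` for the same oscillating toll sequence. An RL agent trained on the regularized reward would therefore be pushed toward smoother toll trajectories. The penalty weight is a tuning choice that this sketch does not attempt to calibrate.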