Molly Wang

h-index1
2papers

2 Papers

AIJul 24, 2025
Simulation-Driven Reinforcement Learning in Queuing Network Routing Optimization

Fatima Al-Ani, Molly Wang, Jevon Charles et al.

This study focuses on the development of a simulation-driven reinforcement learning (RL) framework for optimizing routing decisions in complex queueing network systems, with a particular emphasis on manufacturing and communication applications. Recognizing the limitations of traditional queueing methods, which often struggle with dynamic, uncertain environments, we propose a robust RL approach leveraging Deep Deterministic Policy Gradient (DDPG) combined with Dyna-style planning (Dyna-DDPG). The framework includes a flexible and configurable simulation environment capable of modeling diverse queueing scenarios, disruptions, and unpredictable conditions. Our enhanced Dyna-DDPG implementation incorporates separate predictive models for next-state transitions and rewards, significantly improving stability and sample efficiency. Comprehensive experiments and rigorous evaluations demonstrate the framework's capability to rapidly learn effective routing policies that maintain robust performance under disruptions and scale effectively to larger network sizes. Additionally, we highlight strong software engineering practices employed to ensure reproducibility and maintainability of the framework, enabling practical deployment in real-world scenarios.

LGJul 27, 2025
Spatial-Temporal Reinforcement Learning for Network Routing with Non-Markovian Traffic

Molly Wang, Kin. K Leung

Reinforcement Learning (RL) has been widely used for packet routing in communication networks, but traditional RL methods rely on the Markov assumption that the current state contains all necessary information for decision-making. In reality, internet traffic is non-Markovian, and past states do influence routing performance. Moreover, common deep RL approaches use function approximators, such as neural networks, that do not model the spatial structure in network topologies. To address these shortcomings, we design a network environment with non-Markovian traffic and introduce a spatial-temporal RL (STRL) framework for packet routing. Our approach outperforms traditional baselines by more than 19% during training and 7% for inference despite a change in network topology.