Alejandra Beghelli

NI
h-index: 12
Papers: 4
Citations: 32
Novelty: 41%
AI Score: 41

4 Papers

LG · Jul 5, 2022
Resource Allocation in Multicore Elastic Optical Networks: A Deep Reinforcement Learning Approach

Juan Pinto-Ríos, Felipe Calderón, Ariel Leiva et al.

A deep reinforcement learning approach is applied, for the first time, to solve the routing, modulation, spectrum and core allocation (RMSCA) problem in dynamic multicore fiber elastic optical networks (MCF-EONs). To do so, a new environment, compatible with OpenAI's Gym, was designed and implemented to emulate the operation of MCF-EONs. The new environment processes the agent's actions (selection of route, core and spectrum slot) by considering the network state and physical-layer-related aspects. The latter include the available modulation formats and their reach, and the inter-core crosstalk (XT), an MCF-related impairment. If the resulting quality of the signal is acceptable, the environment allocates the resources selected by the agent. After processing the agent's action, the environment gives the agent a numerical reward and information about the new network state. The blocking performance of four different agents was compared through simulation with that of three baseline heuristics used in MCF-EONs. Results obtained for the NSFNet and COST239 network topologies show that the best-performing agent achieves, on average, up to a fourfold reduction in blocking probability compared with the best-performing baseline heuristic.
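
The action-processing loop described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class name `ToyMCFEnv`, its fields, the +1/-1 reward, and the tuple-based state are hypothetical and are not taken from the paper's environment. The sketch also omits modulation reach and inter-core crosstalk, so a request is accepted purely on slot availability.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMCFEnv:
    """Hypothetical toy environment; not the authors' implementation."""
    routes: list   # each route is a list of link ids
    n_cores: int
    n_slots: int
    occupied: set = field(default_factory=set)  # (link, core, slot) in use

    def reset(self):
        self.occupied.clear()
        return self._state()

    def _state(self):
        # Observation: sorted tuple of occupied (link, core, slot) triples.
        return tuple(sorted(self.occupied))

    def step(self, action, demand_slots=1):
        route_idx, core, first_slot = action
        # Resources the request would need on every link of the route.
        needed = {
            (link, core, slot)
            for link in self.routes[route_idx]
            for slot in range(first_slot, first_slot + demand_slots)
        }
        in_range = (
            0 <= core < self.n_cores
            and first_slot >= 0
            and first_slot + demand_slots <= self.n_slots
        )
        # Assumption: acceptance depends only on free slots (no XT check).
        accepted = in_range and not (needed & self.occupied)
        if accepted:
            self.occupied |= needed
        reward = 1.0 if accepted else -1.0  # assumed reward scheme
        return self._state(), reward, False, {"accepted": accepted}
```

With `ToyMCFEnv(routes=[[0, 1], [1, 2]], n_cores=2, n_slots=4)`, a first request on route 0, core 0, slots 0-1 is accepted, while a later request on route 1, core 0, slot 1 is blocked because link 1 already uses that slot on that core.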

64.2 · NI · May 3 · Code
Graph Transformers and Stabilized Reinforcement Learning for Large-Scale Dynamic Routing Modulation and Spectrum Allocation in Elastic Optical Networks

Michael Doherty, Alejandra Beghelli, Laura Toni

Reinforcement learning (RL) has been widely applied to dynamic routing, modulation and spectrum assignment (RMSA) in optical networks, yet no prior work has trained a transformer model for this task. We attribute this to the high data and compute requirements of transformers and potential training instabilities with RL. We address this gap by combining recent advances from the machine learning literature (rotary positional encodings for graph-structured data, off-policy invalid action masking, and valid mass regularization) with GPU-accelerated simulation to achieve, for the first time, stable RL training of a transformer for dynamic RMSA. We demonstrate, through systematic benchmarking against previous RL methods and heuristic algorithms, that ours is the first RL method to exceed all benchmarks, increasing the supportable traffic load by up to 13%. To demonstrate the scalability of our approach, we train on real network topologies from the TopologyBench database up to 143 nodes and 362 links, with 320 x 12.5 GHz frequency slot units per link, and 100 Gbps traffic requests. To our knowledge, these are the largest dynamic RMSA problems to which RL has been applied. We find up to 4% increased traffic load can be supported at low blocking probability (<0.1%) with our method compared to the best available benchmark algorithm. We present an ablation study of the components of our training algorithm, the dynamics of the loss function during training, and analyze the allocation decisions of the trained models. We make all code used to produce this paper openly available for reproduction and future benchmarking: https://github.com/micdoh/XLRON.
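
As a rough illustration of the invalid action masking mentioned in this abstract, the sketch below shows only the basic idea: actions flagged as invalid receive zero probability before sampling. The function name `masked_softmax` and its list-based interface are assumptions for illustration. It does not reproduce the off-policy variant or the valid mass regularization named in the abstract, and it is not taken from the XLRON code.

```python
import math


def masked_softmax(logits, valid):
    """Softmax restricted to valid actions; invalid ones get probability 0.

    logits: list of floats, one per action.
    valid:  list of bools, True where the action is allowed.
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, valid) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, valid)]
    total = sum(exps)
    return [e / total for e in exps]
```

For instance, `masked_softmax([0.0, 0.0, 0.0], [True, True, False])` splits the probability evenly between the two valid actions and assigns none to the third.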

NI · Feb 20, 2025 · Code
Reinforcement Learning with Graph Attention for Routing and Wavelength Assignment with Lightpath Reuse

Michael Doherty, Alejandra Beghelli

Many works have investigated reinforcement learning (RL) for routing and spectrum assignment on flex-grid networks, but only one work to date has examined RL for fixed-grid networks with flex-rate transponders, despite production systems using this paradigm. Flex-rate transponders allow existing lightpaths to accommodate new services, a task we term routing and wavelength assignment with lightpath reuse (RWA-LR). We re-examine this problem and present a thorough benchmarking of heuristic algorithms for RWA-LR, which are shown to achieve 6% higher throughput when candidate paths are ordered by number of hops rather than by total length. We train an RL agent for RWA-LR with graph attention networks for the policy and value functions to exploit the graph-structured data. We provide details of our methodology and open-source all of our code for reproduction. We outperform the previous state-of-the-art RL approach by 2.5% (17.4 Tbps mean additional throughput) and the best heuristic by 1.2% (8.5 Tbps mean additional throughput). This marginal gain highlights the difficulty of learning effective RL policies on long-horizon resource allocation tasks.
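
The two candidate-path orderings compared in this abstract can be sketched as follows. The function `order_paths`, the node names, and the link lengths are all hypothetical and chosen only to show that the two criteria can rank the same paths differently. The tie-breaking rule (the other criterion as a secondary key) is also an assumption.

```python
def order_paths(paths, lengths_km, by="hops"):
    """Sort candidate paths by hop count or by total length.

    paths:      list of paths, each a list of node names.
    lengths_km: dict mapping a directed link (u, v) to its length in km.
    """
    def length(path):
        return sum(lengths_km[(u, v)] for u, v in zip(path, path[1:]))

    def hops(path):
        return len(path) - 1

    if by == "hops":
        key = lambda p: (hops(p), length(p))   # ties broken by length
    elif by == "length":
        key = lambda p: (length(p), hops(p))   # ties broken by hops
    else:
        raise ValueError("by must be 'hops' or 'length'")
    return sorted(paths, key=key)


# Illustrative data only: a short three-hop path and a long two-hop path.
lengths_km = {
    ("A", "B"): 100, ("B", "C"): 100, ("C", "D"): 100,
    ("A", "E"): 400, ("E", "D"): 400,
}
paths = [["A", "B", "C", "D"], ["A", "E", "D"]]
by_hops = order_paths(paths, lengths_km, by="hops")
by_length = order_paths(paths, lengths_km, by="length")
```

Here `by_hops` puts the two-hop path A-E-D first, while `by_length` puts the 300 km path A-B-C-D first, so a heuristic that tries candidates in order would make different first choices under the two criteria.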

NI · Feb 18, 2025
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope?

Michael Doherty, Robin Matzner, Rasoul Sadeghi et al.

The application of reinforcement learning (RL) to dynamic resource allocation in optical networks has been the focus of intense research activity in recent years, with almost 100 peer-reviewed papers. We present a review of progress in the field and identify significant gaps in benchmarking practices and reproducibility. To determine the strongest benchmark algorithms, we systematically evaluate several heuristics across diverse network topologies. We find that path count and sort criteria for path selection significantly affect the benchmark performance. We meticulously recreate the problems from five landmark papers and apply the improved benchmarks. Our comparisons demonstrate that simple heuristics consistently match or outperform the published RL solutions, often with an order of magnitude lower blocking probability. Furthermore, we present empirical lower bounds on network blocking using a novel defragmentation-based method, revealing that potential improvements over the benchmark heuristics are limited to 19-36% increased traffic load for the same blocking performance in our examples. We make our simulation framework and results publicly available to promote reproducible research and standardized evaluation: https://doi.org/10.5281/zenodo.12594495.