Reinforcement Learning-based Adaptive Path Selection for Programmable Networks
This work addresses congestion management in programmable networks, though its contribution is incremental: it builds on existing methods such as Stochastic Learning Automata and In-Band Network Telemetry.
The paper tackles adaptive path selection in programmable networks by implementing a distributed reinforcement learning framework that uses real-time telemetry data, demonstrating convergence to effective paths and adaptation to congestion at line rate in a testbed.
This work presents a proof-of-concept implementation of a distributed, in-network reinforcement learning (IN-RL) framework for adaptive path selection in programmable networks. By combining Stochastic Learning Automata (SLA) with real-time telemetry data collected via In-Band Network Telemetry (INT), the proposed system enables local, data-driven forwarding decisions that adapt dynamically to congestion conditions. The system is evaluated on a Mininet-based testbed using P4-programmable BMv2 switches, demonstrating that the SLA-based mechanism converges to effective path selections and adapts to shifting network conditions at line rate.
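To make the SLA idea concrete, the following is a minimal sketch of how a single switch might maintain and update per-path selection probabilities from telemetry feedback. It is not the paper's implementation: the linear reward-inaction update rule, the class and function names, the learning rate, and the queue-depth threshold used to turn an INT reading into a binary reward are all assumptions made for illustration, and the telemetry values are static numbers rather than real INT reports.

```python
import random


class LearningAutomaton:
    """Hypothetical per-switch automaton holding one probability per candidate path."""

    def __init__(self, n_paths, learning_rate=0.1, seed=None):
        # Start with no preference: every path is equally likely.
        self.probs = [1.0 / n_paths] * n_paths
        self.learning_rate = learning_rate  # assumed value, not from the paper
        self._rng = random.Random(seed)

    def choose_path(self):
        # Sample a path index according to the current probability vector.
        return self._rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, chosen, rewarded):
        # Linear reward-inaction (assumed scheme): reinforce the chosen path
        # on reward, and leave the probabilities untouched on penalty.
        if not rewarded:
            return
        a = self.learning_rate
        for i in range(len(self.probs)):
            if i == chosen:
                self.probs[i] += a * (1.0 - self.probs[i])
            else:
                self.probs[i] *= 1.0 - a


def simulate(queue_depths, threshold=10, steps=2000, seed=0):
    """Run the automaton against fixed, made-up per-path queue depths.

    A path is rewarded when its (simulated) telemetry queue depth is below
    the assumed threshold.
    """
    automaton = LearningAutomaton(len(queue_depths), seed=seed)
    for _ in range(steps):
        path = automaton.choose_path()
        automaton.update(path, queue_depths[path] < threshold)
    return automaton.probs


if __name__ == "__main__":
    # Paths 0 and 2 are congested; path 1 is lightly loaded.
    print(simulate([50, 3, 40]))
```

In this toy setting only the uncongested path is ever rewarded, so its selection probability grows toward 1 while the others shrink. Because the update multiplies the non-chosen entries by the same factor that it removes from the chosen entry's shortfall, the vector continues to sum to 1 after every step. A data-plane version would have to replace the floating-point arithmetic with whatever fixed-point operations the target switch supports.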