Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting
This work aims to make experimentation easier for researchers and practitioners in network management, but it is incremental: it builds on existing AI applications without introducing new methods.
The paper addresses the lack of a standardized platform for building and evaluating AI agents for network troubleshooting, and proposes a playground that enables reproducible benchmarking with low operational effort.
Recent research has demonstrated the effectiveness of Artificial Intelligence (AI), and more specifically Large Language Models (LLMs), in supporting network configuration synthesis and automating network diagnosis tasks, among others. In this preliminary work, we restrict our focus to the application of AI agents to network troubleshooting and elaborate on the need for a standardized, reproducible, and open benchmarking platform on which AI agents can be built and evaluated with low operational effort.