Airavat: An Agentic Framework for Internet Measurement

Alagappan Ramanathan, Eunju Kang, Dongsu Han, Sangeetha Abdu Jyothi

arXiv:2602.20924v11.2h-index: 37

Originality: Highly original

AI Analysis

This work aims to democratize Internet measurement for researchers and practitioners by automating expert-level orchestration and verification. It proposes a new framework rather than an incremental improvement to an existing one.

The paper introduces Airavat, an agentic framework that generates Internet measurement workflows and verifies them against methodological standards. In four case studies, Airavat matches expert solutions and identifies flaws missed by standard execution-based testing.

Internet measurement faces twin challenges: complex analyses require expert-level orchestration of tools, yet even syntactically correct implementations can have methodological flaws and can be difficult to verify. Democratizing measurement capabilities thus demands automating both workflow generation and verification against methodological standards established through decades of research. We present Airavat, the first agentic framework for Internet measurement workflow generation with systematic verification and validation. Airavat coordinates a set of agents mirroring expert reasoning: three agents handle problem decomposition, solution design, and code implementation, with assistance from a registry of existing tools. Two specialized engines ensure methodological correctness: a Verification Engine evaluates workflows against a knowledge graph encoding five decades of measurement research, while a Validation Engine identifies appropriate validation techniques grounded in established methodologies. Through four Internet measurement case studies, we demonstrate that Airavat (i) generates workflows matching expert-level solutions, (ii) makes sound architectural decisions, (iii) addresses novel problems without ground truth, and (iv) identifies methodological flaws missed by standard execution-based testing.
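The architecture described in the abstract can be pictured with a minimal sketch. This is not Airavat's implementation: every name below (`TOOL_REGISTRY`, `METHOD_RULES`, `Workflow`, the stage functions, and the practice labels such as `repeat_probes`) is a hypothetical stand-in, the "agents" are plain functions rather than LLM-backed components, and a small dictionary takes the place of the knowledge graph. It only illustrates the control flow of three generation stages followed by separate verification and validation passes.

```python
from dataclasses import dataclass, field

# Hypothetical tool registry: tool name -> short description.
TOOL_REGISTRY = {
    "traceroute": "hop-by-hop path discovery",
    "ping": "round-trip latency probing",
}

# Hypothetical stand-in for the knowledge graph: for each tool, the
# practices a methodologically sound workflow is assumed to include.
METHOD_RULES = {
    "traceroute": {"repeat_probes", "handle_unresponsive_hops"},
    "ping": {"repeat_probes"},
}


@dataclass
class Workflow:
    problem: str
    subtasks: list = field(default_factory=list)
    tools: list = field(default_factory=list)
    practices: set = field(default_factory=set)
    code: str = ""


def decompose(problem):
    """Stage 1 (stub): split the problem into subtasks."""
    return Workflow(problem=problem, subtasks=["measure paths", "aggregate results"])


def design(wf):
    """Stage 2 (stub): pick tools from the registry and planned practices."""
    wf.tools = [name for name in TOOL_REGISTRY if name == "traceroute"]
    wf.practices = {"repeat_probes"}  # deliberately incomplete
    return wf


def implement(wf):
    """Stage 3 (stub): emit placeholder code for the chosen tools."""
    wf.code = "\n".join(f"run({tool!r})" for tool in wf.tools)
    return wf


def verify(wf):
    """Check the design against the rules without executing any code."""
    findings = []
    for tool in wf.tools:
        required = METHOD_RULES.get(tool, set())
        for missing in sorted(required - wf.practices):
            findings.append(f"{tool}: missing {missing}")
    return findings


def validate(wf):
    """Suggest a validation technique per tool (placeholder text only)."""
    return [f"cross-check {tool} results against an independent source" for tool in wf.tools]


def run_pipeline(problem):
    wf = implement(design(decompose(problem)))
    return wf, verify(wf), validate(wf)


if __name__ == "__main__":
    wf, findings, suggestions = run_pipeline("compare paths between two networks")
    print(findings)
    print(suggestions)
```

In this toy run the generated code would execute without error, yet `verify` still reports a missing practice. That is the kind of methodological flaw the abstract says execution-based testing alone does not catch.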
