Jonas Wessner, Tobias Meuser, Janek Schoffit et al.
Configuring network access control policies in large, complex networks is error-prone and requires significant expert effort. LLMs offer a promising interface for expressing such policies in natural language, but their capability for translating user requests into access policies, and the system architectures best suited to leverage LLMs, remain underexplored. We present an architecture for natural-language access control (NLAC) that uses LLMs to translate user requests into access policies, and introduce NLACBench, a benchmark for evaluating LLM-based intent translation systems in large-scale networks. Our evaluation across multiple state-of-the-art models shows that top-performing LLMs achieve up to 96.9% accuracy in small-network settings, but performance degrades substantially (below 20% for some models) as network size increases. To address this limitation, we identify relevant network components via embedding similarity and construct compact subgraphs that are passed to the LLM. This approach enables scaling to larger networks with up to 98.7% accuracy, while simultaneously reducing inference time, hardware requirements, and operating costs to a constant resource budget. Finally, a case study indicates that top-performing models exhibit largely complementary error patterns, suggesting that intent translation accuracy may be further improved through multi-LLM architectures.