AI, PL, SE | Nov 12, 2023

Assessing the Interpretability of Programmatic Policies with Large Language Models

arXiv:2311.06979v2 | 4 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the problem of evaluating the interpretability of programmatic policies and offers researchers and practitioners a potential tool. The advance is incremental: it builds on existing LLM capabilities rather than fundamentally changing the field.

The paper tackles the lack of systematic evaluation of the interpretability of programmatic policies by introducing a novel metric that uses large language models (LLMs) to assess it. In tests with synthesized and human-crafted policies for a real-time strategy game, the metric consistently ranks less interpretable programs lower and more interpretable ones higher.

Although the synthesis of programs encoding policies often carries the promise of interpretability, systematic evaluations were never performed to assess the interpretability of these policies, likely because of the complexity of such an evaluation. In this paper, we introduce a novel metric that uses large language models (LLMs) to assess the interpretability of programmatic policies. For our metric, an LLM is given both a program and a description of its associated programming language. The LLM then formulates a natural-language explanation of the program. This explanation is subsequently fed into a second LLM, which tries to reconstruct the program from the natural-language explanation. Our metric then measures the behavioral similarity between the reconstructed program and the original. We validate our approach with synthesized and human-crafted programmatic policies for playing a real-time strategy game, comparing the interpretability scores of these programmatic policies to obfuscated versions of the same programs. Our LLM-based interpretability score consistently ranks less interpretable programs lower and more interpretable ones higher. These findings suggest that our metric could serve as a reliable and inexpensive tool for evaluating the interpretability of programmatic policies.
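
The explain-then-reconstruct pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `explain`, `reconstruct`, and `compile_policy` are hypothetical callables standing in for the two LLMs and a program interpreter, and measuring behavioral similarity as the fraction of sampled states with matching actions is an assumption made here for concreteness.

```python
from typing import Callable, Sequence

State = int                      # hypothetical stand-in for a game state
Policy = Callable[[State], str]  # a policy maps a state to an action name


def behavioral_similarity(original: Policy, rebuilt: Policy,
                          states: Sequence[State]) -> float:
    """Fraction of sampled states on which both policies choose the same action.

    Assumption: action agreement is only one possible similarity measure.
    """
    if not states:
        raise ValueError("need at least one state to compare behavior")
    matches = sum(original(s) == rebuilt(s) for s in states)
    return matches / len(states)


def interpretability_score(program: str,
                           language_description: str,
                           explain: Callable[[str, str], str],
                           reconstruct: Callable[[str], str],
                           compile_policy: Callable[[str], Policy],
                           states: Sequence[State]) -> float:
    """Score a program by how well it survives an explain/reconstruct round trip."""
    # First LLM (hypothetical callable): program + language description -> explanation.
    explanation = explain(program, language_description)
    # Second LLM (hypothetical callable): explanation -> reconstructed program.
    rebuilt_program = reconstruct(explanation)
    # Compare what the two programs do, not how their source text looks.
    return behavioral_similarity(compile_policy(program),
                                 compile_policy(rebuilt_program),
                                 states)
```

Under this sketch, a program and its obfuscated variant would each be passed through `interpretability_score` on the same set of states, and the obfuscated variant would be expected to receive the lower score.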

