Validating Large Language Models with ReLM
ReLM addresses the growing need for efficient, general validation tools for LLMs, which is crucial for mitigating negative effects in AI applications. The contribution is incremental, however, because it builds on existing query methods.
The paper tackles the problem of validating large language models (LLMs) for issues such as memorization, bias, and toxicity. It introduces ReLM, a system that uses regular expressions to simplify and formalize evaluations, and reports up to 15x higher system efficiency and 2.5x higher data efficiency than state-of-the-art ad-hoc queries.
Although large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are growing concerns around possible negative effects of LLMs such as data memorization, bias, and inappropriate language. Unfortunately, the complexity and generation capacities of LLMs make validating (and correcting) such concerns difficult. In this work, we introduce ReLM, a system for validating and querying LLMs using standard regular expressions. ReLM formalizes and enables a broad range of language model evaluations, reducing complex evaluation rules to simple regular expression queries. Our results exploring queries surrounding memorization, gender bias, toxicity, and language understanding show that ReLM achieves up to 15x higher system efficiency, 2.5x data efficiency, and increased statistical and prompt-tuning coverage compared to state-of-the-art ad-hoc queries. ReLM offers a competitive and general baseline for the increasingly important problem of LLM validation.
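To make the idea of reducing an evaluation rule to a regular expression query more concrete, here is a minimal sketch in Python. It is not ReLM's API: the slot template, the helper names (`slots_to_regex`, `expand`, `run_query`), and the `toy_score` function are all hypothetical, and `toy_score` is only a deterministic stand-in for a real language model's log-probability. The sketch assumes a finite query with no unbounded repetition, so the set of matching strings can be enumerated and ranked.

```python
import itertools
import re

# Hypothetical bias-style query, written as slots of alternatives.
# It corresponds to a regex of the form:
#   The (man|woman) was trained in (art|science|medicine)
SLOTS = [["The"], ["man", "woman"], ["was trained in"], ["art", "science", "medicine"]]


def slots_to_regex(slots):
    """Build a regex string from the slots (alternatives are escaped literals)."""
    parts = []
    for alts in slots:
        escaped = [re.escape(a) for a in alts]
        parts.append(escaped[0] if len(escaped) == 1 else "(" + "|".join(escaped) + ")")
    return " ".join(parts)


def expand(slots):
    """Enumerate every string the finite query can match."""
    return [" ".join(combo) for combo in itertools.product(*slots)]


def toy_score(text):
    """Assumption: stand-in for a model log-probability (shorter scores higher)."""
    return -float(len(text))


def run_query(slots, score_fn):
    """Return all strings matching the query, ranked by score (best first)."""
    pattern = re.compile(slots_to_regex(slots))
    candidates = expand(slots)
    # Sanity check: every enumerated candidate is accepted by the regex.
    assert all(pattern.fullmatch(c) for c in candidates)
    return sorted(candidates, key=score_fn, reverse=True)


if __name__ == "__main__":
    for sentence in run_query(SLOTS, toy_score):
        print(f"{toy_score(sentence):7.1f}  {sentence}")
```

With a real model in place of `toy_score`, comparing the scores of the "man" and "woman" variants for each profession would give one simple bias measurement. The point of the sketch is only that the whole test set is specified by a single pattern rather than by hand-written prompt lists.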