CY AI CL LG | Sep 6, 2025

ArGen: Auto-Regulation of Generative AI via GRPO and Policy-as-Code

arXiv:2509.07006v1 | 2 citations | h-index: 2 | Has Code

Originality: Highly original

AI Analysis

The work addresses the need for verifiable compliance and ethical robustness in AI systems, particularly for safe deployment in diverse global contexts. It represents a novel integration of methods rather than an incremental improvement.

The paper tackles the problem of aligning Large Language Models with complex, configurable rules for ethical and regulatory compliance. It introduces the ArGen framework, which achieved a 70.9% improvement in domain-scope adherence over a baseline in a case study on a medical AI assistant guided by Dharmic ethics.

This paper introduces ArGen (Auto-Regulation of Generative AI systems), a framework for aligning Large Language Models (LLMs) with complex sets of configurable, machine-readable rules spanning ethical principles, operational safety protocols, and regulatory compliance standards. Moving beyond just preference-based alignment, ArGen is designed to ensure LLMs adhere to these multifaceted policies through a novel synthesis of principle-based automated reward scoring, Group Relative Policy Optimisation (GRPO), and an Open Policy Agent (OPA) inspired governance layer. This approach provides the technical foundation for achieving and demonstrating compliance with diverse and nuanced governance requirements. To showcase the framework's capability to operationalize a deeply nuanced and culturally-specific value system, we present an in-depth case study: the development of a medical AI assistant guided by principles from Dharmic ethics (such as Ahimsa and Dharma), as derived from texts like the Bhagavad Gita. This challenging application demonstrates ArGen's adaptability, achieving a 70.9% improvement in domain-scope adherence over the baseline. Through our open-source repository, we show that ArGen's methodology offers a path to 'Governable AI' systems that are technically proficient, ethically robust, and verifiably compliant for safe deployment in diverse global contexts.
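
To make the combination of principle-based reward scoring, a policy check, and GRPO more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sampled response already has per-principle scores in [0, 1], and that a policy rule has flagged whether the response stays in the assistant's domain scope. The weights, the `scope_penalty` value, the function names, and the sample scores are all hypothetical. Only the group-relative normalization (reward minus group mean, divided by group standard deviation) reflects the general GRPO idea.

```python
from statistics import mean, pstdev

# Hypothetical principle weights; the paper's actual configuration is not shown here.
WEIGHTS = {"ahimsa": 0.5, "dharma": 0.5}


def combined_reward(scores, in_scope, scope_penalty=1.0):
    """Weighted sum of per-principle scores, with an assumed fixed penalty
    when a policy rule flags the response as out of domain scope."""
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return base if in_scope else base - scope_penalty


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward against the mean and
    (population) standard deviation of its own sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of sampled responses to the same prompt: (scores, in_scope flag).
# These numbers are illustrative only.
group = [
    ({"ahimsa": 0.9, "dharma": 0.8}, True),
    ({"ahimsa": 0.6, "dharma": 0.4}, True),
    ({"ahimsa": 0.7, "dharma": 0.9}, False),  # flagged out of scope
]
rewards = [combined_reward(scores, ok) for scores, ok in group]
advantages = group_relative_advantages(rewards)
```

In this sketch the out-of-scope response receives the lowest advantage even though its principle scores are high, which is one plausible way a policy layer could steer training toward domain-scope adherence.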
