Operationalizing CaMeL: Strengthening LLM Defenses for Enterprise Deployment
This work addresses security challenges that enterprises face when deploying LLM agents. The contribution is incremental, since it builds on the existing CaMeL framework.
The paper addresses limitations of CaMeL, a capability-based sandbox for mitigating prompt injection attacks in LLM agents. It proposes four engineering improvements: prompt screening, output auditing, a tiered-risk access model, and a verified intermediate language. Together, these aim to expand threat coverage and improve operational usability for enterprise deployment.
CaMeL (Capabilities for Machine Learning) introduces a capability-based sandbox to mitigate prompt injection attacks in large language model (LLM) agents. While effective, CaMeL assumes a trusted user prompt, omits side-channel concerns, and incurs performance tradeoffs due to its dual-LLM design. This response identifies these issues and proposes engineering improvements to expand CaMeL's threat coverage and operational usability. We introduce: (1) prompt screening for initial inputs, (2) output auditing to detect instruction leakage, (3) a tiered-risk access model to balance usability and control, and (4) a verified intermediate language for formal guarantees. Together, these upgrades align CaMeL with best practices in enterprise security and support scalable deployment.
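To make the first three proposed mechanisms concrete, the following is a minimal sketch of how prompt screening, a tiered-risk access check, and output auditing might fit together around a tool call. It is not the paper's implementation or CaMeL's API. The tier names, the tool-to-tier mapping, the regex patterns, and the functions `screen_prompt`, `audit_output`, and `authorize` are all hypothetical and chosen only for illustration. The verified intermediate language is not covered here.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class RiskTier(IntEnum):
    """Hypothetical risk tiers; higher values need stronger approval."""
    LOW = 1     # read-only, no external side effects
    MEDIUM = 2  # side effects confined to the user's own data
    HIGH = 3    # side effects that can move data outside the organization


# Assumed tool-to-tier mapping; a real deployment would define its own.
TOOL_TIERS = {
    "search_docs": RiskTier.LOW,
    "create_calendar_event": RiskTier.MEDIUM,
    "send_email": RiskTier.HIGH,
}

# Toy screening patterns for illustration only, not a real injection detector.
SUSPICIOUS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"ignore (all |any )?(previous|prior) instructions",
        r"reveal (the |your )?system prompt",
    )
]


def screen_prompt(prompt: str) -> bool:
    """Return True if the initial user prompt passes screening."""
    return not any(p.search(prompt) for p in SUSPICIOUS)


def audit_output(output: str, protected: list[str]) -> bool:
    """Return True if the output contains none of the protected strings."""
    lowered = output.lower()
    return not any(s.lower() in lowered for s in protected)


@dataclass
class Decision:
    allowed: bool
    reason: str


def authorize(tool: str, prompt: str, approved_tier: RiskTier) -> Decision:
    """Gate a tool call on prompt screening and the approved risk tier."""
    if not screen_prompt(prompt):
        return Decision(False, "prompt failed screening")
    tier = TOOL_TIERS.get(tool)
    if tier is None:
        # Deny by default: tools without an assigned tier are never run.
        return Decision(False, "unknown tool")
    if tier > approved_tier:
        return Decision(False, "tool tier exceeds approved tier")
    return Decision(True, "ok")
```

In this sketch, a low-tier call such as `authorize("search_docs", "Find the Q3 report", RiskTier.LOW)` is allowed without extra approval, while `send_email` is denied unless the caller has been approved for `RiskTier.HIGH`. Denying unknown tools by default is a design choice of the sketch, not something the source specifies.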