SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
SecureGate addresses privacy concerns for organizations that fine-tune LLMs with federated learning, balancing confidentiality against performance. The contribution appears incremental, however, because it builds on existing adapter and gating techniques.
The paper tackles privacy leakage and utility degradation in federated fine-tuning of large language models. It proposes SecureGate, a framework that uses token-gated dual adapters to control information disclosure. Reported results include up to a 31.66X reduction in inference attack accuracy and a 17.07X reduction in extraction recall for unauthorized requests, with 100% routing reliability.
Federated learning (FL) enables collaborative training across organizational silos without sharing raw data, making it attractive for privacy-sensitive applications. With the rapid adoption of large language models (LLMs), federated fine-tuning of generative LLMs has gained attention as a way to leverage distributed data while preserving confidentiality. However, this setting introduces fundamental challenges: (i) privacy leakage of personally identifiable information (PII) due to LLM memorization, and (ii) a persistent tension between global generalization and local utility under heterogeneous data. Existing defenses, such as data sanitization and differential privacy, reduce leakage but often degrade downstream performance. We propose SecureGate, a privacy-aware federated fine-tuning framework for LLMs that provides fine-grained privacy control without sacrificing utility. SecureGate employs a dual-adapter LoRA architecture: a secure adapter that learns sanitized, globally shareable representations, and a revealing adapter that captures sensitive, organization-specific knowledge. A token-controlled gating module selectively activates these adapters at inference time, enabling controlled information disclosure without retraining. Extensive experiments across multiple LLMs and real-world datasets show that SecureGate improves task utility while substantially reducing PII leakage, achieving up to a 31.66X reduction in inference attack accuracy and a 17.07X reduction in extraction recall for unauthorized requests. Additionally, it maintains 100% routing reliability to the correct adapter and incurs only minimal computational and communication overhead.
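The dual-adapter routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `<REVEAL>` access token, and the hard-coded membership check used as the gate are all assumptions made for illustration (the paper's gating module may well be learned). The sketch only shows the general idea of a frozen base layer plus two low-rank (LoRA-style) updates, exactly one of which is activated per request.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


@dataclass
class LoRAAdapter:
    """Low-rank update: delta(x) = scale * up @ (down @ x)."""
    down: Matrix  # shape (rank, d_in)
    up: Matrix    # shape (d_out, rank)
    scale: float = 1.0

    def delta(self, x: Sequence[float]) -> Vector:
        return [self.scale * y for y in matvec(self.up, matvec(self.down, x))]


@dataclass
class GatedDualAdapterLinear:
    """Frozen base weight plus two adapters; the gate selects one of them."""
    base: Matrix            # frozen pretrained weight, shape (d_out, d_in)
    secure: LoRAAdapter     # sanitized, globally shareable adapter
    revealing: LoRAAdapter  # organization-specific adapter

    def forward(self, x: Sequence[float], reveal: bool) -> Vector:
        adapter = self.revealing if reveal else self.secure
        return [b + d for b, d in zip(matvec(self.base, x), adapter.delta(x))]


# Hypothetical access token; the real control token is not given in the source.
ACCESS_TOKEN = "<REVEAL>"


def gate(prompt_tokens: Sequence[str]) -> bool:
    """Assumed hard rule: use the revealing adapter only if the token is present."""
    return ACCESS_TOKEN in prompt_tokens


# Toy rank-1 weights chosen so the two routes give visibly different outputs.
layer = GatedDualAdapterLinear(
    base=[[1.0, 0.0], [0.0, 1.0]],
    secure=LoRAAdapter(down=[[1.0, 0.0]], up=[[0.0], [0.0]]),
    revealing=LoRAAdapter(down=[[1.0, 1.0]], up=[[1.0], [0.0]]),
)

x = [1.0, 2.0]
unauthorized = layer.forward(x, gate(["hello"]))            # [1.0, 2.0]
authorized = layer.forward(x, gate(["<REVEAL>", "hello"]))  # [4.0, 2.0]
```

Because the gate only selects which adapter contributes at inference time, switching between sanitized and revealing behavior in this sketch requires no retraining, which mirrors the property the abstract claims for SecureGate.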