Domain Private Transformers for Multi-Domain Dialog Systems
This paper addresses domain leakage in conversational AI, but its contribution is incremental because it builds on existing privacy techniques.
The paper tackles the problem of language models leaking across domains in multi-domain dialog systems. It proposes domain privacy as a metric, along with a fine-tuning method to improve it. Experiments on membership inference attacks show resiliency comparable to that of methods adapted from differentially private language models.
Large, general-purpose language models have demonstrated impressive performance across many different conversational domains. While multi-domain language models achieve low overall perplexity, their outputs are not guaranteed to stay within the domain of a given input prompt. This paper proposes domain privacy as a novel way to quantify how likely a conditional language model is to leak across domains. We also develop policy functions based on token-level domain classification, and propose an efficient fine-tuning method to improve the trained model's domain privacy. Experiments on membership inference attacks show that our proposed method has comparable resiliency to methods adapted from recent literature on differentially private language models.
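To make the idea of a token-level policy function concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simple keyword lookup stands in for the token-level domain classifier. The domain names, keyword sets, and the helpers `token_domains`, `in_domain`, and `leakage_rate` are all hypothetical. The leakage rate it computes is a plain empirical fraction for illustration, not the formal domain privacy definition.

```python
from typing import Dict, Iterable, Set

# Hypothetical per-domain keyword sets (assumption: the source gives no vocabulary).
DOMAIN_KEYWORDS: Dict[str, Set[str]] = {
    "airline": {"flight", "boarding", "baggage"},
    "banking": {"account", "loan", "deposit"},
}


def token_domains(token: str) -> Set[str]:
    """Return every domain whose keyword set contains this token."""
    normalized = token.lower().strip(".,!?")
    return {d for d, words in DOMAIN_KEYWORDS.items() if normalized in words}


def in_domain(text: str, domain: str) -> bool:
    """Toy policy function: True if no token is tagged with another domain."""
    for token in text.split():
        # Any tag outside the target domain counts as a leak.
        if token_domains(token) - {domain}:
            return False
    return True


def leakage_rate(outputs: Iterable[str], domain: str) -> float:
    """Fraction of generated outputs that the policy flags as out of domain."""
    outputs = list(outputs)
    if not outputs:
        return 0.0
    leaked = sum(not in_domain(text, domain) for text in outputs)
    return leaked / len(outputs)


if __name__ == "__main__":
    samples = [
        "Your flight is delayed.",
        "Please check your account balance.",
    ]
    print(leakage_rate(samples, "airline"))
```

In practice the keyword lookup would presumably be replaced by a learned token-level classifier, and the outputs would be sampled from the model under prompts from a single domain.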