LG · CR · Nov 18, 2025

Observational Auditing of Label Privacy

arXiv:2511.14084v1
Originality: Incremental advance
AI Analysis

This work addresses the problem of resource-intensive privacy auditing for large-scale systems, offering a practical solution for production environments, though it is incremental in extending existing auditing techniques.

The paper tackles the challenge of auditing differential privacy in machine learning systems without modifying the training dataset. It introduces an observational framework that extends privacy evaluation to protected attributes such as labels, and its experiments on the Criteo and CIFAR-10 datasets show the approach is effective.

Differential privacy (DP) auditing is essential for evaluating privacy guarantees in machine learning systems. Existing auditing methods, however, pose a significant challenge for large-scale systems since they require modifying the training dataset -- for instance, by injecting out-of-distribution canaries or removing samples from training. Such interventions on the training data pipeline are resource-intensive and involve considerable engineering overhead. We introduce a novel observational auditing framework that leverages the inherent randomness of data distributions, enabling privacy evaluation without altering the original dataset. Our approach extends privacy auditing beyond traditional membership inference to protected attributes, with labels as a special case, addressing a key gap in existing techniques. We provide theoretical foundations for our method and perform experiments on Criteo and CIFAR-10 datasets that demonstrate its effectiveness in auditing label privacy guarantees. This work opens new avenues for practical privacy auditing in large-scale production environments.
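To make the auditing idea concrete, here is a minimal sketch of how an attack's error rates can be turned into an empirical lower bound on epsilon. It does not reproduce the paper's observational procedure, which the abstract does not detail. It only assumes the standard hypothesis-testing view of (epsilon, delta)-DP, under which any attack satisfies TPR <= exp(epsilon) * FPR + delta, along with the complementary inequality. The function name `empirical_epsilon_lower_bound` and its parameters are hypothetical, chosen for illustration.

```python
import math


def empirical_epsilon_lower_bound(tpr: float, fpr: float, delta: float = 0.0) -> float:
    """Point-estimate lower bound on epsilon implied by an attack's TPR and FPR.

    Assumption (not taken from the paper): the mechanism is (epsilon, delta)-DP,
    so every attack must satisfy
        TPR       <= exp(epsilon) * FPR       + delta
        (1 - FPR) <= exp(epsilon) * (1 - TPR) + delta
    Rearranging each inequality gives a lower bound on epsilon; we keep the larger.
    Sampling error is ignored here; a real audit would use confidence intervals
    on TPR and FPR rather than point estimates.
    """
    bounds = [0.0]  # epsilon is never negative

    # First inequality: epsilon >= ln((TPR - delta) / FPR)
    if tpr - delta > 0:
        bounds.append(math.inf if fpr == 0 else math.log((tpr - delta) / fpr))

    # Second inequality: epsilon >= ln((1 - FPR - delta) / (1 - TPR))
    if (1 - fpr) - delta > 0:
        bounds.append(math.inf if tpr == 1 else math.log(((1 - fpr) - delta) / (1 - tpr)))

    return max(bounds)


# Illustration: binary labels released via randomized response that flips the
# label with probability 0.25. An attacker who guesses the released label is
# right 75% of the time (TPR = 0.75) and wrong 25% of the time (FPR = 0.25).
eps_hat = empirical_epsilon_lower_bound(tpr=0.75, fpr=0.25)
```

In this illustration the bound equals ln(0.75 / 0.25) = ln 3, which matches the known privacy parameter of randomized response with that flip probability, so the audit is tight in this toy case.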
