Transfer Learning In Differential Privacy's Hybrid-Model
This work addresses a transfer learning challenge in differential privacy for scenarios with distribution shifts, offering a method that is incremental but improves efficiency in specific cases.
The paper tackles machine learning in differential privacy's hybrid-model when the curator's data distribution differs from that of the local agents. It proposes a Subsample-Test-Reweigh scheme that reduces any curator-model DP-learner to a hybrid-model learner, with a sample complexity that depends on the chi-squared divergence between the two distributions. It also provides worst-case sample complexity bounds and two specific instances in which the sample complexity is reduced.
The hybrid-model (Avent et al., 2017) in Differential Privacy is an augmentation of the local-model where, in addition to N local-agents, we are assisted by one special agent who is in fact a curator holding the sensitive details of n additional individuals. Here we study the problem of machine learning in the hybrid-model where the n individuals in the curator's dataset are drawn from a different distribution than that of the general population (the local-agents). We give a general scheme -- Subsample-Test-Reweigh -- for this transfer learning problem, which reduces any curator-model DP-learner to a hybrid-model learner in this setting by iteratively subsampling and reweighing the n examples held by the curator, based on a smooth variation of the Multiplicative-Weights algorithm (introduced by Bun et al., 2020). The sample complexity of our scheme depends on the chi-squared divergence between the two distributions. We give worst-case bounds on the sample complexity required for our private reduction. Aiming to reduce said sample complexity, we give two specific instances in which it can be drastically reduced (one instance is analyzed mathematically, the other empirically) and pose several directions for follow-up work.
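To make the role of the chi-squared divergence concrete, the following is a minimal, non-private sketch for discrete distributions. It is not the Subsample-Test-Reweigh scheme itself. It only illustrates the standard identity that, when source examples are reweighed by w(x) = T(x)/S(x) to mimic a target distribution T, the second moment of the weights under the source S equals 1 + chi^2(T || S), so a larger divergence means noisier reweighed estimates. The function names, the direction of the divergence (target relative to source), and the toy distributions are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def chi_squared_divergence(target, source):
    """chi^2(target || source) = sum_x target(x)^2 / source(x) - 1 (discrete case)."""
    target = np.asarray(target, dtype=float)
    source = np.asarray(source, dtype=float)
    # The divergence is infinite if the target puts mass where the source has none.
    if np.any((source == 0) & (target > 0)):
        return float("inf")
    mask = source > 0
    return float(np.sum(target[mask] ** 2 / source[mask]) - 1.0)


def importance_weights(target, source):
    """Per-outcome weights w(x) = target(x) / source(x); 0 where the source has no mass."""
    target = np.asarray(target, dtype=float)
    source = np.asarray(source, dtype=float)
    return np.divide(target, source, out=np.zeros_like(target), where=source > 0)


# Hypothetical toy distributions over two outcomes.
source = np.array([0.25, 0.75])  # stands in for the curator's distribution
target = np.array([0.50, 0.50])  # stands in for the local agents' distribution

chi2 = chi_squared_divergence(target, source)
weights = importance_weights(target, source)

# Second moment of the weights under the source distribution.
second_moment = float(np.sum(source * weights ** 2))
```

In this toy case `second_moment` matches `1 + chi2`, which is one way to see why a reweighing-based reduction would need more curator samples as the two distributions drift apart.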