Simultaneous Private Learning of Multiple Concepts
This addresses a fundamental limitation in private machine learning for scenarios involving multiple tasks, such as multi-label classification, but the results are incremental as they extend known private learning frameworks.
The paper tackles the problem of learning multiple concepts simultaneously under differential privacy, showing that while non-private learning can achieve this with similar sample complexity as single-concept learning, private multi-learning often requires sample complexity that grows polynomially with the number of concepts, with both upper bounds for some classes and lower bounds for simple ones.
We investigate the direct-sum problem in the context of differentially private PAC learning: What is the sample complexity of solving $k$ learning tasks simultaneously under differential privacy, and how does this cost compare to that of solving $k$ learning tasks without privacy? In our setting, an individual example consists of a domain element $x$ labeled by $k$ unknown concepts $(c_1,\ldots,c_k)$. The goal of a multi-learner is to output $k$ hypotheses $(h_1,\ldots,h_k)$ that generalize the input examples. Without concern for privacy, the sample complexity needed to simultaneously learn $k$ concepts is essentially the same as needed for learning a single concept. Under differential privacy, the basic strategy of learning each hypothesis independently yields sample complexity that grows polynomially with $k$. For some concept classes, we give multi-learners that require fewer samples than the basic strategy. Unfortunately, however, we also give lower bounds showing that even for very simple concept classes, the sample cost of private multi-learning must grow polynomially in $k$.