CR, Jul 30, 2021
Private Retrieval, Computing and Learning: Recent Progress and Future Challenges
Sennur Ulukus, Salman Avestimehr, Michael Gastpar et al.
Most of our lives are conducted in cyberspace. The human notion of privacy translates into a cyber notion of privacy for many functions that take place there. This article focuses on three such functions: how to privately retrieve information from cyberspace (privacy in information retrieval), how to privately leverage large-scale distributed/parallel processing (privacy in distributed computing), and how to learn/train machine learning models from private data spread across multiple users (privacy in distributed (federated) learning). The article motivates each privacy setting, describes the problem formulation, summarizes breakthrough results in the history of each problem, presents recent results, and discusses some of the major ideas that emerged in each field. In addition, the cross-cutting techniques and interconnections between the three topics are discussed, along with a set of open problems and challenges.
IT, Jan 12, 2018
The Asymptotic Capacity of Private Search
Zhen Chen, Zhiying Wang, Syed Jafar
The private search problem is introduced, where a dataset comprised of $L$ i.i.d. records is replicated across $N$ non-colluding servers, each record takes values uniformly from an alphabet of size $K$, and a user wishes to search for all records that match a privately chosen value, without revealing any information about the chosen value to any individual server. The capacity of private search is the maximum number of bits of desired information that can be retrieved per bit of download. The asymptotic (large $K$) capacity of private search is shown to be $1-1/N$, even as the scope of private search is further generalized to allow approximate (OR) search over a number of realizations that grows with $K$. The results are based on the asymptotic behavior of a new converse bound for private information retrieval with arbitrarily dependent messages.
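As a quick numerical illustration of the asymptotic result above, the following minimal sketch evaluates $1-1/N$ for a few server counts. The function name `asymptotic_private_search_capacity` is hypothetical, and the value applies only in the large-$K$ limit; the abstract gives no finite-$K$ expression, so none is attempted here.

```python
from fractions import Fraction


def asymptotic_private_search_capacity(num_servers: int) -> Fraction:
    """Large-K capacity of private search with N non-colluding servers: 1 - 1/N.

    Hypothetical helper for illustration; valid only in the asymptotic regime
    stated in the abstract.
    """
    if num_servers < 1:
        raise ValueError("need at least one server")
    return 1 - Fraction(1, num_servers)


if __name__ == "__main__":
    # The limit approaches 1 as the number of servers grows.
    for n in (2, 3, 5, 10):
        print(n, asymptotic_private_search_capacity(n))
```

With a single server the expression evaluates to zero, and it increases toward one as $N$ grows.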
IT, Sep 10, 2017
The Capacity of $T$-Private Information Retrieval with Private Side Information
Zhen Chen, Zhiying Wang, Syed Jafar
We consider the problem of $T$-Private Information Retrieval with private side information (TPIR-PSI). In this problem, $N$ replicated databases store $K$ independent messages, and a user, equipped with a local cache that holds $M$ messages as side information, wishes to retrieve one of the other $K-M$ messages. The desired message index and the side information must remain jointly private even if any $T$ of the $N$ databases collude. We show that the capacity of TPIR-PSI is $\left(1+\frac{T}{N}+\cdots+\left(\frac{T}{N}\right)^{K-M-1}\right)^{-1}$. As a special case obtained by setting $T=1$, this result settles the capacity of PIR-PSI, an open problem previously noted by Kadhe et al. We also consider the problem of symmetric-TPIR with private side information (STPIR-PSI), where the answers from all $N$ databases reveal no information about any other message besides the desired message. We show that the capacity of STPIR-PSI is $1-\frac{T}{N}$ if the databases have access to common randomness (not available to the user) that is independent of the messages, in an amount that is at least $\frac{T}{N-T}$ bits per desired message bit. Otherwise, the capacity of STPIR-PSI is zero.
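The two capacity expressions above can be evaluated directly. The sketch below is only an illustration of the stated formulas, not code from the paper; the function names and the `randomness_per_bit` parameter are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def tpir_psi_capacity(N: int, K: int, M: int, T: int) -> Fraction:
    """Capacity of TPIR-PSI: (1 + T/N + ... + (T/N)^(K-M-1))^(-1).

    N: replicated databases, K: messages, M: side-information messages,
    T: maximum number of colluding databases.
    """
    if not 1 <= T <= N:
        raise ValueError("require 1 <= T <= N")
    if not 0 <= M < K:
        raise ValueError("require 0 <= M < K")
    ratio = Fraction(T, N)
    # Geometric sum with K - M terms, exponents 0 .. K-M-1.
    total = sum(ratio ** i for i in range(K - M))
    return 1 / total


def stpir_psi_capacity(N: int, T: int, randomness_per_bit: Fraction) -> Fraction:
    """Capacity of STPIR-PSI: 1 - T/N if the databases share at least
    T/(N-T) bits of common randomness per desired message bit, else 0.
    """
    if not 1 <= T < N:
        raise ValueError("require 1 <= T < N")
    threshold = Fraction(T, N - T)
    if randomness_per_bit >= threshold:
        return 1 - Fraction(T, N)
    return Fraction(0)


if __name__ == "__main__":
    # N=2 databases, K=3 messages, M=1 cached message, T=1 (no collusion):
    # (1 + 1/2)^(-1) = 2/3
    print(tpir_psi_capacity(N=2, K=3, M=1, T=1))
    # N=3, T=1: the threshold is 1/2 bit of common randomness per message bit.
    print(stpir_psi_capacity(N=3, T=1, randomness_per_bit=Fraction(1, 2)))
    print(stpir_psi_capacity(N=3, T=1, randomness_per_bit=Fraction(1, 4)))
```

Setting `T=1` in `tpir_psi_capacity` gives the PIR-PSI special case mentioned in the abstract. Note that only the difference $K-M$ enters the formula, so each cached message has the same effect on capacity as removing one message from the database.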