Optimal Download Cost of Private Information Retrieval for Arbitrary Message Length
This work provides a fundamental theoretical result for PIR, a key privacy mechanism in distributed databases, by extending capacity analysis to arbitrary message lengths and alphabets. The contribution is incremental in that it builds on the known PIR capacity formula.
The paper determines the minimum download cost of private information retrieval (PIR) schemes for arbitrary message lengths and alphabets. When the message and download symbols share the same alphabet, the optimal download cost is the ceiling of the message length divided by the PIR capacity; when the alphabets differ, the optimal cost is shown to be one of three consecutive integers, the largest of which is the corresponding ceiling.
A private information retrieval scheme is a mechanism that allows a user to retrieve any one out of $K$ messages from $N$ non-communicating replicated databases, each of which stores all $K$ messages, without revealing anything about the identity of the desired message index to any individual database. If the size of each message is $L$ bits and the total download required by a PIR scheme from all $N$ databases is $D$ bits, then $D$ is called the download cost and the ratio $L/D$ is called an achievable rate. For fixed $K,N\in\mathbb{N}$, the capacity of PIR, denoted by $C$, is the supremum of achievable rates over all PIR schemes and over all message sizes, and was recently shown to be $C=(1+1/N+1/N^2+\cdots+1/N^{K-1})^{-1}$. In this work, for arbitrary $K, N$, we explore the minimum download cost $D_L$ across all PIR schemes (not restricted to linear schemes) for arbitrary message lengths $L$ under arbitrary choices of alphabet (not restricted to finite fields) for the message and download symbols. If the same $M$-ary alphabet is used for the message and download symbols, then we show that the optimal download cost in $M$-ary symbols is $D_L=\lceil\frac{L}{C}\rceil$. If the message symbols are in an $M$-ary alphabet and the downloaded symbols are in an $M'$-ary alphabet, then we show that the optimal download cost in $M'$-ary symbols satisfies $D_L\in\left\{\left\lceil \frac{L'}{C}\right\rceil,\left\lceil \frac{L'}{C}\right\rceil-1,\left\lceil \frac{L'}{C}\right\rceil-2\right\}$, where $L'= \lceil L \log_{M'} M\rceil$.
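The formulas above can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function names (`pir_capacity`, `download_cost_same_alphabet`, `equivalent_length`, `download_cost_candidates`) are hypothetical, and it assumes $K, N, L \ge 1$ and $M, M' \ge 2$. Exact rational arithmetic is used so that the ceilings are not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def pir_capacity(K, N):
    """C = (1 + 1/N + ... + 1/N^(K-1))^(-1), as an exact fraction."""
    return 1 / sum(Fraction(1, N ** i) for i in range(K))


def download_cost_same_alphabet(L, K, N):
    """D_L = ceil(L / C) when message and download symbols share an alphabet."""
    return ceil(Fraction(L) / pir_capacity(K, N))


def equivalent_length(L, M, M_prime):
    """L' = ceil(L * log_{M'} M), computed with integers only.

    L' is the smallest integer with M'^L' >= M^L (assumes M, M' >= 2).
    """
    target = M ** L
    L_prime, power = 0, 1
    while power < target:
        power *= M_prime
        L_prime += 1
    return L_prime


def download_cost_candidates(L, K, N, M, M_prime):
    """The three values that D_L may take for mismatched alphabets."""
    top = ceil(Fraction(equivalent_length(L, M, M_prime)) / pir_capacity(K, N))
    return [top, top - 1, top - 2]
```

For example, with $K=2$ messages and $N=2$ databases the capacity is $C=2/3$, so a message of $L=3$ symbols over a shared alphabet needs `download_cost_same_alphabet(3, 2, 2) == 5` downloaded symbols. With binary message symbols and quaternary download symbols, `equivalent_length(3, 2, 4) == 2` and the candidate set is `[3, 2, 1]`. The sketch only lists the candidates; it does not determine which of them is optimal.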