Condensed Unpredictability
This addresses a limitation in cryptographic key derivation, particularly relevant for applications like biometric data where entropy loss is critical, though it is incremental with caveats on circuit size and source requirements.
The paper tackles the problem of deriving a key with high HILL entropy from an unpredictable source without the significant entropy loss inherent in previous methods like Goldreich-Levin, and shows that in the high entropy regime, unpredictability implies HILL entropy, with a condenser reducing unpredictability entropy loss to 3 bits from a source with k bits.
We consider the task of deriving a key with high HILL entropy from an unpredictable source. Previous to this work, the only known way to transform unpredictability into a key that was $\eps$ indistinguishable from having min-entropy was via pseudorandomness, for example by Goldreich-Levin (GL) hardcore bits. This approach has the inherent limitation that from a source with $k$ bits of unpredictability entropy one can derive a key of length (and thus HILL entropy) at most $k-2\log(1/ε)$ bits. In many settings, e.g. when dealing with biometric data, such a $2\log(1/ε)$ bit entropy loss in not an option. Our main technical contribution is a theorem that states that in the high entropy regime, unpredictability implies HILL entropy. The loss in circuit size in this argument is exponential in the entropy gap $d$. To overcome the above restriction, we investigate if it's possible to first "condense" unpredictability entropy and make the entropy gap small. We show that any source with $k$ bits of unpredictability can be condensed into a source of length $k$ with $k-3$ bits of unpredictability entropy. Our condenser simply "abuses" the GL construction and derives a $k$ bit key from a source with $k$ bits of unpredicatibily. The original GL theorem implies nothing when extracting that many bits, but we show that in this regime, GL still behaves like a "condenser" for unpredictability. This result comes with two caveats (1) the loss in circuit size is exponential in $k$ and (2) we require that the source we start with has \emph{no} HILL entropy (equivalently, one can efficiently check if a guess is correct). We leave it as an intriguing open problem to overcome these restrictions or to prove they're inherent.