CR · May 9, 2016

Information Theoretically Secure Databases

arXiv:1605.02646v113.08 citations

Originality: Highly original

AI Analysis

This work addresses the need for secure data storage that does not rely on computational assumptions, offering a foundational approach to information-theoretic security in databases.

The paper tackles the problem of designing a database system whose stored data is information-theoretically secure while it is not being accessed. It proposes a realization based on a re-randomizing database and proves security against adversaries, including those who have installed a virus in the database. The proof rests on a communication/data tradeoff for learning sparse parities.

We introduce the notion of a database system that is information theoretically "Secure In Between Accesses"--a database system with the properties that 1) users can efficiently access their data, and 2) while a user is not accessing their data, the user's information is information theoretically secure to malicious agents, provided that certain requirements on the maintenance of the database are realized. We stress that the security guarantee is information theoretic and everlasting: it relies neither on unproved hardness assumptions, nor on the assumption that the adversary is computationally or storage bounded. We propose a realization of such a database system and prove that a user's stored information, in between times when it is being legitimately accessed, is information theoretically secure both to adversaries who interact with the database in the prescribed manner, as well as to adversaries who have installed a virus that has access to the entire database and communicates with the adversary. The central idea behind our design is the construction of a "re-randomizing database" that periodically changes the internal representation of the information that is being stored. To ensure security, these remappings of the representation of the data must be made sufficiently often in comparison to the amount of information that is being communicated from the database between remappings and the amount of local memory in the database that a virus may preserve during the remappings. The core of the proof of the security guarantee is the following communication/data tradeoff for the problem of learning sparse parities from uniformly random $n$-bit examples: any algorithm that can learn a parity of size $k$ with probability at least $p$ and extracts at most $r$ bits of information from each example, must see at least $p\cdot \left(\frac{n}{r}\right)^{k/2} c_k$ examples.
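To get a feel for the tradeoff stated at the end of the abstract, the following minimal Python sketch generates labeled examples for a sparse parity and evaluates the expression $p\cdot \left(\frac{n}{r}\right)^{k/2} c_k$. The function names, the sample parameters, and the default `c_k = 1.0` are assumptions made for illustration only: the abstract does not give the value of the constant $c_k$, so the printed number shows how the bound scales rather than its actual value.

```python
import random


def parity_example(n, support, rng):
    """Draw one uniformly random n-bit example and label it with the
    parity (XOR) of the bits indexed by `support` (the hidden k-subset)."""
    x = [rng.randrange(2) for _ in range(n)]
    label = sum(x[i] for i in support) % 2
    return x, label


def min_examples_lower_bound(n, k, r, p, c_k=1.0):
    """Evaluate p * (n / r) ** (k / 2) * c_k.

    n: bits per example, k: parity size, r: bits extracted per example,
    p: required success probability. c_k is a placeholder (assumed 1.0);
    its true value is not stated in the abstract.
    """
    if not 0 < r <= n:
        raise ValueError("r must satisfy 0 < r <= n")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return p * (n / r) ** (k / 2) * c_k


if __name__ == "__main__":
    rng = random.Random(0)
    x, y = parity_example(8, [1, 3], rng)
    print(x, y)

    # 0.5 * (1024 / 16) ** 2 = 2048.0 when c_k is taken to be 1
    print(min_examples_lower_bound(n=1024, k=4, r=16, p=0.5))
```

Under these assumed parameters, halving $r$ from 16 to 8 multiplies the expression by $2^{k/2} = 4$. This mirrors the design requirement in the abstract: the less information that leaves the database per remapping, the more examples an adversary needs.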
