Collision and Preimage Resistance of the Centera Content Address
This paper addresses the reliability of content-addressed storage for the users who depend on it, but the contribution is incremental: it applies existing cryptographic concepts to a specific implementation.
The paper tackles the problem of ensuring uniqueness in Centera's content-addressed storage by analyzing the collision and preimage resistance of its cryptographic hash functions, specifically MD5 and SHA-256, and presents a proof of collision resistance for the Centera Content Address.
Centera uses cryptographic hash functions to address stored objects, creating a new class of data storage referred to as CAS (content-addressed storage). Such hashing provides a means of uniquely identifying data and a global handle to that data, referred to as the Content Address, or CA. However, such a model raises the question: how certain can one be that a given CA is indeed unique? In this paper we describe fundamental concepts of cryptographic hash functions, such as collision resistance, preimage resistance, and second-preimage resistance. We then map these properties to the MD5 and SHA-256 hash algorithms, which are used to generate the Centera Content Address. Finally, we present a proof of the collision resistance of the Centera Content Address.
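To make the uniqueness question concrete, the following is a minimal sketch of content addressing and of the standard birthday-bound estimate for accidental collisions. It is illustrative only: the function names content_address and birthday_collision_probability are hypothetical, the address is assumed to be the plain hex digest of the object's bytes (the actual Centera CA format is not specified here), and the estimate assumes digests behave like uniformly random values, so it says nothing about deliberately constructed collisions.

```python
import hashlib
import math


def content_address(data: bytes, algorithm: str = "sha256") -> str:
    """Hypothetical content address: the hex digest of the object's bytes.

    Assumption: the real Centera CA may carry extra fields or use a
    different encoding; this only shows the hashing idea.
    """
    return hashlib.new(algorithm, data).hexdigest()


def birthday_collision_probability(num_objects: int, digest_bits: int) -> float:
    """Approximate chance that at least two of num_objects share a digest.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(b+1)),
    which assumes digests are uniformly distributed over 2^b values.
    """
    exponent = -num_objects * (num_objects - 1) / 2.0 ** (digest_bits + 1)
    # -expm1(x) equals 1 - exp(x) but stays accurate when x is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    blob = b"example object"
    print("MD5 address:    ", content_address(blob, "md5"))
    print("SHA-256 address:", content_address(blob, "sha256"))
    # Accidental-collision estimate for 10^18 stored objects.
    for bits in (128, 256):
        p = birthday_collision_probability(10**18, bits)
        print(f"{bits}-bit digest: p ~= {p:.3e}")
```

Identical content always maps to the same address, which is what makes the address usable as a global handle; the birthday estimate indicates how the chance of two distinct objects sharing an address by accident shrinks as the digest length grows.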