Distributed Data Vending on Blockchain
The paper addresses a key challenge for domains such as healthcare, namely enabling secure data exchange and aggregation, but the contribution appears incremental because it builds on existing blockchain and embedding methods.
The paper tackles the trade-off between data retrieval effectiveness and leakage risk in distributed data vending on a blockchain. It proposes a framework that combines data embedding and similarity learning, and it reports empirical results supporting the framework's effectiveness.
Recent advances in blockchain technologies have provided exciting opportunities for decentralized applications. Specifically, blockchain-based smart contracts enable credible transactions without authorized third parties. These properties of smart contracts facilitate distributed data vending, which allows proprietary data to be exchanged securely on a blockchain. Distributed data vending can transform domains such as healthcare by encouraging owners to distribute their data and by enabling large-scale data aggregation. However, one key challenge in distributed data vending is the trade-off between the effectiveness of data retrieval and the risk of leakage from indexing the data: a more descriptive index makes data easier to find but reveals more about its contents. In this paper, we propose a framework for distributed data vending that combines data embedding and similarity learning. We illustrate our framework through a practical scenario of distributing and aggregating electronic medical records on a blockchain. Extensive empirical results demonstrate the effectiveness of our framework.
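To make the retrieval side of the trade-off concrete, the following is a minimal sketch of indexing records by embedding vectors rather than by their text, and ranking them by similarity to a query. It is not the paper's method: the fixed vocabulary `VOCAB`, the bag-of-words `embed` function, the record identifiers, and the use of plain cosine similarity are all assumptions made for illustration, standing in for the learned embedding and similarity models that the abstract describes. The blockchain and smart-contract layer is omitted entirely.

```python
import math

# Hypothetical shared vocabulary (an assumption for this sketch); a real
# system would use a learned embedding model instead of word counts.
VOCAB = ["cough", "diabetes", "femur", "fever", "fracture", "glucose",
         "influenza", "insulin", "surgery", "vaccination"]
TOKEN_TO_DIM = {token: i for i, token in enumerate(VOCAB)}


def embed(text):
    """Map text to a unit-length bag-of-words vector over VOCAB."""
    vec = [0.0] * len(VOCAB)
    for token in text.lower().split():
        if token in TOKEN_TO_DIM:
            vec[TOKEN_TO_DIM[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Leave the all-zero vector unchanged to avoid dividing by zero.
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


def build_index(records):
    """Index each record by its embedding only; the raw text is not stored."""
    return {record_id: embed(text) for record_id, text in records.items()}


def retrieve(index, query, top_k=1):
    """Return the ids of the top_k records most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda rid: cosine(index[rid], q), reverse=True)
    return ranked[:top_k]


# Hypothetical records held by data owners (illustrative text only).
records = {
    "rec-1": "diabetes insulin glucose",
    "rec-2": "femur fracture surgery",
    "rec-3": "influenza fever cough vaccination",
}
index = build_index(records)
print(retrieve(index, "insulin glucose"))  # ['rec-1']
```

In this toy setting the index exposes only vectors, yet a simple word-count embedding like this one is easy to invert. That is exactly the leakage risk the abstract refers to, and it is why an actual design would need embeddings that preserve similarity while revealing less about the underlying records.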