CR AI | Jan 6, 2025

Proof-of-Data: A Consensus Protocol for Collaborative Intelligence

arXiv:2501.02971v23.61 citations | h-index: 4

Originality: Incremental advance

AI Analysis

This paper addresses the problem of trust and incentives in decentralized collaborative AI. Its solution is novel, though the improvement over existing federated learning approaches is incremental.

The paper tackles the challenge of achieving correct model training and fair reward allocation in decentralized federated learning without a central entity, proposing a blockchain-based framework with a Proof-of-Data consensus protocol that achieves performance close to centralized methods and a fault tolerance ratio of 1/3.

Existing research on federated learning has focused on the setting where learning is coordinated by a centralized entity. Yet the greatest potential of future collaborative intelligence would be unleashed in a more open and democratized setting with no central entity in a dominant role, referred to as "decentralized federated learning". New challenges arise accordingly in achieving both correct model training and fair reward allocation with collective effort among all participating nodes, especially with the threat of Byzantine nodes jeopardising both tasks. In this paper, we propose a blockchain-based decentralized Byzantine fault-tolerant federated learning framework based on a novel Proof-of-Data (PoD) consensus protocol to resolve both the "trust" and "incentive" components. By decoupling model training and contribution accounting, PoD gains not only the learning efficiency and system liveness of asynchronous societal-scale PoW-style learning but also the finality of consensus and reward allocation of epoch-based BFT-style voting. To mitigate false reward claims by data forgery from Byzantine attacks, a privacy-aware data verification and contribution-based reward allocation mechanism is designed to complete the framework. Our evaluation results show that PoD demonstrates performance in model training close to that of the centralized counterpart while achieving trust in consensus and fairness for reward allocation with a fault tolerance ratio of 1/3.
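
To make the "fault tolerance ratio of 1/3" and "contribution-based reward allocation" concrete, here is a minimal Python sketch. It assumes the classic BFT bound (n >= 3f + 1 nodes to tolerate f Byzantine nodes) and a simple proportional reward split. It is not the paper's PoD protocol: the function names, the vote format, and the proportional rule are all hypothetical illustrations.

```python
from collections import Counter
from typing import Dict, Optional


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1 (classic BFT bound, assumed here)."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Smallest vote count q such that any two quorums share an honest node.

    Two quorums overlap in at least 2q - n nodes, so we need 2q - n >= f + 1.
    """
    return (n + max_faulty(n)) // 2 + 1


def finalize_epoch(votes: Dict[str, str], n: int) -> Optional[str]:
    """Return the proposal that reached quorum in this epoch, else None.

    `votes` maps a node id to the proposal id it voted for (hypothetical format).
    """
    if not votes:
        return None
    proposal, count = Counter(votes.values()).most_common(1)[0]
    return proposal if count >= quorum(n) else None


def allocate_rewards(contributions: Dict[str, float], pool: float) -> Dict[str, float]:
    """Split `pool` in proportion to each node's verified contribution.

    The proportional rule is an assumption made for illustration only.
    """
    total = sum(contributions.values())
    if total <= 0:
        return {node: 0.0 for node in contributions}
    return {node: pool * c / total for node, c in contributions.items()}
```

With four nodes, one Byzantine node is tolerated and three matching votes finalize an epoch; a node with three times the verified contribution of another receives three times the reward.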
