ZKBoost: Zero-Knowledge Verifiable Training for XGBoost
The work addresses the need for verifiable training in sensitive applications such as healthcare and finance, though the contribution is incremental, since it builds on existing XGBoost and zero-knowledge proof (ZKP) methods.
The paper addresses the need for cryptographic guarantees of XGBoost model integrity in sensitive settings. It introduces ZKBoost, the first zero-knowledge proof of training protocol for XGBoost, whose fixed-point implementation matches standard XGBoost accuracy within 1% while enabling practical proofs of correct training without revealing data or parameters.
Gradient boosted decision trees, particularly XGBoost, are among the most effective methods for tabular data. As deployment in sensitive settings increases, cryptographic guarantees of model integrity become essential. We present ZKBoost, the first zero-knowledge proof of training (zkPoT) protocol for XGBoost, enabling model owners to prove correct training on a committed dataset without revealing data or parameters. We make three key contributions: (1) a fixed-point XGBoost implementation compatible with arithmetic circuits, enabling instantiation of efficient zkPoT, (2) a generic zkPoT template for XGBoost, which can be instantiated with any general-purpose ZKP backend, and (3) a vector oblivious linear evaluation (VOLE)-based instantiation that resolves challenges in proving nonlinear fixed-point operations. Our fixed-point implementation matches standard XGBoost accuracy within 1% while enabling practical zkPoT on real-world datasets.
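To make the fixed-point idea concrete, the following is a minimal Python sketch of how a nonlinear XGBoost step could be expressed over integers. It is an illustration only, not the paper's implementation: the 16-bit scale, the helper names (`encode`, `fx_mul`, `fx_div`, `check_div`, `leaf_weight`), and the quotient-and-remainder check are assumptions chosen for the example. The only part taken from standard XGBoost is the leaf-weight formula w = -G / (H + lambda), where G and H are the sums of gradients and Hessians in the leaf.

```python
from typing import Tuple

SCALE_BITS = 16          # assumed number of fractional bits (hypothetical choice)
SCALE = 1 << SCALE_BITS  # real value x is stored as the integer round(x * SCALE)


def encode(x: float) -> int:
    """Map a real number to its fixed-point integer representation."""
    return round(x * SCALE)


def decode(v: int) -> float:
    """Map a fixed-point integer back to a float (for inspection only)."""
    return v / SCALE


def fx_mul(a: int, b: int) -> int:
    """Fixed-point multiply: the raw product carries 2 * SCALE_BITS
    fractional bits, so shift right to rescale (floor rounding)."""
    return (a * b) >> SCALE_BITS


def fx_div(a: int, b: int) -> Tuple[int, int]:
    """Fixed-point divide for b > 0. Returns (q, r) such that
    a * SCALE == q * b + r and 0 <= r < b."""
    if b <= 0:
        raise ValueError("divisor must be positive")
    return divmod(a * SCALE, b)


def check_div(a: int, b: int, q: int, r: int) -> bool:
    """Relation a verifier could check instead of recomputing the division:
    one multiplication, one equality, and a range check on the remainder."""
    return a * SCALE == q * b + r and 0 <= r < b


def leaf_weight(G: int, H: int, lam: int) -> int:
    """Standard XGBoost leaf weight w = -G / (H + lambda), in fixed point.
    Assumes H + lam > 0, which holds for convex losses with lam > 0."""
    q, r = fx_div(-G, H + lam)
    assert check_div(-G, H + lam, q, r)
    return q


if __name__ == "__main__":
    w = leaf_weight(encode(-3.0), encode(5.0), encode(1.0))
    print(decode(w))
```

The design point of the sketch is that division, which is awkward inside an arithmetic circuit, is replaced by a relation over integers: the prover supplies `q` and `r`, and only multiplication, equality, and a range check need to be proven. Whether ZKBoost handles division this way is not stated in the abstract.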