Federated Extra-Trees with Privacy Preserving
It addresses data privacy and security issues for organizations under regulations like GDPR, offering a practical solution to the data island problem.
The paper tackles the challenge of training machine learning models on decentralized data while preserving privacy, proposing Federated Extra-Trees with local differential privacy, and validates it with experiments on public datasets showing superior performance.
It is commonly observed that the data are scattered everywhere and difficult to be centralized. The data privacy and security also become a sensitive topic. The laws and regulations such as the European Union's General Data Protection Regulation (GDPR) are designed to protect the public's data privacy. However, machine learning requires a large amount of data for better performance, and the current circumstances put deploying real-life AI applications in an extremely difficult situation. To tackle these challenges, in this paper we propose a novel privacy-preserving federated machine learning model, named Federated Extra-Trees, which applies local differential privacy in the federated trees model. A secure multi-institutional machine learning system was developed to provide superior performance by processing the modeling jointly on different clients without exchanging any raw data. We have validated the accuracy of our work by conducting extensive experiments on public datasets and the efficiency and robustness were also verified by simulating the real-world scenarios. Overall, we presented an extensible, scalable and practical solution to handle the data island problem.