Privacy Preserving PCA for Multiparty Modeling
This work addresses privacy concerns in collaborative machine learning among parties that hold partitioned data, though the contribution is incremental, as it adapts existing privacy techniques to PCA.
The paper tackles the problem of performing principal component analysis (PCA) on horizontally partitioned data while preserving privacy, presenting Privacy Preserving PCA (PPPCA) that allows multiparty cooperative execution without sharing plaintext data. Results show that models built using PPPCA achieve the same accuracy as those using centralized plaintext PCA on benchmark and real-world datasets.
In this paper, we present a general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) for horizontally partitioned data. PPPCA accomplishes multiparty cooperative execution of PCA while each party keeps its plaintext data local. We also propose implementations using two techniques, namely homomorphic encryption and secret sharing. The output of PPPCA can be sent directly to the data consumer to build any machine learning model. We conduct experiments on three UCI benchmark datasets and a real-world fraud detection dataset. Results show that a model built upon PPPCA achieves the same accuracy as a model built with PCA on centralized plaintext data.
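To make the secret-sharing idea concrete, the following is a minimal sketch of PCA over horizontally partitioned data. It is not the paper's protocol. It assumes the parties aggregate only additively shared local column sums and scatter matrices, and that whoever reconstructs the aggregate runs an ordinary eigendecomposition. The names `additive_shares`, `secure_sum`, and `multiparty_pca` are hypothetical. Shares are drawn as floating-point noise for readability; a real scheme would work over a finite ring with fixed-point encoding. In this sketch the global mean and covariance are revealed as aggregates, while no party's rows leave that party.

```python
import numpy as np

def additive_shares(value, n_parties, rng, scale=1e3):
    """Split `value` into n_parties random arrays that sum back to `value`."""
    shares = [rng.normal(scale=scale, size=value.shape) for _ in range(n_parties - 1)]
    shares.append(value - sum(shares))
    return shares

def secure_sum(local_values, rng):
    """Sum the parties' private arrays; only shares and partial sums are exchanged."""
    n = len(local_values)
    # share_matrix[i][j] is the share that party i sends to party j.
    share_matrix = [additive_shares(v, n, rng) for v in local_values]
    # Party j adds up the shares it received and publishes that partial sum.
    partial_sums = [sum(share_matrix[i][j] for i in range(n)) for j in range(n)]
    return sum(partial_sums)

def multiparty_pca(local_blocks, k, rng):
    """PCA where each party holds a block of rows over the same feature columns."""
    # Assumption: row counts are not sensitive and are shared in the clear.
    n_total = sum(X.shape[0] for X in local_blocks)
    # Round 1: global mean from securely summed local column sums.
    mean = secure_sum([X.sum(axis=0) for X in local_blocks], rng) / n_total
    # Round 2: global scatter matrix from securely summed local scatter matrices.
    scatter = secure_sum([(X - mean).T @ (X - mean) for X in local_blocks], rng)
    cov = scatter / (n_total - 1)
    # Eigendecomposition of the symmetric covariance; keep the top-k directions.
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, top], cov

rng = np.random.default_rng(0)
data = rng.normal(size=(60, 5))
blocks = [data[:20], data[20:45], data[45:]]  # three parties, same 5 features
mean, components, cov = multiparty_pca(blocks, k=2, rng=rng)
# Each party can then project its own rows locally.
projected = [(X - mean) @ components for X in blocks]
```

Because the local scatter matrices sum exactly to the centralized one, the aggregated covariance matches the plaintext covariance up to floating-point error. This is consistent with the reported result that downstream accuracy is unchanged.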