LG CR · Nov 10, 2022

Privacy-Preserving Machine Learning for Collaborative Data Sharing via Auto-encoder Latent Space Embeddings

Ana María Quintero-Ossa, Jesús Solano, Hernán Jarcía, David Zarruk, Alejandro Correa Bahnsen, Carlos Valencia

arXiv:2211.05717v24.62 citations · h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns for organizations that need to share data for collaborative ML tasks, but it appears incremental because it builds on existing auto-encoder and representation learning methods.

The paper tackles collaborative machine learning without sharing sensitive raw data. It uses auto-encoder latent space embeddings as privacy-preserving data representations, so that organizations can improve model performance in multi-source scenarios.

Privacy-preserving machine learning in data-sharing processes is an ever-critical task that enables collaborative training of Machine Learning (ML) models without the need to share the original data sources. It is especially relevant when an organization must assure that sensitive data remains private throughout the whole ML pipeline, i.e., training and inference phases. This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data. Thus, organizations can share the data representation to increase machine learning models' performance in scenarios with more than one data source for a shared predictive downstream task.
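The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical setup in which two organizations hold different features for the same records, each trains a small autoencoder locally, and only the latent codes are pooled for the downstream task. The names `train_autoencoder`, `encode`, `X_a`, `X_b`, the network sizes, and the synthetic data are all illustrative assumptions, and the sketch does not measure how much privacy the embeddings actually provide.

```python
import numpy as np

def train_autoencoder(X, latent_dim, epochs=300, lr=0.5, seed=0):
    """Fit a tiny autoencoder (tanh encoder, linear decoder) by gradient descent.

    Returns the encoder weights and the reconstruction loss per epoch.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ W_enc)            # latent codes
        err = Z @ W_dec - X               # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        grad_out = 2.0 * err / err.size
        grad_W_dec = Z.T @ grad_out
        grad_Z = grad_out @ W_dec.T
        grad_W_enc = X.T @ (grad_Z * (1.0 - Z ** 2))
        W_enc -= lr * grad_W_enc
        W_dec -= lr * grad_W_dec
    return W_enc, losses

def encode(X, W_enc):
    """Map raw features to latent codes; only these would leave the organization."""
    return np.tanh(X @ W_enc)

# Synthetic stand-in data (assumption): two organizations, same 200 records.
rng = np.random.default_rng(42)
n = 200
X_a = rng.normal(size=(n, 6))             # raw features held by organization A
X_b = rng.normal(size=(n, 4))             # raw features held by organization B
y = (X_a[:, 0] + X_b[:, 0] > 0).astype(float)  # shared downstream label

# Each organization trains its own autoencoder on its private data.
W_a, losses_a = train_autoencoder(X_a, latent_dim=3, seed=1)
W_b, losses_b = train_autoencoder(X_b, latent_dim=2, seed=2)

# Only the embeddings are shared and concatenated (plus a bias column).
emb_a = encode(X_a, W_a)
emb_b = encode(X_b, W_b)
shared = np.hstack([emb_a, emb_b, np.ones((n, 1))])

# Downstream model: a least-squares linear classifier on the shared embeddings.
coef, *_ = np.linalg.lstsq(shared, y, rcond=None)
accuracy = float(np.mean((shared @ coef > 0.5) == (y > 0.5)))
print(f"training accuracy on shared embeddings: {accuracy:.2f}")
```

In this sketch the raw matrices `X_a` and `X_b` never leave their owners; the downstream model sees only `shared`. A least-squares classifier is used purely to keep the example short, and any other model could be trained on the same embeddings.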
