LGApr 5, 2022
Privacy-Preserving Federated Learning via System Immersion and Random Matrix EncryptionHaleh Hayati, Carlos Murguia, Nathan van de Wouw
Federated learning (FL) has emerged as a privacy solution for collaborative distributed learning where clients train AI models directly on their devices instead of sharing their data with a centralized (potentially adversarial) server. Although FL preserves local data privacy to some extent, it has been shown that information about clients' data can still be inferred from model updates. In recent years, various privacy-preserving schemes have been developed to address this privacy leakage. However, they often provide privacy at the expense of model performance or system efficiency, and balancing these tradeoffs is a crucial challenge when implementing FL schemes. In this manuscript, we propose a Privacy-Preserving Federated Learning (PPFL) framework built on the synergy of matrix encryption and system immersion tools from control theory. The idea is to immerse the learning algorithm, a Stochastic Gradient Decent (SGD), into a higher-dimensional system (the so-called target system) and design the dynamics of the target system so that: the trajectories of the original SGD are immersed/embedded in its trajectories, and it learns on encrypted data (here we use random matrix encryption). Matrix encryption is reformulated at the server as a random change of coordinates that maps original parameters to a higher-dimensional parameter space and enforces that the target SGD converges to an encrypted version of the original SGD optimal solution. The server decrypts the aggregated model using the left inverse of the immersion map. We show that our algorithm provides the same level of accuracy and convergence rate as the standard FL with a negligible computation cost while revealing no information about the clients' data.
CRSep 25, 2024
Immersion and Invariance-based Coding for Privacy-Preserving Federated LearningHaleh Hayati, Carlos Murguia, Nathan van de Wouw
Federated learning (FL) has emerged as a method to preserve privacy in collaborative distributed learning. In FL, clients train AI models directly on their devices rather than sharing data with a centralized server, which can pose privacy risks. However, it has been shown that despite FL's partial protection of local data privacy, information about clients' data can still be inferred from shared model updates during training. In recent years, several privacy-preserving approaches have been developed to mitigate this privacy leakage in FL, though they often provide privacy at the cost of model performance or system efficiency. Balancing these trade-offs presents a significant challenge in implementing FL schemes. In this manuscript, we introduce a privacy-preserving FL framework that combines differential privacy and system immersion tools from control theory. The core idea is to treat the optimization algorithms used in standard FL schemes (e.g., gradient-based algorithms) as a dynamical system that we seek to immerse into a higher-dimensional system (referred to as the target optimization algorithm). The target algorithm's dynamics are designed such that, first, the model parameters of the original algorithm are immersed in its parameters; second, it operates on distorted parameters; and third, it converges to an encoded version of the true model parameters from the original algorithm. These encoded parameters can then be decoded at the server to retrieve the original model parameters. We demonstrate that the proposed privacy-preserving scheme can be tailored to offer any desired level of differential privacy for both local and global model parameters, while maintaining the same accuracy and convergence rate as standard FL algorithms.
CRAug 3, 2021
Finite Horizon Privacy of Stochastic Dynamical Systems: A Synthesis Framework for Dependent Gaussian MechanismsHaleh Hayati, Carlos Murguia, Nathan van de Wouw
We address the problem of synthesizing distorting mechanisms that maximize privacy of stochastic dynamical systems. Information about the system state is obtained through sensor measurements. This data is transmitted to a remote station through an unsecured/public communication network. We aim to keep part of the system state private (a private output); however, because the network is unsecured, adversaries might access sensor data and input signals, which can be used to estimate private outputs. To prevent an accurate estimation, we pass sensor data and input signals through a distorting (privacy-preserving) mechanism before transmission, and send the distorted data to the trusted user. These mechanisms consist of a coordinate transformation and additive dependent Gaussian vectors. We formulate the synthesis of the distorting mechanisms as a convex program, where we minimize the mutual information (our privacy metric) between an arbitrarily large sequence of private outputs and the disclosed distorted data for desired distortion levels -- how different actual and distorted data are allowed to be.