Fabrice Labeau

LG
8 papers
95 citations
Novelty 46%
AI Score 25

8 Papers

LG Nov 6, 2023
Preserving Privacy in GANs Against Membership Inference Attack

Mohammadhadi Shateri, Francisco Messina, Fabrice Labeau et al.

Generative Adversarial Networks (GANs) have been widely used to generate synthetic data when only a small real-world dataset is available or when data holders are unwilling to share their data samples. Recent work has shown that GANs, due to overfitting and memorization, may leak information about their training samples, which makes them vulnerable to Membership Inference Attacks (MIAs). Several defense strategies have been proposed in the literature to mitigate this privacy issue. Unfortunately, defenses based on differential privacy have been shown to substantially reduce the quality of the synthetic data points, while more recent frameworks such as PrivGAN and PAR-GAN are not suitable for small training datasets. In the present work, overfitting in GANs is studied in terms of the discriminator, and a more general measure of overfitting based on the Bhattacharyya coefficient is defined. Then, inspired by Fano's inequality, a first defense mechanism against MIAs is proposed. This framework, which requires only a simple modification of the GAN loss function, is referred to as the maximum entropy GAN (MEGAN) and significantly improves the robustness of GANs to MIAs. As a second defense strategy, a more heuristic model is presented that minimizes the information leaked by generated samples about the training data points. This approach, referred to as the mutual information minimization GAN (MIMGAN), uses a variational representation of the mutual information to minimize the information that a synthetic sample might leak about the whole training data set. Applying the proposed frameworks to commonly used data sets against state-of-the-art MIAs shows that the proposed methods can reduce the accuracy of the adversaries to the level of random guessing, with only a small reduction in the quality of the synthetic data samples.
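
The abstract does not give the exact overfitting measure, so the following is only a minimal sketch of the Bhattacharyya coefficient itself, BC(p, q) = sum_i sqrt(p_i * q_i). As an assumption for illustration, it is applied to histograms of discriminator outputs on training and held-out samples. The `histogram` helper, the bin count, and the score values are all hypothetical and not taken from the paper.

```python
import math


def bhattacharyya_coefficient(p, q):
    """BC(p, q) = sum_i sqrt(p_i * q_i) for two discrete distributions
    defined on the same support. 1 means identical, 0 means disjoint."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


def histogram(scores, bins=10):
    """Normalized histogram of scores in [0, 1] (hypothetical helper)."""
    counts = [0] * bins
    for s in scores:
        idx = min(int(s * bins), bins - 1)  # clamp s == 1.0 into the last bin
        counts[idx] += 1
    total = len(scores)
    return [c / total for c in counts]


# Hypothetical discriminator outputs, invented for illustration only.
train_scores = [0.91, 0.88, 0.95, 0.97, 0.85]
holdout_scores = [0.52, 0.61, 0.47, 0.58, 0.66]

bc = bhattacharyya_coefficient(histogram(train_scores), histogram(holdout_scores))
```

Under this reading, a coefficient near 1 would mean the discriminator scores training and held-out samples similarly, while a value near 0 would mean the two groups are easy to tell apart, which is the situation a membership inference attack can exploit.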

LG Oct 27, 2023
$α$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing

MirHamed Jafarzadeh Asl, Mohammadhadi Shateri, Fabrice Labeau

This paper adopts Arimoto's $α$-Mutual Information as a tunable privacy measure in a privacy-preserving data release setting that aims to prevent the disclosure of private data to adversaries. By fine-tuning the privacy metric, we demonstrate that our approach yields superior models that effectively thwart attackers across various performance dimensions. We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection. The distortion metrics are determined according to the data structure of a specific experiment. We address the resulting problem with a general adversarial deep learning framework that consists of a releaser and an adversary, trained with opposite goals. We conduct empirical experiments on image and time-series data to verify the functionality of $α$-Mutual Information. We evaluate the privacy-utility trade-off of customized models and compare them to mutual information as the baseline measure. Finally, we analyze the consequences of an attacker's access to side information about private data and observe that adapting the privacy measure results in a more refined model than the state of the art in terms of resilience against side information.
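
For reference, Arimoto's α-mutual information for discrete variables is commonly written as the Rényi entropy of X minus Arimoto's conditional Rényi entropy of X given Y. The sketch below evaluates that textbook definition on a small joint probability table. It is not the paper's training objective, which is estimated with neural networks, and the function names and the example tables are assumptions made for illustration.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy H_alpha(X) = log(sum_x p(x)^alpha) / (1 - alpha), in nats."""
    return math.log(sum(px ** alpha for px in p if px > 0)) / (1.0 - alpha)


def arimoto_conditional_entropy(joint, alpha):
    """Arimoto's conditional Renyi entropy, with joint[x][y] = P(X=x, Y=y):
    H_alpha(X|Y) = alpha / (1 - alpha) * log(sum_y (sum_x P(x, y)^alpha)^(1/alpha))."""
    n_y = len(joint[0])
    total = 0.0
    for y in range(n_y):
        inner = sum(row[y] ** alpha for row in joint if row[y] > 0)
        total += inner ** (1.0 / alpha)
    return (alpha / (1.0 - alpha)) * math.log(total)


def arimoto_alpha_mi(joint, alpha):
    """I_alpha(X; Y) = H_alpha(X) - H_alpha(X|Y), for alpha > 0 and alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    p_x = [sum(row) for row in joint]  # marginal distribution of X
    return renyi_entropy(p_x, alpha) - arimoto_conditional_entropy(joint, alpha)


# Illustrative joint tables (not from the paper).
independent = [[0.3 * 0.4, 0.3 * 0.6],
               [0.7 * 0.4, 0.7 * 0.6]]   # X and Y independent
identical = [[0.5, 0.0],
             [0.0, 0.5]]                 # Y is a copy of a fair bit X

mi_independent = arimoto_alpha_mi(independent, alpha=2.0)
mi_identical = arimoto_alpha_mi(identical, alpha=2.0)
```

The parameter `alpha` is the tuning knob the abstract refers to. As `alpha` approaches 1, the measure recovers Shannon mutual information, which is the baseline the paper compares against.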

LG Jun 12, 2021
Adversarial Robustness via Fisher-Rao Regularization

Marine Picot, Francisco Messina, Malik Boudiaf et al.

Adversarial robustness has become a topic of growing interest in machine learning since it was observed that neural networks tend to be brittle. We propose an information-geometric formulation of adversarial defense and introduce FIRE, a new Fisher-Rao regularization for the categorical cross-entropy loss, which is based on the geodesic distance between the softmax outputs corresponding to natural and perturbed input features. Based on the information-geometric properties of the class of softmax distributions, we derive an explicit characterization of the Fisher-Rao Distance (FRD) for the binary and multiclass cases, and draw some interesting properties as well as connections with standard regularization metrics. Furthermore, for a simple linear and Gaussian model, we show that all Pareto-optimal points in the accuracy-robustness region can be reached by FIRE while other state-of-the-art methods fail. Empirically, we evaluate the performance of various classifiers trained with the proposed loss on standard datasets, showing a simultaneous improvement of up to 1% in both clean and robust performance while reducing the training time by 20% over the best-performing methods.
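
As a rough illustration of the geodesic distance mentioned above, the sketch below uses the standard closed form of the Fisher-Rao distance between two categorical distributions, d(p, q) = 2 arccos(sum_i sqrt(p_i * q_i)). It is an assumption that this matches the paper's multiclass characterization. The regularization weight, the adversarial perturbation, and the training loop of FIRE are not reproduced, and the logits are invented.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def fisher_rao_distance(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions:
    d(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i)), which lies in [0, pi]."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    bc = min(1.0, max(0.0, bc))  # guard against rounding just outside [0, 1]
    return 2.0 * math.acos(bc)


# Hypothetical logits for a natural input and a perturbed version of it.
natural = softmax([2.0, 0.5, -1.0])
perturbed = softmax([1.2, 0.9, -0.8])

frd = fisher_rao_distance(natural, perturbed)
```

In a regularized loss of this kind, a term such as `frd` would be added to the cross-entropy with some weight, so that predictions on natural and perturbed inputs are pulled together.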

LG Nov 20, 2020
Deep Directed Information-Based Learning for Privacy-Preserving Smart Meter Data Release

Mohammadhadi Shateri, Francisco Messina, Pablo Piantanida et al.

The explosion of data collection has raised serious privacy concerns among users, since sharing data may also reveal sensitive information. The main goal of a privacy-preserving mechanism is to prevent a malicious third party from inferring sensitive information while keeping the shared data useful. In this paper, we study this problem in the context of time series data, and smart meter (SM) power consumption measurements in particular. Although Mutual Information (MI) between private and released variables has been used as a common information-theoretic privacy measure, it fails to capture the causal time dependencies present in power consumption time series. To overcome this limitation, we introduce Directed Information (DI) as a more meaningful measure of privacy in the considered setting and propose a novel loss function. The optimization is then performed using an adversarial framework in which two Recurrent Neural Networks (RNNs), referred to as the releaser and the adversary, are trained with opposite goals. Our empirical studies on real-world SM measurement data sets, in the worst-case scenario where an attacker has access to the entire training data set used by the releaser, validate the proposed method and show the existing trade-offs between privacy and utility.

SP Mar 11, 2020
Privacy-Preserving Adversarial Network (PPAN) for Continuous non-Gaussian Attributes

Mohammadhadi Shateri, Fabrice Labeau

A privacy-preserving adversarial network (PPAN) was recently proposed as an information-theoretic framework to address the issue of privacy in data sharing. The main idea of this model is to use mutual information as the privacy measure and to adversarially train two deep neural networks, one as the mechanism and the other as the adversary. The performance of the PPAN model on discrete synthetic data, MNIST handwritten digits, and continuous Gaussian data was evaluated against the analytically optimal trade-off. In this study, we evaluate the PPAN model on continuous non-Gaussian data, for which lower and upper bounds of the privacy-preserving problem are used. These bounds involve the Kraskov (KSG) estimator of entropy and mutual information, which is based on the k-th nearest neighbor. In addition to the synthetic data sets, a practical case of hiding the actual electricity consumption in smart meter readings is examined. The results show that for continuous non-Gaussian data, the PPAN model performs within the determined optimal ranges and close to the lower bound.

SP Jun 14, 2019
Real-Time Privacy-Preserving Data Release for Smart Meters

Mohammadhadi Shateri, Francisco Messina, Pablo Piantanida et al.

Smart Meters (SMs) can share users' power consumption with utility providers in almost real time. These fine-grained signals carry sensitive information about users, which has raised serious privacy concerns. In this paper, we focus on real-time privacy threats, i.e., potential attackers that try to infer sensitive information from SM data in an online fashion. We adopt an information-theoretic privacy measure and show that it effectively limits the performance of any attacker. Then, we propose a general formulation to design a privatization mechanism that can provide a target level of privacy by adding a minimal amount of distortion to the SM measurements. In addition, to cope with different applications, a flexible distortion measure is considered. This formulation leads to a general loss function, which is optimized using a deep learning adversarial framework, where two neural networks -- referred to as the releaser and the adversary -- are trained with opposite goals. An exhaustive empirical study is then performed to validate the performance of the proposed approach and compare it with state-of-the-art methods for the occupancy detection privacy problem. Finally, we also investigate the impact of data mismatch between the releaser and the attacker.

SY Jun 25, 2015
Approximate MMSE Estimator for Linear Dynamic Systems with Gaussian Mixture Noise

Leila Pishdad, Fabrice Labeau

In this work we propose an approximate Minimum Mean-Square Error (MMSE) filter for linear dynamic systems with Gaussian Mixture noise. The proposed estimator tracks each component of the Gaussian Mixture (GM) posterior with an individual filter and minimizes the trace of the covariance matrix of the bank of filters, as opposed to minimizing the MSE of individual filters in the commonly used Gaussian sum filter (GSF). Hence, the spread of means in the proposed method is smaller than that of GSF which makes it more robust to removing components. Consequently, lower complexity reduction schemes can be used with the proposed filter without losing estimation accuracy and precision. This is supported through simulations on synthetic data as well as experimental data related to an indoor localization system. Additionally, we show that in two limit cases the state estimation provided by our proposed method converges to that of GSF, and we provide simulation results supporting this in other cases.
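
To make the role of the "spread of means" concrete, the sketch below computes the overall mean and variance of a scalar Gaussian mixture by standard moment matching. This is not the authors' estimator. It only shows that the mixture variance is the weighted sum of the component variances plus a term that grows with the spread of the component means. The function name and the numbers are assumptions for illustration.

```python
def mixture_moments(weights, means, variances):
    """Mean and variance of a scalar Gaussian mixture (moment matching).

    mean = sum_i w_i * m_i
    var  = sum_i w_i * (v_i + (m_i - mean)^2)
    The (m_i - mean)^2 part is the contribution of the spread of means.
    """
    if not (len(weights) == len(means) == len(variances)):
        raise ValueError("weights, means and variances must have equal length")
    mean = sum(w * m for w, m in zip(weights, means))
    var = sum(w * (v + (m - mean) ** 2)
              for w, m, v in zip(weights, means, variances))
    return mean, var


# Two-component posterior with unit component variances (illustrative values).
wide_mean, wide_var = mixture_moments([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
tight_mean, tight_var = mixture_moments([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

With the component means far apart, the matched variance is larger than any single component variance. When the means coincide, it reduces to the component variance, which is one way to see why a smaller spread of means makes dropping or merging components less costly.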

SY Jun 25, 2015
Analytic MMSE Bounds in Linear Dynamic Systems with Gaussian Mixture Noise Statistics

Leila Pishdad, Fabrice Labeau

Using a state-space representation, mobile object positioning problems can be described as dynamic systems, with the state representing the unknown location and the observations being the information gathered from the location sensors. For linear dynamic systems with Gaussian noise, the Kalman filter provides the Minimum Mean-Square Error (MMSE) state estimate by tracking the posterior. Hence, by approximating non-Gaussian noise distributions with Gaussian Mixtures (GM), a bank of Kalman filters, or Gaussian Sum Filter (GSF), can provide the MMSE state estimate. However, the MMSE itself is not analytically tractable. Moreover, the general analytic bounds proposed in the literature are not tractable for GM noise statistics. Hence, in this work, we evaluate the MMSE of linear dynamic systems with GM noise statistics and propose analytic lower and upper bounds for it. We provide two analytic upper bounds, which are the Mean-Square Errors (MSE) of implementable filters, and we show that the tighter upper bound can be selected based on the shape of the GM noise distributions. We also show that for highly multimodal GM noise distributions, the bounds and the MMSE converge. Simulation results support the validity of the proposed bounds and their behavior in the limits.