Cédric Gouy‐Pailler

h-index13

16papers

290citations

Novelty53%

AI Score30

Ranked #136,342 of 194,257 authors (top 70%)#30,000 in LG (top 75%)

16 Papers

2.3CRApr 6, 2023

When approximate design for fast homomorphic computation provides differential privacy guarantees

Arnaud Grivet Sébert, Martin Zuber, Oana Stan et al.

While machine learning has become pervasive in as diversified fields as industry, healthcare, social networks, privacy concerns regarding the training data have gained a critical importance. In settings where several parties wish to collaboratively train a common model without jeopardizing their sensitive data, the need for a private training protocol is particularly stringent and implies to protect the data against both the model's end-users and the actors of the training phase. Differential privacy (DP) and cryptographic primitives are complementary popular countermeasures against privacy attacks. Among these cryptographic primitives, fully homomorphic encryption (FHE) offers ciphertext malleability at the cost of time-consuming operations in the homomorphic domain. In this paper, we design SHIELD, a probabilistic approximation algorithm for the argmax operator which is both fast when homomorphically executed and whose inaccuracy is used as a feature to ensure DP guarantees. Even if SHIELD could have other applications, we here focus on one setting and seamlessly integrate it in the SPEED collaborative training framework from "SPEED: Secure, PrivatE, and Efficient Deep learning" (Grivet Sébert et al., 2021) to improve its computational efficiency. After thoroughly describing the FHE implementation of our algorithm and its DP analysis, we present experimental results. To the best of our knowledge, it is the first work in which relaxing the accuracy of an homomorphic calculation is constructively usable as a degree of freedom to achieve better FHE performances.

7.8LGJun 17, 2022

Federated learning with incremental clustering for heterogeneous data

Fabiola Espinoza Castellon, Aurelien Mayoue, Jacques-Henri Sublemontier et al.

Federated learning enables different parties to collaboratively build a global model under the orchestration of a server while keeping the training data on clients' devices. However, performance is affected when clients have heterogeneous data. To cope with this problem, we assume that despite data heterogeneity, there are groups of clients who have similar data distributions that can be clustered. In previous approaches, in order to cluster clients the server requires clients to send their parameters simultaneously. However, this can be problematic in a context where there is a significant number of participants that may have limited availability. To prevent such a bottleneck, we propose FLIC (Federated Learning with Incremental Clustering), in which the server exploits the updates sent by clients during federated training instead of asking them to send their parameters simultaneously. Hence no additional communications between the server and the clients are necessary other than what classical federated learning requires. We empirically demonstrate for various non-IID cases that our approach successfully splits clients into groups following the same data distributions. We also identify the limitations of FLIC by studying its capability to partition clients at the early stages of the federated learning process efficiently. We further address attacks on models as a form of data heterogeneity and empirically show that FLIC is a robust defense against poisoning attacks even when the proportion of malicious clients is higher than 50\%.

8.7CRMay 9, 2022

Protecting Data from all Parties: Combining FHE and DP in Federated Learning

Arnaud Grivet Sébert, Renaud Sirdey, Oana Stan et al.

This paper tackles the problem of ensuring training data privacy in a federated learning context. Relying on Homomorphic Encryption (HE) and Differential Privacy (DP), we propose a framework addressing threats on the privacy of the training data. Notably, the proposed framework ensures the privacy of the training data from all actors of the learning process, namely the data owners and the aggregating server. More precisely, while HE blinds a semi-honest server during the learning protocol, DP protects the data from semi-honest clients participating in the training process as well as end-users with black-box or white-box access to the trained model. In order to achieve this, we provide new theoretical and practical results to allow these techniques to be rigorously combined. In particular, by means of a novel stochastic quantisation operator, we prove DP guarantees in a context where the noise is quantised and bounded due to the use of HE. The paper is concluded by experiments which show the practicality of the entire framework in terms of both model quality (impacted by DP) and computational overhead (impacted by HE).

4.6LGJul 16, 2024

Dataset Dictionary Learning in a Wasserstein Space for Federated Domain Adaptation

Eduardo Fernandes Montesuma, Fabiola Espinoza Castellon, Fred Ngolè Mboula et al.

Multi-Source Domain Adaptation (MSDA) is a challenging scenario where multiple related and heterogeneous source datasets must be adapted to an unlabeled target dataset. Conventional MSDA methods often overlook that data holders may have privacy concerns, hindering direct data sharing. In response, decentralized MSDA has emerged as a promising strategy to achieve adaptation without centralizing clients' data. Our work proposes a novel approach, Decentralized Dataset Dictionary Learning, to address this challenge. Our method leverages Wasserstein barycenters to model the distributional shift across multiple clients, enabling effective adaptation while preserving data privacy. Specifically, our algorithm expresses each client's underlying distribution as a Wasserstein barycenter of public atoms, weighted by private barycentric coordinates. Our approach ensures that the barycentric coordinates remain undisclosed throughout the adaptation process. Extensive experimentation across five visual domain adaptation benchmarks demonstrates the superiority of our strategy over existing decentralized MSDA techniques. Moreover, our method exhibits enhanced robustness to client parallelism while maintaining relative resilience compared to conventional decentralized MSDA methodologies.

9.9LGFeb 22, 2021

On the robustness of randomized classifiers to adversarial examples

Rafael Pinot, Laurent Meunier, Florian Yger et al.

This paper investigates the theory of robustness against adversarial attacks. We focus on randomized classifiers (\emph{i.e.} classifiers that output random variables) and provide a thorough analysis of their behavior through the lens of statistical learning theory and information theory. To this aim, we introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics. Equipped with this definition, we make two new contributions. The first one consists in devising a new upper bound on the adversarial generalization gap of randomized classifiers. More precisely, we devise bounds on the generalization gap and the adversarial gap (\emph{i.e.} the gap between the risk and the worst-case risk under attack) of randomized classifiers. The second contribution presents a yet simple but efficient noise injection method to design robust randomized classifiers. We show that our results are applicable to a wide range of machine learning models under mild hypotheses. We further corroborate our findings with experimental results using deep neural networks on standard image datasets, namely CIFAR-10 and CIFAR-100. All robust models we trained models can simultaneously achieve state-of-the-art accuracy (over $0.82$ clean accuracy on CIFAR-10) and enjoy \emph{guaranteed} robust accuracy bounds ($0.45$ against $\ell_2$ adversaries with magnitude $0.5$ on CIFAR-10).

17.0CRJun 16, 2020Code

SPEED: Secure, PrivatE, and Efficient Deep learning

Arnaud Grivet Sébert, Rafael Pinot, Martin Zuber et al.

We introduce a deep learning framework able to deal with strong privacy constraints. Based on collaborative learning, differential privacy and homomorphic encryption, the proposed approach advances state-of-the-art of private deep learning against a wider range of threats, in particular the honest-but-curious server assumption. We address threats from both the aggregation server, the global model and potentially colluding data holders. Building upon distributed differential privacy and a homomorphic argmax operator, our method is specifically designed to maintain low communication loads and efficiency. The proposed method is supported by carefully crafted theoretical results. We provide differential privacy guarantees from the point of view of any entity having access to the final model, including colluding data holders, as a function of the ratio of data holders who kept their noise secret. This makes our method practical to real-life scenarios where data holders do not trust any third party to process their datasets nor the other data holders. Crucially the computational burden of the approach is maintained reasonable, and, to the best of our knowledge, our framework is the first one to be efficient enough to investigate deep learning applications while addressing such a large scope of threats. To assess the practical usability of our framework, experiments have been carried out on image datasets in a classification context. We present numerical results that show that the learning procedure is both accurate and private.

9.9LGJun 19, 2019

A unified view on differential privacy and robustness to adversarial examples

Rafael Pinot, Florian Yger, Cédric Gouy-Pailler et al.

This short note highlights some links between two lines of research within the emerging topic of trustworthy machine learning: differential privacy and robustness to adversarial examples. By abstracting the definitions of both notions, we show that they build upon the same theoretical ground and hence results obtained so far in one domain can be transferred to the other. More precisely, our analysis is based on two key elements: probabilistic mappings (also called randomized algorithms in the differential privacy community), and the Renyi divergence which subsumes a large family of divergences. We first generalize the definition of robustness against adversarial examples to encompass probabilistic mappings. Then we observe that Renyi-differential privacy (a generalization of differential privacy recently proposed in~\cite{Mironov2017RenyiDP}) and our definition of robustness share several similarities. We finally discuss how can both communities benefit from this connection to transfer technical tools from one research field to the other.

20.0LGFeb 4, 2019

Theoretical evidence for adversarial robustness through randomization

Rafael Pinot, Laurent Meunier, Alexandre Araujo et al.

This paper investigates the theory of robustness against adversarial attacks. It focuses on the family of randomization techniques that consist in injecting noise in the network at inference time. These techniques have proven effective in many contexts, but lack theoretical arguments. We close this gap by presenting a theoretical analysis of these approaches, hence explaining why they perform well in practice. More precisely, we make two new contributions. The first one relates the randomization rate to robustness to adversarial attacks. This result applies for the general family of exponential distributions, and thus extends and unifies the previous approaches. The second contribution consists in devising a new upper bound on the adversarial generalization gap of randomized neural networks. We support our theoretical claims with a set of experiments.

9.2DSMar 10, 2018

Graph-based Clustering under Differential Privacy

Rafael Pinot, Anne Morvan, Florian Yger et al.

In this paper, we present the first differentially private clustering method for arbitrary-shaped node clusters in a graph. This algorithm takes as input only an approximate Minimum Spanning Tree (MST) $\mathcal{T}$ released under weight differential privacy constraints from the graph. Then, the underlying nonconvex clustering partition is successfully recovered from cutting optimal cuts on $\mathcal{T}$. As opposed to existing methods, our algorithm is theoretically well-motivated. Experiments support our theoretical findings.

1.5LGFeb 12, 2018

On the Needs for Rotations in Hypercubic Quantization Hashing

Anne Morvan, Antoine Souloumiac, Krzysztof Choromanski et al.

The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees. In hypercubic quantization, applying a suitable (random or learned) rotation after dimensionality reduction has been experimentally shown to improve the results accuracy in the nearest neighbors search problem. We prove in this paper that the use of these rotations is optimal under some mild assumptions: getting optimal binary sketches is equivalent to applying a rotation uniformizing the diagonal of the covariance matrix between data points. Moreover, for two closed points, the probability to have dissimilar binary sketches is upper bounded by a factor of the initial distance between the data points. Relaxing these assumptions, we obtain a general concentration result for random matrices. We also provide some experiments illustrating these theoretical points and compare a set of algorithms in both the batch and online settings.

3.7LGMay 22, 2017

Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization

Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler et al.

In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary compressed representations -sketches- of high-dimensional real feature vectors. Given an expected code length $c$ and high-dimensional input data points, our algorithm provides a $c$-bits binary code for preserving the distance between the points from the original high-dimensional space. Our algorithm does not require neither the storage of the whole dataset nor a chunk, thus it is fully adaptable to the streaming setting. It also provides low time complexity and convergence guarantees. We demonstrate the quality of our binary sketches through experiments on real data for the nearest neighbors search task in the online setting.

8.8LGOct 19, 2016

Structured adaptive and random spinners for fast machine learning computations

Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski et al.

We consider an efficient computational framework for speeding up several machine learning algorithms with almost no loss of accuracy. The proposed framework relies on projections via structured matrices that we call Structured Spinners, which are formed as products of three structured matrix-blocks that incorporate rotations. The approach is highly generic, i.e. i) structured matrices under consideration can either be fully-randomized or learned, ii) our structured family contains as special cases all previously considered structured schemes, iii) the setting extends to the non-linear case where the projections are followed by non-linear functions, and iv) the method finds numerous applications including kernel approximations via random feature maps, dimensionality reduction algorithms, new fast cross-polytope LSH techniques, deep learning, convex optimization algorithms via Newton sketches, quantization with random projection trees, and more. The proposed framework comes with theoretical guarantees characterizing the capacity of the structured model in reference to its unstructured counterpart and is based on a general theoretical principle that we describe in the paper. As a consequence of our theoretical analysis, we provide the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the HD3HD2HD1 structured matrix [Andoni et al., 2015]. The exhaustive experimental evaluation confirms the accuracy and efficiency of structured spinners for a variety of different applications.

1.2DSSep 29, 2016

Multi-dimensional signal approximation with sparse structured priors using split Bregman iterations

Yoann Isaac, Quentin Barthélemy, Cédric Gouy-Pailler et al.

This paper addresses the structurally-constrained sparse decomposition of multi-dimensional signals onto overcomplete families of vectors, called dictionaries. The contribution of the paper is threefold. Firstly, a generic spatio-temporal regularization term is designed and used together with the standard $\ell_1$ regularization term to enforce a sparse decomposition preserving the spatio-temporal structure of the signal. Secondly, an optimization algorithm based on the split Bregman approach is proposed to handle the associated optimization problem, and its convergence is analyzed. Our well-founded approach yields same accuracy as the other algorithms at the state-of-the-art, with significant gains in terms of convergence speed. Thirdly, the empirical validation of the approach on artificial and real-world problems demonstrates the generality and effectiveness of the method. On artificial problems, the proposed regularization subsumes the Total Variation minimization and recovers the expected decomposition. On the real-world problem of electro-encephalography brainwave decomposition, the approach outperforms similar approaches in terms of P300 evoked potentials detection, using structured spatial priors to guide the decomposition.

3.5LGMay 29, 2016

TripleSpin - a generic compact paradigm for fast machine learning computations

Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler et al.

We present a generic compact computational framework relying on structured random matrices that can be applied to speed up several machine learning algorithms with almost no loss of accuracy. The applications include new fast LSH-based algorithms, efficient kernel computations via random feature maps, convex optimization algorithms, quantization techniques and many more. Certain models of the presented paradigm are even more compressible since they apply only bit matrices. This makes them suitable for deploying on mobile devices. All our findings come with strong theoretical guarantees. In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance"). These guarantees as well as theoretical results for other aforementioned applications follow from the same general theoretical principle that we present in the paper. Our structured family contains as special cases all previously considered structured schemes, including the recently introduced $P$-model. Experimental evaluation confirms the accuracy and efficiency of TripleSpin matrices.

3.3DSMar 21, 2013

Multi-dimensional sparse structured signal approximation using split Bregman iterations

Yoann Isaac, Quentin Barthélemy, Jamal Atif et al.

The paper focuses on the sparse approximation of signals using overcomplete representations, such that it preserves the (prior) structure of multi-dimensional signals. The underlying optimization problem is tackled using a multi-dimensional split Bregman optimization approach. An extensive empirical evaluation shows how the proposed approach compares to the state of the art depending on the signal features.

5.3LGMar 4, 2013

Multivariate Temporal Dictionary Learning for EEG

Quentin Barthélemy, Cédric Gouy-Pailler, Yoann Isaac et al.

This article addresses the issue of representing electroencephalographic (EEG) signals in an efficient way. While classical approaches use a fixed Gabor dictionary to analyze EEG signals, this article proposes a data-driven method to obtain an adapted dictionary. To reach an efficient dictionary learning, appropriate spatial and temporal modeling is required. Inter-channels links are taken into account in the spatial multivariate model, and shift-invariance is used for the temporal model. Multivariate learned kernels are informative (a few atoms code plentiful energy) and interpretable (the atoms can have a physiological meaning). Using real EEG data, the proposed method is shown to outperform the classical multichannel matching pursuit used with a Gabor dictionary, as measured by the representative power of the learned dictionary and its spatial flexibility. Moreover, dictionary learning can capture interpretable patterns: this ability is illustrated on real data, learning a P300 evoked potential.