Constance Béguier

h-index: 5

Papers: 4

Citations: 195

Novelty: 50%

AI Score: 28

Ranked #148,450 of 194,257 authors (top 76%)

#2,418 in ML (top 72%)

4 Papers

11.1 | LG | Oct 4, 2022

SecureFedYJ: a safe feature Gaussianization protocol for Federated Learning

Tanguy Marchand, Boris Muzellec, Constance Beguier et al.

The Yeo-Johnson (YJ) transformation is a standard parametrized per-feature unidimensional transformation often used to Gaussianize features in machine learning. In this paper, we investigate the problem of applying the YJ transformation in a cross-silo Federated Learning setting under privacy constraints. For the first time, we prove that the YJ negative log-likelihood is in fact convex, which allows us to optimize it with exponential search. We numerically show that the resulting algorithm is more stable than the state-of-the-art approach based on the Brent minimization method. Building on this simple algorithm and Secure Multiparty Computation routines, we propose SecureFedYJ, a federated algorithm that performs a pooled-equivalent YJ transformation without leaking more information than the final fitted parameters do. Quantitative experiments on real data demonstrate that, in addition to being secure, our approach reliably normalizes features across silos as well as if data were pooled, making it a viable approach for safe federated feature Gaussianization.
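As a rough illustration of the fitting problem described in this abstract, the sketch below implements the standard Yeo-Johnson transform and its profile negative log-likelihood (up to additive constants) in plain Python, then minimizes over the parameter with a ternary search. This is not the paper's algorithm: the ternary search is swapped in for the exponential search the authors describe, the data are assumed to be pooled in one place with no secure computation, and the function names and the search bracket `[-5, 5]` are assumptions made for this example. The search relies only on the convexity result stated in the abstract.

```python
import math


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of a single value x with parameter lam."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)
    return -((1.0 - x) ** (2.0 - lam) - 1.0) / (2.0 - lam)


def yj_nll(xs, lam):
    """Profile Gaussian negative log-likelihood of lam, constants dropped."""
    n = len(xs)
    ys = [yeo_johnson(x, lam) for x in xs]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    # Log-Jacobian term: sum of sign(x) * log(1 + |x|).
    log_jacobian = sum(math.copysign(math.log1p(abs(x)), x) for x in xs)
    return 0.5 * n * math.log(var) - (lam - 1.0) * log_jacobian


def fit_lambda(xs, lo=-5.0, hi=5.0, tol=1e-6):
    """Ternary search for the minimizer of a convex function on [lo, hi].

    The bracket is an assumption of this sketch; the minimizer is only
    found if it lies inside it.
    """
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if yj_nll(xs, m1) < yj_nll(xs, m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

For example, `fit_lambda([math.exp(i / 10) - 2.0 for i in range(40)])` fits a right-skewed sample containing both negative and positive values, so both branches of the transform are exercised.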

8.4 | ML | Jan 8, 2021 | Code

Differentially Private Federated Learning for Cancer Prediction

Constance Beguier, Jean Ogier du Terrail, Iqraa Meah et al.

Since 2014, the NIH-funded iDASH (integrating Data for Analysis, Anonymization, SHaring) National Center for Biomedical Computing has hosted yearly competitions on the topic of private computing for genomic data. For one track of the 2020 iteration of this competition, participants were challenged to produce an approach to federated learning (FL) training of genomic cancer prediction models using differential privacy (DP), with submissions ranked according to held-out test accuracy for a given set of DP budgets. More precisely, in this track, we are tasked with training a supervised model for the prediction of breast cancer occurrence from genomic data split between two virtual centers while ensuring data privacy with respect to model transfer via DP. In this article, we present our third-place submission to this competition. During the competition, we encountered two main challenges, both discussed in this article: (i) ensuring correctness of the privacy budget evaluation and (ii) achieving an acceptable trade-off between prediction performance and privacy budget.

21.7 | CV | Aug 17, 2020

Siloed Federated Learning for Multi-Centric Histopathology Datasets

Mathieu Andreux, Jean Ogier du Terrail, Constance Beguier et al.

While federated learning is a promising approach for training deep learning models over distributed sensitive datasets, it presents new challenges for machine learning, especially when applied in the medical domain where multi-centric data heterogeneity is common. Building on previous domain adaptation works, this paper proposes a novel federated learning approach for deep learning architectures via the introduction of local-statistic batch normalization (BN) layers, resulting in collaboratively-trained, yet center-specific models. This strategy improves robustness to data heterogeneity while also reducing the potential for information leaks by not sharing the center-specific layer activation statistics. We benchmark the proposed method on the classification of tumorous histopathology image patches extracted from the Camelyon16 and Camelyon17 datasets. We show that our approach compares favorably to previous state-of-the-art methods, especially for transfer learning across datasets.
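One way to picture the local-statistic idea in this abstract is an aggregation step that averages every shared parameter across centers but leaves each center's batch-normalization statistics untouched. The sketch below is a minimal plain-Python illustration under stated assumptions, not the authors' implementation: model states are hypothetical dictionaries mapping parameter names to lists of floats, and the suffixes `running_mean` and `running_var` (borrowed from PyTorch's naming of BN buffers) are assumed to identify the statistics that stay local.

```python
# Keys ending in these suffixes are treated as center-specific BN statistics
# (an assumption of this sketch, following PyTorch buffer names).
LOCAL_SUFFIXES = ("running_mean", "running_var")


def aggregate_keep_local_stats(client_states):
    """Average shared parameters across clients; keep BN statistics local.

    client_states: list of dicts, one per center, mapping a parameter name
    to a list of floats. Returns one new dict per center.
    """
    n_clients = len(client_states)
    shared = {}
    for key in client_states[0]:
        if key.endswith(LOCAL_SUFFIXES):
            continue  # never leaves the center, never averaged
        columns = zip(*(state[key] for state in client_states))
        shared[key] = [sum(col) / n_clients for col in columns]
    # Each center gets the averaged parameters plus its own statistics.
    return [{**state, **shared} for state in client_states]
```

After aggregation, all centers hold identical learned weights, while `running_mean` and `running_var` entries still differ per center, which is what makes the resulting models center-specific.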

10.3 | ML | Jul 29, 2020

Efficient Sparse Secure Aggregation for Federated Learning

Constance Beguier, Mathieu Andreux, Eric W. Tramel

Federated Learning enables one to jointly train a machine learning model across distributed clients holding sensitive datasets. In real-world settings, this approach is hindered by expensive communication and privacy concerns. Both of these challenges have already been addressed individually, resulting in competing optimisations. In this article, we present one of the first approaches to tackle them simultaneously. More precisely, we adapt compression-based federated techniques to additive secret sharing, leading to an efficient secure aggregation protocol with an adaptable security level. We prove its privacy against malicious adversaries and its correctness in the semi-honest setting. Experiments on deep convolutional networks demonstrate that our secure protocol achieves high accuracy with low communication costs. Compared to prior works on secure aggregation, our protocol has lower communication and computation costs for a similar accuracy.
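The additive secret sharing that this abstract builds on can be sketched in a few lines. The example below is a toy, single-process illustration and not the paper's protocol: it omits the compression and sparsification that the paper combines with secret sharing, it simulates all parties in one function, and the names `share`, `secure_sum`, and the modulus `2**32` are assumptions chosen for the example. Each client splits every integer coordinate of its update into random shares that sum to the true value modulo the modulus, so any strict subset of shares looks uniformly random while the sum of all shares reveals only the aggregate.

```python
import secrets

MODULUS = 2 ** 32  # assumed ring size for this sketch


def share(value, n_shares):
    """Split an integer into n_shares additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(client_vectors):
    """Sum integer vectors so that no party sees another's raw vector.

    Party j only ever receives the j-th share of each coordinate, then
    publishes the sum of the shares it holds.
    """
    n = len(client_vectors)
    dim = len(client_vectors[0])
    inbox = [[0] * dim for _ in range(n)]  # inbox[j]: shares held by party j
    for vec in client_vectors:
        for k, value in enumerate(vec):
            for j, s in enumerate(share(value, n)):
                inbox[j][k] = (inbox[j][k] + s) % MODULUS
    # Combining the published partial sums yields only the total.
    return [sum(inbox[j][k] for j in range(n)) % MODULUS for k in range(dim)]
```

Real-valued model updates would first need to be quantized to integers, and the result is only correct if the true sum stays below the modulus.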