LG · CR · SP · Jun 9, 2020

XOR Mixup: Privacy-Preserving Data Augmentation for One-Shot Federated Learning

arXiv:2006.05148v1 · 125 citations
Originality: Highly original
AI Analysis

This paper addresses data imbalance and privacy issues in federated learning across distributed devices, representing an incremental improvement that introduces a novel method.

The paper tackles the problem of imbalanced and non-IID data distributions in federated learning by proposing XorMixup, a privacy-preserving data augmentation technique, and XorMixFL, a one-shot federated learning framework, achieving up to 17.6% higher accuracy than Vanilla FL on a non-IID MNIST dataset.

User-generated data distributions are often imbalanced across devices and labels, hampering the performance of federated learning (FL). To remedy this non-independent and identically distributed (non-IID) data problem, in this work we develop a privacy-preserving XOR-based mixup data augmentation technique, coined XorMixup, and thereby propose a novel one-shot FL framework, termed XorMixFL. The core idea is to collect other devices' encoded data samples that are decoded only using each device's own data samples. The decoding provides synthetic-but-realistic samples until an IID dataset is induced, which is then used for model training. Both the encoding and decoding procedures follow bit-wise XOR operations that intentionally distort raw samples, thereby preserving data privacy. Simulation results corroborate that XorMixFL achieves up to 17.6% higher accuracy than Vanilla FL under a non-IID MNIST dataset.
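
The encode/decode idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`xor_samples`, `encode`, `decode`), the flattened 8-bit pixel representation, and the toy values are all assumptions made for illustration. The abstract does not specify how samples are paired or selected, so the pairing here is hypothetical.

```python
from typing import List

# Assumed representation: a sample is a flat list of 8-bit pixel values (0..255).
Sample = List[int]


def xor_samples(a: Sample, b: Sample) -> Sample:
    """Bit-wise XOR of two equal-length samples."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return [p ^ q for p, q in zip(a, b)]


def encode(sample_a: Sample, sample_b: Sample) -> Sample:
    """Hypothetical encoder: mix two raw samples so neither is sent in the clear."""
    return xor_samples(sample_a, sample_b)


def decode(encoded: Sample, own_sample: Sample) -> Sample:
    """Hypothetical decoder: XOR the encoded sample with a locally held sample."""
    return xor_samples(encoded, own_sample)


# Toy 4-pixel "images" (made-up values, not MNIST data).
x_i = [0, 255, 128, 64]       # sample the sender wants to convey
x_j = [10, 20, 30, 40]        # second sample used to mask x_i
x_j_own = [10, 20, 30, 41]    # receiver's own, similar-but-different sample

encoded = encode(x_i, x_j)

# Decoding with the exact masking sample recovers x_i, since (a ^ b) ^ b == a.
exact = decode(encoded, x_j)

# Decoding with a different local sample gives a distorted, synthetic result.
approx = decode(encoded, x_j_own)
```

Because XOR is its own inverse, decoding with the exact masking sample would recover the original. A receiver that holds only a similar sample obtains a distorted version instead, which is consistent with the abstract's description of synthetic-but-realistic samples produced by intentionally distorting raw data.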

