CRCL · May 12, 2025

Securing Genomic Data Against Inference Attacks in Federated Learning Environments

arXiv:2505.07188v1 · 3 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses privacy concerns for genomic data in federated learning, but its contribution is incremental: it identifies vulnerabilities without proposing new solutions.

The study assessed the vulnerability of federated learning on genomic data to inference attacks and found that the Gradient-Based Membership Inference Attack was the most effective, with a precision of 0.79 and an F1-score of 0.87, which highlights the privacy risks of current setups.
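The two reported metrics also pin down the attack's recall, which the summary does not state. The short sketch below solves the standard F1 definition for recall; the function name `implied_recall` is introduced here for illustration, and because the reported values are rounded, the result is only approximate.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    return f1 * precision / (2 * precision - f1)


# Values reported for the Gradient-Based MIA (rounded in the source).
recall = implied_recall(precision=0.79, f1=0.87)
print(round(recall, 2))  # prints 0.97
```

A recall near 0.97 alongside a precision of 0.79 suggests the attack flags almost all true members while accepting a noticeable share of false positives.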

Federated Learning (FL) offers a promising framework for collaboratively training machine learning models across decentralized genomic datasets without direct data sharing. While this approach preserves data locality, it remains susceptible to sophisticated inference attacks that can compromise individual privacy. In this study, we simulate a federated learning setup using synthetic genomic data and assess its vulnerability to three key attack vectors: Membership Inference Attack (MIA), Gradient-Based Membership Inference Attack, and Label Inference Attack (LIA). Our experiments reveal that Gradient-Based MIA achieves the highest effectiveness, with a precision of 0.79 and F1-score of 0.87, underscoring the risk posed by gradient exposure in federated updates. Additionally, we visualize comparative attack performance through radar plots and quantify model leakage across clients. The findings emphasize the inadequacy of naïve FL setups in safeguarding genomic privacy and motivate the development of more robust privacy-preserving mechanisms tailored to the unique sensitivity of genomic data.
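To make the gradient-exposure risk concrete, here is a minimal sketch of a gradient-norm threshold attack against a single simulated client. It is not the paper's implementation: the logistic-regression model, the random synthetic genotypes, the median threshold, and every function name are assumptions chosen for illustration. The underlying idea is that samples a model was trained on tend to produce smaller per-sample gradients than unseen samples.

```python
import numpy as np


def sigmoid(z):
    # Clip the logits to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Plain full-batch gradient descent on the logistic loss (no bias term)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w


def per_sample_grad_norms(w, X, y):
    """The per-sample gradient is (p - y) * x, so its norm is |p - y| * ||x||."""
    return np.abs(sigmoid(X @ w) - y) * np.linalg.norm(X, axis=1)


def precision_recall_f1(pred, truth):
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return float(precision), float(recall), float(f1)


rng = np.random.default_rng(0)
n, d = 200, 100  # hypothetical sizes: n members, n non-members, d variants

# Synthetic genotypes: allele counts in {0, 1, 2}, centred around zero.
X = rng.integers(0, 3, size=(2 * n, d)).astype(float) - 1.0
# Random labels, so the model can only fit its training set by memorising it.
y = rng.integers(0, 2, size=2 * n).astype(float)

is_member = np.arange(2 * n) < n      # first half is the client's training set
w = train_logreg(X[:n], y[:n])        # the model sees members only

# Attack: flag a sample as a member when its gradient norm is below the median.
norms = per_sample_grad_norms(w, X, y)
pred_member = norms < np.median(norms)

precision, recall, f1 = precision_recall_f1(pred_member, is_member)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

The sketch deliberately overfits so that the membership signal is easy to see. In a real federated setting an attacker would typically observe aggregated or per-round updates rather than exact per-sample gradients, so the numbers it prints are not comparable to the paper's results.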

