Fatih Türkmen

h-index13

7papers

36citations

Novelty41%

AI Score36

Ranked #101,387 of 194,257 authors (top 52%)#2,494 in CR (top 37%)

7 Papers

5.8CRApr 27, 2024

Privacy-Preserving Aggregation for Decentralized Learning with Byzantine-Robustness

Ali Reza Ghavamipour, Benjamin Zi Hao Zhao, Oguzhan Ersoy et al.

Decentralized machine learning (DL) has been receiving an increasing interest recently due to the elimination of a single point of failure, present in Federated learning setting. Yet, it is threatened by the looming threat of Byzantine clients who intentionally disrupt the learning process by broadcasting arbitrary model updates to other clients, seeking to degrade the performance of the global model. In response, robust aggregation schemes have emerged as promising solutions to defend against such Byzantine clients, thereby enhancing the robustness of Decentralized Learning. Defenses against Byzantine adversaries, however, typically require access to the updates of other clients, a counterproductive privacy trade-off that in turn increases the risk of inference attacks on those same model updates. In this paper, we introduce SecureDL, a novel DL protocol designed to enhance the security and privacy of DL against Byzantine threats. SecureDL~facilitates a collaborative defense, while protecting the privacy of clients' model updates through secure multiparty computation. The protocol employs efficient computation of cosine similarity and normalization of updates to robustly detect and exclude model updates detrimental to model convergence. By using MNIST, Fashion-MNIST, SVHN and CIFAR-10 datasets, we evaluated SecureDL against various Byzantine attacks and compared its effectiveness with four existing defense mechanisms. Our experiments show that SecureDL is effective even in the case of attacks by the malicious majority (e.g., 80% Byzantine clients) while preserving high training accuracy.

2.3CRApr 27, 2024

Privacy-Preserving, Dropout-Resilient Aggregation in Decentralized Learning

Ali Reza Ghavamipour, Benjamin Zi Hao Zhao, Fatih Turkmen

Decentralized learning (DL) offers a novel paradigm in machine learning by distributing training across clients without central aggregation, enhancing scalability and efficiency. However, DL's peer-to-peer model raises challenges in protecting against inference attacks and privacy leaks. By forgoing central bottlenecks, DL demands privacy-preserving aggregation methods to protect data from 'honest but curious' clients and adversaries, maintaining network-wide privacy. Privacy-preserving DL faces the additional hurdle of client dropout, clients not submitting updates due to connectivity problems or unavailability, further complicating aggregation. This work proposes three secret sharing-based dropout resilience approaches for privacy-preserving DL. Our study evaluates the efficiency, performance, and accuracy of these protocols through experiments on datasets such as MNIST, Fashion-MNIST, SVHN, and CIFAR-10. We compare our protocols with traditional secret-sharing solutions across scenarios, including those with up to 1000 clients. Evaluations show that our protocols significantly outperform conventional methods, especially in scenarios with up to 30% of clients dropout and model sizes of up to $10^6$ parameters. Our approaches demonstrate markedly high efficiency with larger models, higher dropout rates, and extensive client networks, highlighting their effectiveness in enhancing decentralized learning systems' privacy and dropout robustness.

2.7CLOct 15, 2025

Personal Attribute Leakage in Federated Speech Models

Hamdan Al-Ali, Ali Reza Ghavamipour, Tommaso Caselli et al.

Federated learning is a common method for privacy-preserving training of machine learning models. In this paper, we analyze the vulnerability of ASR models to attribute inference attacks in the federated setting. We test a non-parametric white-box attack method under a passive threat model on three ASR models: Wav2Vec2, HuBERT, and Whisper. The attack operates solely on weight differentials without access to raw speech from target speakers. We demonstrate attack feasibility on sensitive demographic and clinical attributes: gender, age, accent, emotion, and dysarthria. Our findings indicate that attributes that are underrepresented or absent in the pre-training data are more vulnerable to such inference attacks. In particular, information about accents can be reliably inferred from all models. Our findings expose previously undocumented vulnerabilities in federated ASR models and offer insights towards improved security.

3.8CRNov 19, 2021

Blockchain for Genomics: A Systematic Literature Review

Mohammed Alghazwi, Fatih Turkmen, Joeri van der Velde et al.

Human genomic data carry unique information about an individual and offer unprecedented opportunities for healthcare. The clinical interpretations derived from large genomic datasets can greatly improve healthcare and pave the way for personalized medicine. Sharing genomic datasets, however, pose major challenges, as genomic data is different from traditional medical data, indirectly revealing information about descendants and relatives of the data owner and carrying valid information even after the owner passes away. Therefore, stringent data ownership and control measures are required when dealing with genomic data. In order to provide secure and accountable infrastructure, blockchain technologies offer a promising alternative to traditional distributed systems. Indeed, the research on blockchain-based infrastructures tailored to genomics is on the rise. However, there is a lack of a comprehensive literature review that summarizes the current state-of-the-art methods in the applications of blockchain in genomics. In this paper, we systematically look at the existing work both commercial and academic, and discuss the major opportunities and challenges. Our study is driven by five research questions that we aim to answer in our review. We also present our projections of future research directions which we hope the researchers interested in the area can benefit from.

3.8CRMay 20, 2021

KotlinDetector: Towards Understanding the Implications of Using Kotlin in Android Applications

Fadi Mohsen, Loran Oosterhaven, Fatih Turkmen

Java programming language has been long used to develop native Android mobile applications. In the last few years many companies and freelancers have switched into using Kotlin partially or entirely. As such, many projects are released as binaries and employ a mix of Java and Kotlin language constructs. Yet, the true security and privacy implications of this shift have not been thoroughly studied. In this work, a state-of-the-art tool, KotlinDetector, is developed to directly extract any Kotlin presence, percentages, and numerous language features from Android Application Packages (APKs) by performing heuristic pattern scanning and invocation tracing. Our evaluation study shows that the tool is considerably efficient and accurate. We further provide a use case in which the output of the KotlinDetector is combined with the output of an existing vulnerability scanner tool called AndroBugs to infer any security and/or privacy implications.

4.4LGMay 14, 2021Code

Privacy-preserving Logistic Regression with Secret Sharing

Ali Reza Ghavamipour, Fatih Turkmen, Xiaoqian Jian

Logistic regression (LR) is a widely used classification method for modeling binary outcomes in many medical data classification tasks. Research that collects and combines datasets from various data custodians and jurisdictions can excessively benefit from the increased statistical power to support their analyzing goals. However, combining data from these various sources creates significant privacy concerns that need to be addressed. In this paper, we proposed secret sharing-based privacy-preserving logistic regression protocols using the Newton-Raphson method. Our proposed approaches are based on secure Multi-Party Computation (MPC) with different security settings to analyze data owned by several data holders. We conducted experiments on both synthetic data and real-world datasets and compared the efficiency and accuracy of them with those of an ordinary logistic regression model. Experimental results demonstrate that the proposed protocols are highly efficient and accurate. This study introduces iterative algorithms to simplify the federated training a logistic regression model in a privacy-preserving manner. Our implementation results show that our improved method can handle large datasets used in securely training a logistic regression from multiple sources.

6.6CRApr 3, 2021Code

Private Computation of Polynomials over Networks

Teimour Hosseinalizadeh, Fatih Turkmen, Nima Monshizadeh

This study concentrates on preserving privacy in a network of agents where each agent seeks to evaluate a general polynomial function over the private values of her immediate neighbors. We provide an algorithm for the exact evaluation of such functions while preserving privacy of the involved agents. The solution is based on a reformulation of polynomials and adoption of two cryptographic primitives: Paillier as a Partially Homomorphic Encryption scheme and multiplicative-additive secret sharing. The provided algorithm is fully distributed, lightweight in communication, robust to dropout of agents, and can accommodate a wide class of functions. Moreover, system theoretic and secure multi-party conditions guaranteeing the privacy preservation of an agent's private values against a set of colluding agents are established. The theoretical developments are complemented by numerical investigations illustrating the accuracy of the algorithm and the resulting computational cost.