Ali Sunyaev

LG
h-index: 35
10 papers
148 citations
Novelty: 22%
AI Score: 22

10 Papers

LG · May 1, 2022
Reward Systems for Trustworthy Medical Federated Learning

Konstantin D. Pandl, Florian Leiser, Scott Thiebes et al.

Federated learning (FL) has received high interest from researchers and practitioners to train machine learning (ML) models for healthcare. Ensuring the trustworthiness of these models is essential. In particular, bias, defined as a disparity in the model's predictive performance across different subgroups, may cause unfairness against specific subgroups, which is an undesired phenomenon for trustworthy ML models. In this research, we address the question of to what extent bias occurs in medical FL and how to prevent excessive bias through reward systems. We first evaluate how to measure the contributions of institutions toward predictive performance and bias in cross-silo medical FL with a Shapley value approximation method. In a second step, we design different reward systems incentivizing contributions toward high predictive performance or low bias. We then propose a combined reward system that incentivizes contributions toward both. We evaluate our work using multiple medical chest X-ray datasets focusing on patient subgroups defined by patient sex and age. Our results show that we can successfully measure contributions toward bias, and an integrated reward system successfully incentivizes contributions toward a well-performing model with low bias. While the partitioning of scans only slightly influences the overall bias, institutions with data predominantly from one subgroup introduce a favorable bias for this subgroup. Our results indicate that reward systems that focus only on predictive performance can transfer model bias against patients to an institutional level. Our work helps researchers and practitioners design reward systems for FL with well-aligned incentives for trustworthy ML.
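As an illustration of the two ingredients named in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes a generic Monte Carlo permutation estimator for Shapley values and a simple max-minus-min subgroup gap as the bias measure; the abstract does not specify either choice. The names `monte_carlo_shapley`, `performance_gap`, and the `utility` callback are hypothetical.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def monte_carlo_shapley(
    players: Sequence[str],
    utility: Callable[[FrozenSet[str]], float],
    num_permutations: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players (institutions)."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            # Marginal contribution of p when joining this coalition.
            totals[p] += current - previous
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


def performance_gap(subgroup_scores: Dict[str, float]) -> float:
    """Assumed bias measure: largest minus smallest subgroup score."""
    return max(subgroup_scores.values()) - min(subgroup_scores.values())


if __name__ == "__main__":
    # Hypothetical additive utility: each institution adds a fixed amount.
    weights = {"A": 0.5, "B": 0.3, "C": 0.2}
    values = monte_carlo_shapley(
        list(weights), lambda c: sum(weights[p] for p in c)
    )
    print(values)
    print(performance_gap({"female": 0.80, "male": 0.90}))
```

In practice the `utility` callback would train or evaluate a federated model on the given coalition of institutions and return either a performance score or a (negated) bias score, which is what makes exact Shapley computation expensive and approximation attractive.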

MA · Sep 28, 2023
Collaborative Distributed Machine Learning

David Jin, Niclas Kannengießer, Sascha Rank et al.

Various collaborative distributed machine learning (CDML) systems, including federated learning systems and swarm learning systems, with different key traits were developed to leverage resources for the development and use of machine learning (ML) models in a confidentiality-preserving way. To meet use case requirements, suitable CDML systems need to be selected. However, comparison between CDML systems to assess their suitability for use cases is often difficult. To support comparison of CDML systems and introduce scientific and practical audiences to the principal functioning and key traits of CDML systems, this work presents a CDML system conceptualization and CDML archetypes.

LG · Mar 3, 2022
Practitioner Motives to Use Different Hyperparameter Optimization Methods

Niclas Kannengießer, Niklas Hasebrook, Felix Morsbach et al.

Programmatic hyperparameter optimization (HPO) methods, such as Bayesian optimization and evolutionary algorithms, are highly sample-efficient in identifying optimal hyperparameter configurations for machine learning (ML) models. However, practitioners frequently use less efficient methods, such as grid search, which can lead to under-optimized models. We suspect this behavior is driven by a range of practitioner-specific motives. Practitioner motives, however, still need to be clarified to enhance user-centered development of HPO tools. To uncover practitioner motives to use different HPO methods, we conducted 20 semi-structured interviews and an online survey with 49 ML experts. By presenting main goals (e.g., increase ML model understanding) and contextual factors affecting practitioners' selection of HPO methods (e.g., available computer resources), this study offers a conceptual foundation to better understand why practitioners use different HPO methods, supporting development of more user-centered and context-adaptive HPO tools in automated ML.

LG · Nov 7, 2024
Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review

Luis M. Lopez-Ramos, Florian Leiser, Aditya Rastogi et al.

The joint implementation of federated learning (FL) and explainable artificial intelligence (XAI) could allow training models from distributed data and explaining their inner workings while preserving essential aspects of privacy. Toward establishing the benefits and tensions associated with their interplay, this scoping review maps the publications that jointly deal with FL and XAI, focusing on publications that reported an interplay between FL and model interpretability or post-hoc explanations. Out of the 37 studies meeting our criteria, only one explicitly and quantitatively analyzed the influence of FL on model explanations, revealing a significant research gap. The aggregation of interpretability metrics across FL nodes created generalized global insights at the expense of node-specific patterns being diluted. Several studies proposed FL algorithms incorporating explanation methods to safeguard the learning process against defaulting or malicious nodes. Studies using established FL libraries or following reporting guidelines are a minority. More quantitative research and structured, transparent practices are needed to fully understand their mutual impact and under which conditions it happens.

CR · Dec 24, 2024
SoK: On the Offensive Potential of AI

Saskia Laura Schröer, Giovanni Apruzzese, Soheil Human et al.

Our society increasingly benefits from Artificial Intelligence (AI). Unfortunately, more and more evidence shows that AI is also used for offensive purposes. Prior works have revealed various examples of use cases in which the deployment of AI can lead to violation of security and privacy objectives. No extant work, however, has been able to draw a holistic picture of the offensive potential of AI. In this SoK paper, we seek to lay the ground for a systematic analysis of the heterogeneous capabilities of offensive AI. In particular, we (i) account for AI risks to both humans and systems while (ii) consolidating and distilling knowledge from academic literature, expert opinions, industrial venues, as well as laypeople, all of which are valuable sources of information on offensive AI. To enable alignment of such diverse sources of knowledge, we devise a common set of criteria reflecting essential technological factors related to offensive AI. With the help of such criteria, we systematically analyze: 95 research papers; 38 InfoSec briefings (from, e.g., BlackHat); the responses of a user study (N=549) entailing individuals with diverse backgrounds and expertise; and the opinion of 12 experts. Our contributions not only reveal concerning ways (some of which were overlooked by prior work) in which AI can be offensively used today, but also represent a foothold to address this threat in the years to come.

LG · May 1, 2023
Scalable Data Point Valuation in Decentralized Learning

Konstantin D. Pandl, Chun-Yin Huang, Ivan Beschastnikh et al.

Existing research on data valuation in federated and swarm learning focuses on valuing client contributions and works best when data across clients is independent and identically distributed (IID). In practice, data is rarely distributed IID. We develop an approach called DDVal for decentralized data valuation, capable of valuing individual data points in federated and swarm learning. DDVal is based on sharing deep features and approximating Shapley values through a k-nearest neighbor approximation method. This allows for novel applications, for example, to simultaneously reward institutions and individuals for providing data to a decentralized machine learning task. Valuing data points through DDVal also allows drawing hierarchical conclusions on the contribution of institutions, and we empirically show that the accuracy of DDVal in estimating institutional contributions is higher than existing Shapley value approximation methods for federated learning. Specifically, it reaches a cosine similarity in approximating Shapley values of 99.969% in both IID and non-IID data distributions across institutions, compared with 99.301% and 97.250% for the best state-of-the-art methods. DDVal scales with the number of data points instead of the number of clients, and has a log-linear complexity. This scales more favorably than existing approaches with an exponential complexity. We show that DDVal is especially efficient in data distribution scenarios with many clients that have few data points, for example, more than 16 clients with 8,000 data points each. By integrating DDVal into a decentralized system, we show that it is not only suitable for centralized federated learning, but also decentralized swarm learning, which aligns well with the research on emerging internet technologies such as web3 to reward users for providing data to algorithms.
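To make the k-nearest neighbor idea concrete, the sketch below computes exact Shapley values of training points for an unweighted KNN classifier and a single test point, using the well-known sort-then-recurse formulation from the KNN data valuation literature. It is not DDVal itself: the abstract does not describe DDVal's exact procedure, the toy feature vectors stand in for shared deep features, and the function name `knn_shapley_single` is hypothetical.

```python
import math
from typing import List, Sequence


def knn_shapley_single(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    test_x: Sequence[float],
    test_y: int,
    k: int,
) -> List[float]:
    """Shapley value of each training point for one test point, where the
    utility of a subset is the fraction of its k nearest neighbors
    (divided by k) that carry the test label."""
    n = len(train_x)
    if n == 0:
        return []
    # Sort training indices from nearest to farthest (the O(n log n) step).
    order = sorted(range(n), key=lambda i: math.dist(train_x[i], test_x))
    match = [1.0 if train_y[i] == test_y else 0.0 for i in order]

    values = [0.0] * n
    # Base case: the farthest point.
    s = match[-1] / n
    values[order[-1]] = s
    # Recurse from the second-farthest point toward the nearest one.
    # `rank` is the 1-indexed position of the point being valued.
    for rank in range(n - 1, 0, -1):
        s += (match[rank - 1] - match[rank]) / k * min(k, rank) / rank
        values[order[rank - 1]] = s
    return values


if __name__ == "__main__":
    # Toy 1-D "features"; labels are class ids.
    features = [[5.0], [0.0], [1.0]]
    labels = [1, 1, 0]
    print(knn_shapley_single(features, labels, test_x=[0.2], test_y=1, k=1))
```

Because the cost is dominated by one sort per test point, this kind of valuation grows with the number of data points rather than with the number of client coalitions, which is consistent with the log-linear scaling the abstract reports.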

LG · Nov 29, 2021
Architecture Matters: Investigating the Influence of Differential Privacy on Neural Network Design

Felix Morsbach, Tobias Dehling, Ali Sunyaev

One barrier to more widespread adoption of differentially private neural networks is the entailed accuracy loss. To address this issue, the relationship between neural network architectures and model accuracy under differential privacy constraints needs to be better understood. As a first step, we test whether extant knowledge on architecture design also holds in the differentially private setting. Our findings show that it does not; architectures that perform well without differential privacy do not necessarily do so with differential privacy. Consequently, extant knowledge on neural network architecture design cannot be seamlessly translated into the differential privacy context. Future research is required to better understand the relationship between neural network architectures and model accuracy to enable better architecture design choices under differential privacy constraints.

CR · Apr 16, 2021
Managing Blockchain Systems and Applications: A Process Model for Blockchain Configurations

Olga Labazova, Erol Kazan, Tobias Dehling et al.

Blockchain is a radical innovation with a unique value proposition that shifts trust from institutions to algorithms. Still, the potential of blockchains remains elusive due to knowledge gaps between computer science research and socio-economic research. Building on information technology governance literature and the theory of coevolution, this study develops a process model for blockchain configurations that captures blockchain capability dimensions and application areas. We demonstrate the applicability of the proposed blockchain configuration process model on four blockchain projects. The proposed blockchain configuration process model assists with the selection and configuration of blockchain systems based on a set of known requirements for a blockchain project. Our findings contribute to research by bridging knowledge gaps between computer science and socio-economic research on blockchain. Specifically, we explore existing blockchain concepts and integrate them in a process model for blockchain configurations.

CR · Jan 29, 2020
On the Convergence of Artificial Intelligence and Distributed Ledger Technology: A Scoping Review and Future Research Agenda

Konstantin D. Pandl, Scott Thiebes, Manuel Schmidt-Kraepelin et al.

Developments in Artificial Intelligence (AI) and Distributed Ledger Technology (DLT) currently lead to lively debates in academia and practice. AI processes data to perform tasks that were previously thought possible only for humans. DLT has the potential to create consensus over data among a group of participants in uncertain environments. In recent research, both technologies are used in similar and even the same systems. Examples include the design of secure distributed ledgers or the creation of allied learning systems distributed across multiple nodes. This can lead to technological convergence, which, in the past, has paved the way for major innovations in information technology. Previous work highlights several potential benefits of the convergence of AI and DLT but only provides a limited theoretical framework to describe upcoming real-world integration cases of both technologies. We aim to contribute by conducting a systematic literature review on previous work and providing rigorously derived future research opportunities. This work helps researchers active in AI or DLT to overcome current limitations in their field, and practitioners to develop systems along with the convergence of both technologies.

CR · Jun 3, 2019
Mind the Gap: Trade-Offs between Distributed Ledger Technology Characteristics

Niclas Kannengießer, Sebastian Lins, Tobias Dehling et al.

When developing peer-to-peer applications on Distributed Ledger Technology (DLT), a crucial decision is the selection of a suitable DLT design (e.g., Ethereum) because it is hard to change the underlying DLT design post hoc. To facilitate the selection of suitable DLT designs, we review DLT characteristics and identify trade-offs between them. Furthermore, we assess how DLT designs account for these trade-offs and we develop archetypes for DLT designs that cater to specific quality requirements. The main purpose of our article is to introduce scientific and practical audiences to the intricacies of DLT designs and to support development of viable applications on DLT.