Erik Buchmann

CR
h-index4
8papers
23citations
Novelty27%
AI Score39

8 Papers

CRMay 6
Long-Term Risks of IoT Devices: The Case of the Smart Fridge

Erik Buchmann

Replacing conventional devices with smart ones has many advantages, e.g., a seamless integration of physical objects into the users digital environment or improved modes of use. However, if a conventional device is replaced by a smart device, its IT components can cause risks, that shorten the life of the device. Such risks stem from different life cycles of embedded soft- and hardware, libraries and protocols used, and the IT ecosystem required. This is problematic, because many conventional household appliances, say, a fridge or TV, have a much longer life span than typical IT equipment. In this paper, we use a systematic approach to identify long-term risks for the operational life span of a smart fridge. In particular, we identify 8 different use cases of three typical smart fridges, e.g., cooling or managing "best before" dates. We model the IT ecosystem needed to run these use cases, and we inspect each asset in this ecosystem for potential long-term risks. We found that even cooling, the most basic use case, is at risk in the long run. This is because the setting cooling parameters may depend on parts of the IT ecosystem that are not under the users control. On the other hand, we did not find any risk that may lead to harm of the category "threatening". Our findings on the smart fridge can be generalized to other smart devices easily.

LGJun 5, 2023
A Privacy-Preserving Federated Learning Approach for Kernel methods

Anika Hannemann, Ali Burak Ünal, Arjhun Swaminathan et al.

It is challenging to implement Kernel methods, if the data sources are distributed and cannot be joined at a trusted third party for privacy reasons. It is even more challenging, if the use case rules out privacy-preserving approaches that introduce noise. An example for such a use case is machine learning on clinical data. To realize exact privacy preserving computation of kernel methods, we propose FLAKE, a Federated Learning Approach for KErnel methods on horizontally distributed data. With FLAKE, the data sources mask their data so that a centralized instance can compute a Gram matrix without compromising privacy. The Gram matrix allows to calculate many kernel matrices, which can be used to train kernel-based machine learning algorithms such as Support Vector Machines. We prove that FLAKE prevents an adversary from learning the input data or the number of input features under a semi-honest threat model. Experiments on clinical and synthetic data confirm that FLAKE is outperforming the accuracy and efficiency of comparable methods. The time needed to mask the data and to compute the Gram matrix is several orders of magnitude less than the time a Support Vector Machine needs to be trained. Thus, FLAKE can be applied to many use cases.

CRMar 23
Cybersecurity Guidance for Smart Homes: A Cross-National Review of Government Sources

Victor Jüttner, Erik Buchmann

Smart homes are increasingly targeted by cyberattacks, yet residents often lack guidance when incidents occur. Since affected residents are likely to seek help from trustworthy sources, this paper asks: What actionable cybersecurity guidance do governments provide to smart home users whose systems have been compromised? To answer this question, we conduct an exploratory, user-centered review of governmental cybersecurity guidance for smart homes across eleven countries to identify and characterize the types of guidance governments provide and to systematize their content. Using a standardized search and screening process, we derive three emergent clusters: incident reporting, general security recommendations, and incident response. Our findings show that governments provide abundant general security advice and accessible reporting channels, but structured incident response guidance tailored to smart homes is rare. Only two sources offer step-by-step recovery guidance for non-expert users, highlighting a gap between preventive advice and post-incident support.

CRMay 4
Noisy Networks, Nosy Neighbors: Simple Privacy Attacks Against Residential Wireless Traffic

Arne Roszeitis, Bartosz Burgiel, Victor Jüttner et al.

Smart devices, such as light bulbs, TVs, fridges, etc., equipped with computing capabilities and wireless communication, are part of everyday life in many households. Previous work has already shown that a passive eavesdropper can derive private information, household routines, etc., from the network traffic of smart devices. However, existing attacks rely on capable adversaries with specialized machine learning expertise, labeled training data and reference devices, leaving it unclear how vulnerable ordinary households are to less sophisticated attackers. In this paper, we investigate the extent to which a ,,casual attacker'' with straightforward IT skills and no specialized cybersecurity or ML tooling can reproduce such privacy attacks. Operating from an adjacent room in a real-world apartment building, we constrain our adversary to use only three off-the-shelf Raspberry Pis, Wireshark, and basic Python scripts. Through a three-week study, we demonstrate that this casual attacker can manually identify devices, recognize user states, track smartphone movements through walls via RSSI triangulation, and successfully extract detailed daily routines, including sleep patterns of guests. Our findings show that smart-home privacy leakage is a threat even from low-resourced, straightforward adversaries, e.g., neighbors.

CYMar 12, 2024
Legally Binding but Unfair? Towards Assessing Fairness of Privacy Policies

Vincent Freiberger, Erik Buchmann

Privacy policies are expected to inform data subjects about their data protection rights and should explain the data controller's data management practices. Privacy policies only fulfill their purpose, if they are correctly interpreted, understood, and trusted by the data subject. This implies that a privacy policy is written in a fair way, e.g., it does not use polarizing terms, does not require a certain education, or does not assume a particular social background. We outline our approach to assessing fairness in privacy policies. We identify from fundamental legal sources and fairness research, how the dimensions informational fairness, representational fairness and ethics / morality are related to privacy policies. We propose options to automatically assess policies in these fairness dimensions, based on text statistics, linguistic methods and artificial intelligence. We conduct initial experiments with German privacy policies to provide evidence that our approach is applicable. Our experiments indicate that there are issues in all three dimensions of fairness. This is important, as future privacy policies may be used in a corpus for legal artificial intelligence models.

HCJan 27, 2025
PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment

Vincent Freiberger, Arthur Fleig, Erik Buchmann

Protecting online privacy requires users to engage with and comprehend website privacy policies, but many policies are difficult and tedious to read. We present PRISMe (Privacy Risk Information Scanner for Me), a novel Large Language Model (LLM)-driven privacy policy assessment tool, which helps users to understand the essence of a lengthy, complex privacy policy while browsing. The tool, a browser extension, integrates a dashboard and an LLM chat. One major contribution is the first rigorous evaluation of such a tool. In a mixed-methods user study (N=22), we evaluate PRISMe's efficiency, usability, understandability of the provided information, and impacts on awareness. While our tool improves privacy awareness by providing a comprehensible quick overview and a quality chat for in-depth discussion, users note issues with consistency and building trust in the tool. From our insights, we derive important design implications to guide future policy analysis tools.

CLJan 2, 2024
Fairness Certification for Natural Language Processing and Large Language Models

Vincent Freiberger, Erik Buchmann

Natural Language Processing (NLP) plays an important role in our daily lives, particularly due to the enormous progress of Large Language Models (LLM). However, NLP has many fairness-critical use cases, e.g., as an expert system in recruitment or as an LLM-based tutor in education. Since NLP is based on human language, potentially harmful biases can diffuse into NLP systems and produce unfair results, discriminate against minorities or generate legal issues. Hence, it is important to develop a fairness certification for NLP approaches. We follow a qualitative research approach towards a fairness certification for NLP. In particular, we have reviewed a large body of literature on algorithmic fairness, and we have conducted semi-structured expert interviews with a wide range of experts from that area. We have systematically devised six fairness criteria for NLP, which can be further refined into 18 sub-categories. Our criteria offer a foundation for operationalizing and testing processes to certify fairness, both from the perspective of the auditor and the audited organization.

LGFeb 22, 2024
Federated Learning in Genetics: Extended Analysis of Accuracy, Performance and Privacy Trade-offs

Anika Hannemann, Jan Ewald, Leo Seeger et al.

Machine learning on large-scale genomic or transcriptomic data is important for many novel health applications. For example, precision medicine tailors medical treatments to patients on the basis of individual biomarkers, cellular and molecular states, etc. However, the data required is sensitive, voluminous, heterogeneous, and typically distributed across locations where dedicated machine learning hardware is not available. Due to privacy and regulatory reasons, it is also problematic to aggregate all data at a trusted third party. Federated learning is a promising solution to this dilemma, because it enables decentralized, collaborative machine learning without exchanging raw data. In this paper, we perform comparative experiments with the federated learning frameworks TensorFlow Federated and Flower. Our test case is the training of disease prognosis and cell type classification models. We train the models with distributed transcriptomic data, considering both data heterogeneity and architectural heterogeneity. We measure model quality, robustness against privacy-enhancing noise and computational performance. We evaluate the resource overhead of a federated system from both client and global perspectives and assess benefits and limitations. Each of the federated learning frameworks has different strengths. However, our experiments confirm that both frameworks can readily build models on transcriptomic data, without transferring personal raw data to a third party with abundant computational resources. This paper is the extended version of https://link.springer.com/chapter/10.1007/978-3-031-63772-8_26.