Thomas Schneider

23 papers

1,411 citations

Novelty 41%

AI Score 30

Ranked #148,084 of 201,326 authors (top 74%); #4,164 in CR (top 57%)

23 Papers

CR Oct 13, 2022

ScionFL: Efficient and Robust Secure Quantized Aggregation

Yaniv Ben-Itzhak, Helen Möllering, Benny Pinkas et al.

Secure aggregation is commonly used in federated learning (FL) to alleviate privacy concerns related to the central aggregator seeing all parameter updates in the clear. Unfortunately, most existing secure aggregation schemes ignore two critical orthogonal research directions that aim to (i) significantly reduce client-server communication and (ii) mitigate the impact of malicious clients. However, both of these additional properties are essential to facilitate cross-device FL with thousands or even millions of (mobile) participants. In this paper, we unite both research directions by introducing ScionFL, the first secure aggregation framework for FL that operates efficiently on quantized inputs and simultaneously provides robustness against malicious clients. Our framework leverages (novel) multi-party computation (MPC) techniques and supports multiple linear (1-bit) quantization schemes, including ones that utilize the randomized Hadamard transform and Kashin's representation. Our theoretical results are supported by extensive evaluations. We show that with no overhead for clients and moderate overhead for the server compared to transferring and processing quantized updates in plaintext, we obtain comparable accuracy for standard FL benchmarks. Moreover, we demonstrate the robustness of our framework against state-of-the-art poisoning attacks.
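To make the idea of aggregating 1-bit quantized updates concrete, here is a minimal plaintext sketch of generic stochastic 1-bit quantization followed by averaging. It is not ScionFL's protocol: the abstract describes performing the aggregation under MPC and supports several quantization schemes, whereas this sketch runs entirely in the clear. The function names (`quantize_1bit`, `dequantize`, `aggregate`) and the min/max scaling are assumptions made for illustration.

```python
import random

def quantize_1bit(values, rng):
    """Stochastically round each value to the vector's min or max (1 bit per entry).

    Illustrative assumption: scales are the per-vector min and max, and the
    probability of rounding up is chosen so the result is unbiased.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    bits = []
    for v in values:
        p_up = (v - lo) / span if span > 0 else 0.0
        bits.append(1 if rng.random() < p_up else 0)
    return bits, lo, hi

def dequantize(bits, lo, hi):
    """Map each bit back to the corresponding scale value."""
    return [hi if b else lo for b in bits]

def aggregate(quantized_updates):
    """Average the dequantized updates of all clients (done in plaintext here)."""
    n = len(quantized_updates)
    dim = len(quantized_updates[0][0])
    total = [0.0] * dim
    for bits, lo, hi in quantized_updates:
        for i, v in enumerate(dequantize(bits, lo, hi)):
            total[i] += v
    return [t / n for t in total]

if __name__ == "__main__":
    rng = random.Random(0)
    update = [0.0, 0.25, 1.0]
    clients = [quantize_1bit(update, rng) for _ in range(1000)]
    print(aggregate(clients))  # close to the original update on average
```

Because each entry is rounded up with probability proportional to its position between the two scales, the average over many clients approaches the unquantized mean, which is why 1-bit updates can still yield comparable accuracy.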

CR Sep 30, 2024

Comments on "Privacy-Enhanced Federated Learning Against Poisoning Adversaries"

Thomas Schneider, Ajith Suresh, Hossein Yalame

In August 2021, Liu et al. (IEEE TIFS'21) proposed a privacy-enhanced framework named PEFL to efficiently detect poisoning behaviours in Federated Learning (FL) using homomorphic encryption. In this article, we show that PEFL does not preserve privacy. In particular, we illustrate that PEFL reveals the entire gradient vector of all users in the clear to one of the participating entities, thereby violating privacy. Furthermore, by pointing out multiple flaws in the proposed system, we show that an immediate fix for this issue is still insufficient to achieve privacy. Note: Although the privacy issues mentioned in Section II were published in January 2023 (Schneider et al., IEEE TIFS'23), several subsequent papers continued to reference Liu et al. (IEEE TIFS'21) as a potential solution for private federated learning. While a few works have acknowledged the privacy concerns we raised, several subsequent works either propagate these errors or adopt the constructions from Liu et al. (IEEE TIFS'21), thereby unintentionally inheriting the same privacy vulnerabilities. We believe this oversight is partly due to the limited visibility of our comments paper at TIFS'23 (Schneider et al., IEEE TIFS'23). Consequently, to prevent the continued propagation of the flawed algorithms in Liu et al. (IEEE TIFS'21) into future research, we also post this article as an ePrint.

CR Aug 18, 2023

Attesting Distributional Properties of Training Data for Machine Learning

Vasisht Duddu, Anudeep Das, Nora Khayata et al.

The success of machine learning (ML) has been accompanied by increased concerns about its trustworthiness. Several jurisdictions are preparing ML regulatory frameworks. One such concern is ensuring that model training data has desirable distributional properties for certain sensitive attributes. For example, draft regulations indicate that model trainers are required to show that training datasets have specific distributional properties, such as reflecting diversity of the population. We propose the notion of property attestation allowing a prover (e.g., model trainer) to demonstrate relevant distributional properties of training data to a verifier (e.g., a customer) without revealing the data. We present an effective hybrid property attestation combining property inference with cryptographic mechanisms.

LG Feb 20, 2023

WW-FL: Secure and Private Large-Scale Federated Learning

Felix Marx, Thomas Schneider, Ajith Suresh et al.

Federated learning (FL) is an efficient approach for large-scale distributed machine learning that promises data privacy by keeping training data on client devices. However, recent research has uncovered vulnerabilities in FL, impacting both security and privacy through poisoning attacks and the potential disclosure of sensitive information in individual model updates as well as the aggregated global model. This paper explores the inadequacies of existing FL protection measures when applied independently, and the challenges of creating effective compositions. Addressing these issues, we propose WW-FL, an innovative framework that combines secure multi-party computation (MPC) with hierarchical FL to guarantee data and global model privacy. One notable feature of WW-FL is its capability to prevent malicious clients from directly poisoning model parameters, confining them to less destructive data poisoning attacks. We furthermore provide a PyTorch-based FL implementation integrated with Meta's CrypTen MPC framework to systematically measure the performance and robustness of WW-FL. Our extensive evaluation demonstrates that WW-FL is a promising solution for secure and private large-scale federated learning.

RO May 28, 2018 (Code)

Long-term Large-scale Mapping and Localization Using maplab

Marcin Dymczyk, Marius Fehr, Thomas Schneider et al.

This paper discusses a large-scale and long-term mapping and localization scenario using the maplab open-source framework. We present a brief overview of the specific algorithms in the system that enable building a consistent map from multiple sessions. We then demonstrate that such a map can be reused even a few months later for efficient 6-DoF localization and that new trajectories can also be registered within the existing 3D model. The datasets presented in this paper are made publicly available.

CR Jun 8, 2021

LaserShark: Establishing Fast, Bidirectional Communication into Air-Gapped Systems

Niclas Kühnapfel, Stefan Preußler, Maximilian Noppel et al.

Physical isolation, so-called air-gapping, is an effective method for protecting security-critical computers and networks. While it might be possible to introduce malicious code through the supply chain, insider attacks, or social engineering, communicating with the outside world is prevented. Different approaches to breach this essential line of defense have been developed based on electromagnetic, acoustic, and optical communication channels. However, all of these approaches are limited in either data rate or distance, and frequently offer only exfiltration of data. We present a novel approach to infiltrate data to and exfiltrate data from air-gapped systems without any additional hardware on-site. By aiming lasers at already built-in LEDs and recording their response, we are the first to enable a long-distance (25 m), bidirectional, and fast (18.2 kbps in and 100 kbps out) covert communication channel. The approach can be used against any office device that operates LEDs at the CPU's GPIO interface.

CR Jan 6, 2021

FLAME: Taming Backdoors in Federated Learning (Extended Version 1)

Thien Duc Nguyen, Phillip Rieger, Huili Chen et al.

Federated Learning (FL) is a collaborative machine learning approach allowing participants to jointly train a model without having to share their private, potentially sensitive local datasets with others. Despite its benefits, FL is vulnerable to backdoor attacks, in which an adversary injects manipulated model updates into the model aggregation process so that the resulting model will provide targeted false predictions for specific adversary-chosen inputs. Proposed defenses against backdoor attacks based on detecting and filtering out malicious model updates consider only very specific and limited attacker models, whereas defenses based on differential privacy-inspired noise injection significantly deteriorate the benign performance of the aggregated model. To address these deficiencies, we introduce FLAME, a defense framework that estimates the sufficient amount of noise to be injected to ensure the elimination of backdoors while maintaining the model performance. To minimize the required amount of noise, FLAME uses a model clustering and weight clipping approach. Our evaluation of FLAME on several datasets stemming from application areas including image classification, word prediction, and IoT intrusion detection demonstrates that FLAME removes backdoors effectively with a negligible impact on the benign performance of the models. Furthermore, following the considerable attention that our research has received after its presentation at USENIX SEC 2022, FLAME has become the subject of numerous investigations proposing diverse attack methodologies in an attempt to circumvent it. As a response to these endeavors, we provide a comprehensive analysis of these attempts. Our findings show that these papers (e.g., 3DFed [36]) have not fully comprehended nor correctly employed the fundamental principles underlying FLAME, i.e., our defense mechanism effectively repels these attempted attacks.
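The clipping and noising steps named in the abstract can be illustrated with a short sketch. This is not FLAME's implementation: it omits the model clustering step entirely, and the choice of the median L2 norm as the clipping bound, the `noise_factor` parameter, and all function names are assumptions made here for illustration.

```python
import math
import random
import statistics

def l2_norm(vec):
    """Euclidean norm of a flat update vector."""
    return math.sqrt(sum(x * x for x in vec))

def clip_updates(updates):
    """Scale every update down so its norm does not exceed the median norm.

    Assumption of this sketch: the clipping bound is the median of all norms.
    """
    norms = [l2_norm(u) for u in updates]
    bound = statistics.median(norms)
    clipped = []
    for u, n in zip(updates, norms):
        scale = min(1.0, bound / n) if n > 0 else 1.0
        clipped.append([x * scale for x in u])
    return clipped, bound

def aggregate_with_noise(updates, noise_factor=0.01, rng=None):
    """Average the clipped updates and add Gaussian noise scaled to the bound."""
    rng = rng or random.Random()
    clipped, bound = clip_updates(updates)
    dim = len(clipped[0])
    mean = [sum(u[i] for u in clipped) / len(clipped) for i in range(dim)]
    sigma = noise_factor * bound  # hypothetical noise scale
    return [m + rng.gauss(0.0, sigma) for m in mean]

if __name__ == "__main__":
    updates = [[3.0, 4.0], [0.3, 0.4], [30.0, 40.0]]  # last one is an outlier
    print(aggregate_with_noise(updates, noise_factor=0.01, rng=random.Random(1)))
```

Clipping limits how far a single oversized update can pull the average, which in turn keeps the amount of noise needed small relative to the benign updates.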

AI Nov 19, 2020

Conservative Extensions in Horn Description Logics with Inverse Roles

Jean Christoph Jung, Carsten Lutz, Mauricio Martel et al.

We investigate the decidability and computational complexity of conservative extensions and the related notions of inseparability and entailment in Horn description logics (DLs) with inverse roles. We consider both query conservative extensions, defined by requiring that the answers to all conjunctive queries are left unchanged, and deductive conservative extensions, which require that the entailed concept inclusions, role inclusions, and functionality assertions do not change. Upper bounds for query conservative extensions are particularly challenging because characterizations in terms of unbounded homomorphisms between universal models, which are the foundation of the standard approach to establishing decidability, fail in the presence of inverse roles. We resort to a characterization that carefully mixes unbounded and bounded homomorphisms and enables a decision procedure that combines tree automata and a mosaic technique. Our main results are that query conservative extensions are 2ExpTime-complete in all DLs between ELI and Horn-ALCHIF and between Horn-ALC and Horn-ALCHIF, and that deductive conservative extensions are 2ExpTime-complete in all DLs between ELI and ELHIF_\bot. The same results hold for inseparability and entailment.

CR Aug 10, 2020

Trustworthy AI Inference Systems: An Industry Research View

Rosario Cammarota, Matthias Schunter, Anand Rajan et al.

In this work, we provide an industry research view for approaching the design, deployment, and operation of trustworthy Artificial Intelligence (AI) inference systems. Such systems provide customers with timely, informed, and customized inferences to aid their decision, while at the same time utilizing appropriate security protection mechanisms for AI models. Additionally, such systems should also use Privacy-Enhancing Technologies (PETs) to protect customers' data at any time. To approach the subject, we start by introducing current trends in AI inference systems. We continue by elaborating on the relationship between Intellectual Property (IP) and private data protection in such systems. Regarding the protection mechanisms, we survey the security and privacy building blocks instrumental in designing, building, deploying, and operating private AI inference systems. For example, we highlight opportunities and challenges in AI systems using trusted execution environments combined with more recent advances in cryptographic techniques to protect data in use. Finally, we outline areas of further development that require the global collective attention of industry, academia, and government researchers to sustain the operation of trustworthy AI inference systems.

CR Jul 5, 2020

Offline Model Guard: Secure and Private ML on Mobile Devices

Sebastian P. Bayerl, Tommaso Frassetto, Patrick Jauernig et al.

Performing machine learning tasks in mobile applications yields a challenging conflict of interest: highly sensitive client information (e.g., speech data) should remain private while also the intellectual property of service providers (e.g., model parameters) must be protected. Cryptographic techniques offer secure solutions for this, but have an unacceptable overhead and moreover require frequent network interaction. In this work, we design a practically efficient hardware-based solution. Specifically, we build Offline Model Guard (OMG) to enable privacy-preserving machine learning on the predominant mobile computing platform ARM - even in offline scenarios. By leveraging a trusted execution environment for strict hardware-enforced isolation from other system components, OMG guarantees privacy of client data, secrecy of provided models, and integrity of processing algorithms. Our prototype implementation on an ARM HiKey 960 development board performs privacy-preserving keyword recognition using TensorFlow Lite for Microcontrollers in real time.

AI Feb 3, 2020

CryptoSPN: Privacy-preserving Sum-Product Network Inference

Amos Treiber, Alejandro Molina, Christian Weinert et al.

AI algorithms, and machine learning (ML) techniques in particular, are increasingly important to individuals' lives, but have caused a range of privacy concerns addressed by, e.g., the European GDPR. Using cryptographic techniques, it is possible to perform inference tasks remotely on sensitive client data in a privacy-preserving way: the server learns nothing about the input data and the model predictions, while the client learns nothing about the ML model (which is often considered intellectual property and might contain traces of sensitive data). While such privacy-preserving solutions are relatively efficient, they are mostly targeted at neural networks, can degrade the predictive accuracy, and usually reveal the network's topology. Furthermore, existing solutions are not readily accessible to ML experts, as prototype implementations are not well-integrated into ML frameworks and require extensive cryptographic knowledge. In this paper, we present CryptoSPN, a framework for privacy-preserving inference of sum-product networks (SPNs). SPNs are a tractable probabilistic graphical model that allows a range of exact inference queries in linear time. Specifically, we show how to efficiently perform SPN inference via secure multi-party computation (SMPC) without accuracy degradation while hiding sensitive client and training information with provable security guarantees. Next to foundations, CryptoSPN encompasses tools to easily transform existing SPNs into privacy-preserving executables. Our empirical results demonstrate that CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.

AS Jul 8, 2019

Privacy-Preserving Speaker Recognition with Cohort Score Normalisation

Andreas Nautsch, Jose Patino, Amos Treiber et al.

In many voice biometrics applications there is a requirement to preserve privacy, not least because of the recently enforced General Data Protection Regulation (GDPR). Though progress in bringing privacy preservation to voice biometrics is lagging behind developments in other biometrics communities, recent years have seen rapid progress, with secure computation mechanisms such as homomorphic encryption being applied successfully to speaker recognition. Even so, the computational overhead incurred by processing speech data in the encrypted domain is substantial. While still tolerable for single biometric comparisons, most state-of-the-art systems perform some form of cohort-based score normalisation, requiring many thousands of biometric comparisons. The computational overhead is then prohibitive, meaning that one must accept either degraded performance (no score normalisation) or potential for privacy violations. This paper proposes the first computationally feasible approach to privacy-preserving cohort score normalisation. Our solution is a cohort pruning scheme based on secure multi-party computation which enables privacy-preserving score normalisation using probabilistic linear discriminant analysis (PLDA) comparisons. The solution operates upon binary voice representations. While the binarisation is lossy in biometric rank-1 performance, it supports computationally-feasible biometric rank-n comparisons in the encrypted domain.
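To show why cohort score normalisation multiplies the number of comparisons, here is a plaintext sketch of a generic normalisation step with optional cohort pruning. It is not the paper's protocol, which operates under secure multi-party computation on binary voice representations. The function name, the `top_n` parameter, and the mean/standard-deviation form of the normalisation are assumptions made for illustration.

```python
import statistics

def normalise_score(raw_score, cohort_scores, top_n=None):
    """Normalise a comparison score against scores obtained from a cohort.

    Each entry of cohort_scores stands for one additional biometric comparison,
    so a large cohort means many comparisons. With top_n set, only the top_n
    highest cohort scores are kept (a simple form of cohort pruning).
    """
    scores = sorted(cohort_scores, reverse=True)
    if top_n is not None:
        scores = scores[:top_n]
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero spread; cannot normalise")
    return (raw_score - mu) / sigma

if __name__ == "__main__":
    cohort = [1.0, 2.0, 3.0]
    print(normalise_score(3.0, cohort))           # full cohort
    print(normalise_score(3.0, cohort, top_n=2))  # pruned cohort
```

Pruning reduces the number of cohort scores that enter the statistics, which is the lever a privacy-preserving scheme can use to keep the encrypted-domain workload tractable.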

CR Jun 11, 2019

A Comment on Privacy-Preserving Scalar Product Protocols as proposed in "SPOC"

Thomas Schneider, Amos Treiber

Privacy-preserving scalar product (PPSP) protocols are an important building block for secure computation tasks in various applications. Lu et al. (TPDS'13) introduced a PPSP protocol that does not rely on cryptographic assumptions and that is used in a wide range of publications to date. In this comment paper, we show that Lu et al.'s protocol is insecure and should not be used. We describe specific attacks against it and, using impossibility results of Impagliazzo and Rudich (STOC'89), show that it is inherently insecure and cannot be fixed without relying on at least some cryptographic assumptions.

RO Apr 1, 2019

Experimental Comparison of Visual-Aided Odometry Methods for Rail Vehicles

Florian Tschopp, Thomas Schneider, Andrew W. Palmer et al.

Today, rail vehicle localization is based on infrastructure-side Balises (beacons) together with on-board odometry to determine whether a rail segment is occupied. Such coarse locking leads to sub-optimal usage of the rail networks. New railway standards propose the use of moving blocks centered around the rail vehicles to increase the capacity of the network. However, this approach requires accurate and robust position and velocity estimation of all vehicles. In this work, we investigate the applicability, challenges and limitations of current visual and visual-inertial motion estimation frameworks for rail applications. An evaluation against RTK-GPS ground truth is performed on multiple datasets recorded in industrial, sub-urban, and forest environments. Our results show that stereo visual-inertial odometry has great potential to provide precise motion estimation because of its complementary sensor modalities, and that it shows superior performance in challenging situations compared to other frameworks.

RO Jan 22, 2019

Observability-aware Self-Calibration of Visual and Inertial Sensors for Ego-Motion Estimation

Thomas Schneider, Mingyang Li, Cesar Cadena et al.

External effects such as shocks and temperature variations affect the calibration of visual-inertial sensor systems, which therefore cannot fully rely on factory calibrations. Re-calibrations performed on short user-collected datasets might yield poor performance since the observability of certain parameters is highly dependent on the motion. Additionally, on resource-constrained systems (e.g., mobile phones), full-batch approaches over longer sessions quickly become prohibitively expensive. In this paper, we approach the self-calibration problem by introducing information-theoretic metrics to assess the information content of trajectory segments, thus allowing selection of the most informative parts of a dataset for calibration purposes. With this approach, we are able to build compact calibration datasets either: (a) by selecting segments from a long session with limited exciting motion or (b) from multiple short sessions where a single session does not necessarily excite all modes sufficiently. Real-world experiments in four different environments show that the proposed method achieves comparable performance to a batch calibration approach, yet at a constant computational complexity that is independent of the duration of the session.

RO Mar 26, 2018

Visual-Inertial Teach and Repeat for Aerial Inspection

Marius Fehr, Thomas Schneider, Marcin Dymczyk et al.

Industrial facilities often require periodic visual inspections of key installations. Examining these points of interest is time-consuming and potentially hazardous, or requires special equipment to reach them. MAVs are ideal platforms to automate this expensive and tedious task. In this work, we present a novel system that enables a human operator to teach a visual inspection task to an autonomous aerial vehicle by simply demonstrating the task using a handheld device. To enable robust operation in confined, GPS-denied environments, the system employs the Google Tango visual-inertial mapping framework as the only source of pose estimates. In a first step, the operator records the desired inspection path and defines the inspection points. The mapping framework then computes a feature-based localization map, which is shared with the robot. After take-off, the robot estimates its pose based on this map and plans a smooth trajectory through the waypoints defined by the operator. Furthermore, the system is able to track the poses of other robots or the operator, localized in the same map, and follow them in real time while keeping a safe distance.

CR Jan 10, 2018

Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications

M. Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko et al.

We present Chameleon, a novel hybrid (mixed-protocol) framework for secure function evaluation (SFE) which enables two parties to jointly compute a function without disclosing their private inputs. Chameleon combines the best aspects of generic SFE protocols with the ones that are based upon additive secret sharing. In particular, the framework performs linear operations in the ring $\mathbb{Z}_{2^l}$ using additively secret shared values and nonlinear operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson protocol. Chameleon departs from the common assumption of additive or linear secret sharing models where three or more parties need to communicate in the online phase: the framework allows two parties with private inputs to communicate in the online phase under the assumption of a third node generating correlated randomness in an offline phase. Almost all of the heavy cryptographic operations are precomputed in an offline phase which substantially reduces the communication overhead. Chameleon is both scalable and significantly more efficient than the ABY framework (NDSS'15) it is based on. Our framework supports signed fixed-point numbers. In particular, Chameleon's vector dot product of signed fixed-point numbers improves the efficiency of mining and classification of encrypted data for algorithms based upon heavy matrix multiplications. Our evaluation of Chameleon on a 5 layer convolutional deep neural network shows 133x and 4.2x faster executions than Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively.
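The split between local linear operations and precomputed correlated randomness can be sketched with two-party additive secret sharing in the ring Z_{2^l}. This is a textbook illustration rather than Chameleon's implementation: it uses the standard Beaver multiplication-triple technique, simulates both parties and the third node in one process, and all names (`share`, `make_triple`, `mul_shares`) and the choice l = 32 are assumptions.

```python
import secrets

L_BITS = 32            # ring bit length l (illustrative choice)
MOD = 1 << L_BITS      # arithmetic is performed modulo 2^l

def share(x):
    """Split x into two additive shares that sum to x modulo 2^l."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD

def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % MOD

def add_shares(x, y):
    """Addition is local: each party adds its own shares, no communication."""
    return (x[0] + y[0]) % MOD, (x[1] + y[1]) % MOD

def make_triple():
    """Offline phase: a third node samples a, b and shares a, b, c = a*b."""
    a = secrets.randbelow(MOD)
    b = secrets.randbelow(MOD)
    return share(a), share(b), share(a * b % MOD)

def mul_shares(x, y, triple):
    """Online phase: multiply shared values using one precomputed triple."""
    a, b, c = triple
    # The parties open d = x - a and e = y - b; both look uniformly random.
    d = reconstruct((x[0] - a[0]) % MOD, (x[1] - a[1]) % MOD)
    e = reconstruct((y[0] - b[0]) % MOD, (y[1] - b[1]) % MOD)
    # x*y = c + d*b + e*a + d*e; the public term d*e is added by party 0 only.
    z0 = (c[0] + d * b[0] + e * a[0] + d * e) % MOD
    z1 = (c[1] + d * b[1] + e * a[1]) % MOD
    return z0, z1

if __name__ == "__main__":
    x, y = share(7), share(6)
    print(reconstruct(*add_shares(x, y)))
    print(reconstruct(*mul_shares(x, y, make_triple())))
```

Negative numbers can be represented by their residue modulo 2^l (two's complement), which is one common way to carry signed values in such a ring.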

RO Nov 28, 2017

maplab: An Open Framework for Research in Visual-inertial Mapping and Localization

Thomas Schneider, Marcin Dymczyk, Marius Fehr et al.

Robust and accurate visual-inertial estimation is crucial to many of today's challenges in robotics. Being able to localize against a prior map and obtain accurate and drift-free pose estimates can push the applicability of such systems even further. Most of the currently available solutions, however, either focus on a single-session use-case or lack localization capabilities or an end-to-end pipeline. We believe that only a complete system, combining state-of-the-art algorithms, scalable multi-session mapping tools, and a flexible user interface, can become an efficient research platform. We therefore present maplab, an open, research-oriented visual-inertial mapping framework for processing and manipulating multi-session maps, written in C++. On the one hand, maplab can be seen as a ready-to-use visual-inertial mapping and localization system. On the other hand, maplab provides the research community with a collection of multi-session mapping tools that include map merging, visual-inertial batch optimization, and loop closure. Furthermore, it includes an online frontend that can create visual-inertial maps and also track a global drift-free pose within a localization map. In this paper, we present the system architecture, five use-cases, and evaluations of the system on public datasets. The source code of maplab is freely available for the benefit of the robotics research community.

RO Sep 16, 2017

Topomap: Topological Mapping and Navigation Based on Visual SLAM Maps

Fabian Blöchliger, Marius Fehr, Marcin Dymczyk et al.

Visual robot navigation within large-scale, semi-structured environments deals with various challenges such as computation-intensive path planning algorithms or insufficient knowledge about traversable spaces. Moreover, many state-of-the-art navigation approaches only operate locally instead of gaining a more conceptual understanding of the planning objective. This limits the complexity of tasks a robot can accomplish and makes it harder to deal with uncertainties that are present in the context of real-time robotics applications. In this work, we present Topomap, a framework which simplifies the navigation task by providing a map to the robot which is tailored for path planning use. This novel approach transforms a sparse feature-based map from a visual Simultaneous Localization And Mapping (SLAM) system into a three-dimensional topological map. This is done in two steps. First, we extract occupancy information directly from the noisy sparse point cloud. Then, we create a set of convex free-space clusters, which are the vertices of the topological map. We show that this representation improves the efficiency of global planning, and we provide a complete derivation of our algorithm. Planning experiments on real-world datasets demonstrate that we achieve performance similar to RRT* with significantly lower computation times and storage requirements. Finally, we test our algorithm on a mobile robotic platform to prove its advantages.

RO Aug 8, 2017

Visual-inertial self-calibration on informative motion segments

Thomas Schneider, Mingyang Li, Michael Burri et al.

Environmental conditions and external effects, such as shocks, have a significant impact on the calibration parameters of visual-inertial sensor systems. Thus, long-term operation of these systems cannot fully rely on factory calibration. Since the observability of certain parameters is highly dependent on the motion of the device, using short data segments at device initialization may yield poor results. When such systems are additionally subject to energy constraints, it is also infeasible to use full-batch approaches on a big dataset, and careful selection of the data is of high importance. In this paper, we present a novel approach for resource-efficient self-calibration of visual-inertial sensor systems. This is achieved by casting the calibration as a segment-based optimization problem that can be run on a small subset of informative segments. Consequently, the computational burden is limited as only a predefined number of segments is used. We also propose an efficient information-theoretic selection to identify such informative motion segments. In evaluations on a challenging dataset, we show that our approach significantly outperforms the state of the art in terms of computational burden while maintaining comparable accuracy.

AI Jun 1, 2016

A Survey of Qualitative Spatial and Temporal Calculi -- Algebraic and Computational Properties

Frank Dylla, Jae Hee Lee, Till Mossakowski et al.

Qualitative Spatial and Temporal Reasoning (QSTR) is concerned with symbolic knowledge representation, typically over infinite domains. The motivations for employing QSTR techniques range from exploiting computational properties that allow efficient reasoning to capturing human cognitive concepts in a computational framework. The notion of a qualitative calculus is one of the most prominent QSTR formalisms. This article presents the first overview of all qualitative calculi developed to date and their computational properties, together with generalized definitions of the fundamental concepts and methods, which now encompass all existing calculi. Moreover, we provide a classification of calculi according to their algebraic properties.

AI May 31, 2013

Algebraic Properties of Qualitative Spatio-Temporal Calculi

Frank Dylla, Till Mossakowski, Thomas Schneider et al.

Qualitative spatial and temporal reasoning is based on so-called qualitative calculi. Algebraic properties of these calculi have several implications for reasoning algorithms. But what exactly is a qualitative calculus? And to what extent do the qualitative calculi proposed meet these demands? The literature provides various answers to the first question but only few facts about the second. In this paper, we identify the minimal requirements for binary spatio-temporal calculi and we discuss the relevance of the corresponding axioms for representation and reasoning. We also analyze existing qualitative calculi and provide a classification involving different notions of a relation algebra.

AI Jul 6, 2012

Syntactic vs. Semantic Locality: How Good Is a Cheap Approximation?

Chiara Del Vescovo, Pavel Klinov, Bijan Parsia et al.

Extracting a subset of a given OWL ontology that captures all the ontology's knowledge about a specified set of terms is a well-understood task. This task can be based, for instance, on locality-based modules (LBMs). These come in two flavours, syntactic and semantic, and a syntactic LBM is known to contain the corresponding semantic LBM. For syntactic LBMs, polynomial extraction algorithms are known, implemented in the OWL API, and being used. In contrast, extracting semantic LBMs involves reasoning, which is intractable for OWL 2 DL, and these algorithms had not been implemented yet for expressive ontology languages. We present the first implementation of semantic LBMs and report on experiments that compare them with syntactic LBMs extracted from real-life ontologies. Our study reveals whether semantic LBMs are worth the additional extraction effort, compared with syntactic LBMs.