Soteris Demetriou

CR
h-index17
17papers
469citations
Novelty48%
AI Score41

17 Papers

CLJun 26, 2022
Data Augmentation for Dementia Detection in Spoken Language

Anna Hlédiková, Dominika Woszczyk, Alican Akman et al.

Dementia is a growing problem as our society ages, and detection methods are often invasive and expensive. Recent deep-learning techniques can offer a faster diagnosis and have shown promising results. However, they require large amounts of labelled data which is not easily available for the task of dementia detection. One effective solution to sparse data problems is data augmentation, though the exact methods need to be selected carefully. To date, there has been no empirical study of data augmentation on Alzheimer's disease (AD) datasets for NLP and speech processing. In this work, we investigate data augmentation techniques for the task of AD detection and perform an empirical evaluation of the different approaches on two kinds of models for both the text and audio domains. We use a transformer-based model for both domains, and SVM and Random Forest models for the text and audio domains, respectively. We generate additional samples using traditional as well as deep learning based methods and show that data augmentation improves performance for both the text- and audio-based models and that such results are comparable to state-of-the-art results on the popular ADReSS set, with carefully crafted architectures and features.

CVApr 29, 2022
Using 3D Shadows to Detect Object Hiding Attacks on Autonomous Vehicle Perception

Zhongyuan Hau, Soteris Demetriou, Emil C. Lupu

Autonomous Vehicles (AVs) are mostly reliant on LiDAR sensors which enable spatial perception of their surroundings and help make driving decisions. Recent works demonstrated attacks that aim to hide objects from AV perception, which can result in severe consequences. 3D shadows, are regions void of measurements in 3D point clouds which arise from occlusions of objects in a scene. 3D shadows were proposed as a physical invariant valuable for detecting spoofed or fake objects. In this work, we leverage 3D shadows to locate obstacles that are hidden from object detectors. We achieve this by searching for void regions and locating the obstacles that cause these shadows. Our proposed methodology can be used to detect an object that has been hidden by an adversary as these objects, while hidden from 3D object detectors, still induce shadow artifacts in 3D point clouds, which we use for obstacle detection. We show that using 3D shadows for obstacle detection can achieve high accuracy in matching shadows to their object and provide precise prediction of an obstacle's distance from the ego-vehicle.

SDJul 3, 2024
Prosody-Driven Privacy-Preserving Dementia Detection

Dominika Woszczyk, Ranya Aloufi, Soteris Demetriou

Speaker embeddings extracted from voice recordings have been proven valuable for dementia detection. However, by their nature, these embeddings contain identifiable information which raises privacy concerns. In this work, we aim to anonymize embeddings while preserving the diagnostic utility for dementia detection. Previous studies rely on adversarial learning and models trained on the target attribute and struggle in limited-resource settings. We propose a novel approach that leverages domain knowledge to disentangle prosody features relevant to dementia from speaker embeddings without relying on a dementia classifier. Our experiments show the effectiveness of our approach in preserving speaker privacy (speaker recognition F1-score .01%) while maintaining high dementia detection score F1-score of 74% on the ADReSS dataset. Our results are also on par with a more constrained classifier-dependent system on ADReSSo (.01% and .66%), and have no impact on synthesized speech naturalness.

LGAug 12, 2025
Understanding Dementia Speech Alignment with Diffusion-Based Image Generation

Mansi, Anastasios Lepipas, Dominika Woszczyk et al.

Text-to-image models generate highly realistic images based on natural language descriptions and millions of users use them to create and share images online. While it is expected that such models can align input text and generated image in the same latent space little has been done to understand whether this alignment is possible between pathological speech and generated images. In this work, we examine the ability of such models to align dementia-related speech information with the generated images and develop methods to explain this alignment. Surprisingly, we found that dementia detection is possible from generated images alone achieving 75% accuracy on the ADReSS dataset. We then leverage explainability methods to show which parts of the language contribute to the detection.

CLJul 12, 2025
ClaritySpeech: Dementia Obfuscation in Speech

Dominika Woszczyk, Ranya Aloufi, Soteris Demetriou

Dementia, a neurodegenerative disease, alters speech patterns, creating communication barriers and raising privacy concerns. Current speech technologies, such as automatic speech transcription (ASR), struggle with dementia and atypical speech, further challenging accessibility. This paper presents a novel dementia obfuscation in speech framework, ClaritySpeech, integrating ASR, text obfuscation, and zero-shot text-to-speech (TTS) to correct dementia-affected speech while preserving speaker identity in low-data environments without fine-tuning. Results show a 16% and 10% drop in mean F1 score across various adversarial settings and modalities (audio, text, fusion) for ADReSS and ADReSSo, respectively, maintaining 50% speaker similarity. We also find that our system improves WER (from 0.73 to 0.08 for ADReSS and 0.15 for ADReSSo) and speech quality from 1.65 to ~2.15, enhancing privacy and accessibility.

CVJun 1, 2024
Adversarial 3D Virtual Patches using Integrated Gradients

Chengzeng You, Zhongyuan Hau, Binbin Xu et al.

LiDAR sensors are widely used in autonomous vehicles to better perceive the environment. However, prior works have shown that LiDAR signals can be spoofed to hide real objects from 3D object detectors. This study explores the feasibility of reducing the required spoofing area through a novel object-hiding strategy based on virtual patches (VPs). We first manually design VPs (MVPs) and show that VP-focused attacks can achieve similar success rates with prior work but with a fraction of the required spoofing area. Then we design a framework Saliency-LiDAR (SALL), which can identify critical regions for LiDAR objects using Integrated Gradients. VPs crafted on critical regions (CVPs) reduce object detection recall by at least 15% compared to our baseline with an approximate 50% reduction in the spoofing area for vehicles of average size.

CROct 16, 2021
Characterizing Improper Input Validation Vulnerabilities of Mobile Crowdsourcing Services

Sojhal Ismail Khan, Dominika Woszczyk, Chengzeng You et al.

Mobile crowdsourcing services (MCS), enable fast and economical data acquisition at scale and find applications in a variety of domains. Prior work has shown that Foursquare and Waze (a location-based and a navigation MCS) are vulnerable to different kinds of data poisoning attacks. Such attacks can be upsetting and even dangerous especially when they are used to inject improper inputs to mislead users. However, to date, there is no comprehensive study on the extent of improper input validation (IIV) vulnerabilities and the feasibility of their exploits in MCSs across domains. In this work, we leverage the fact that MCS interface with their participants through mobile apps to design tools and new methodologies embodied in an end-to-end feedback-driven analysis framework which we use to study 10 popular and previously unexplored services in five different domains. Using our framework we send tens of thousands of API requests with automatically generated input values to characterize their IIV attack surface. Alarmingly, we found that most of them (8/10) suffer from grave IIV vulnerabilities which allow an adversary to launch data poisoning attacks at scale: 7400 spoofed API requests were successful in faking online posts for robberies, gunshots, and other dangerous incidents, faking fitness activities with supernatural speeds and distances among many others. Lastly, we discuss easy to implement and deploy mitigation strategies which can greatly reduce the IIV attack surface and argue for their use as a necessary complementary measure working toward trustworthy mobile crowdsourcing services.

CRJun 27, 2021
Open, Sesame! Introducing Access Control to Voice Services

Dominika Woszczyk, Alvin Lee, Soteris Demetriou

Personal voice assistants (VAs) are shown to be vulnerable against record-and-replay, and other acoustic attacks which allow an adversary to gain unauthorized control of connected devices within a smart home. Existing defenses either lack detection and management capabilities or are too coarse-grained to enable flexible policies on par with other computing interfaces. In this work, we present Sesame, a lightweight framework for edge devices which is the first to enable fine-grained access control of smart-home voice commands. Sesame combines three components: Automatic Speech Recognition, Natural Language Understanding (NLU) and a Policy module. We implemented Sesame on Android devices and demonstrate that our system can enforce security policies for both Alexa and Google Home in real-time (362ms end-to-end inference time), with a lightweight (<25MB) NLU model which exhibits minimal accuracy loss compared to its non-compact equivalent.

CRJun 15, 2021
Temporal Consistency Checks to Detect LiDAR Spoofing Attacks on Autonomous Vehicle Perception

Chengzeng You, Zhongyuan Hau, Soteris Demetriou

LiDAR sensors are used widely in Autonomous Vehicles for better perceiving the environment which enables safer driving decisions. Recent work has demonstrated serious LiDAR spoofing attacks with alarming consequences. In particular, model-level LiDAR spoofing attacks aim to inject fake depth measurements to elicit ghost objects that are erroneously detected by 3D Object Detectors, resulting in hazardous driving decisions. In this work, we explore the use of motion as a physical invariant of genuine objects for detecting such attacks. Based on this, we propose a general methodology, 3D Temporal Consistency Check (3D-TC2), which leverages spatio-temporal information from motion prediction to verify objects detected by 3D Object Detectors. Our preliminary design and implementation of a 3D-TC2 prototype demonstrates very promising performance, providing more than 98% attack detection rate with a recall of 91% for detecting spoofed Vehicle (Car) objects, and is able to achieve real-time detection at 41Hz

LGMay 28, 2021
Quantifying and Localizing Usable Information Leakage from Neural Network Gradients

Fan Mo, Anastasia Borovykh, Mohammad Malekzadeh et al.

In collaborative learning, clients keep their data private and communicate only the computed gradients of the deep neural network being trained on their local data. Several recent attacks show that one can still extract private information from the shared network's gradients compromising clients' privacy. In this paper, to quantify the private information leakage from gradients we adopt usable information theory. We focus on two types of private information: original information in data reconstruction attacks and latent information in attribute inference attacks. Furthermore, a sensitivity analysis over the gradients is performed to explore the underlying cause of information leakage and validate the results of the proposed framework. Finally, we conduct numerical evaluations on six benchmark datasets and four well-known deep models. We measure the impact of training hyperparameters, e.g., batches and epochs, as well as potential defense mechanisms, e.g., dropout and differential privacy. Our proposed framework enables clients to localize and quantify the private information leakage in a layer-wise manner, and enables a better understanding of the sources of information leakage in collaborative learning, which can be used by future studies to benchmark new attacks and defense mechanisms.

CVFeb 7, 2021
Object Removal Attacks on LiDAR-based 3D Object Detectors

Zhongyuan Hau, Kenneth T. Co, Soteris Demetriou et al.

LiDARs play a critical role in Autonomous Vehicles' (AVs) perception and their safe operations. Recent works have demonstrated that it is possible to spoof LiDAR return signals to elicit fake objects. In this work we demonstrate how the same physical capabilities can be used to mount a new, even more dangerous class of attacks, namely Object Removal Attacks (ORAs). ORAs aim to force 3D object detectors to fail. We leverage the default setting of LiDARs that record a single return signal per direction to perturb point clouds in the region of interest (RoI) of 3D objects. By injecting illegitimate points behind the target object, we effectively shift points away from the target objects' RoIs. Our initial results using a simple random point selection strategy show that the attack is effective in degrading the performance of commonly used 3D object detection models.

CROct 17, 2020
Layer-wise Characterization of Latent Information Leakage in Federated Learning

Fan Mo, Anastasia Borovykh, Mohammad Malekzadeh et al.

Training deep neural networks via federated learning allows clients to share, instead of the original data, only the model trained on their data. Prior work has demonstrated that in practice a client's private information, unrelated to the main learning task, can be discovered from the model's gradients, which compromises the promised privacy protection. However, there is still no formal approach for quantifying the leakage of private information via the shared updated model or gradients. In this work, we analyze property inference attacks and define two metrics based on (i) an adaptation of the empirical $\mathcal{V}$-information, and (ii) a sensitivity analysis using Jacobian matrices allowing us to measure changes in the gradients with respect to latent information. We show the applicability of our proposed metrics in localizing private latent information in a layer-wise manner and in two settings where (i) we have or (ii) we do not have knowledge of the attackers' capabilities. We evaluate the proposed metrics for quantifying information leakage on three real-world datasets using three benchmark models.

CRAug 27, 2020
Shadow-Catcher: Looking Into Shadows to Detect Ghost Objects in Autonomous Vehicle 3D Sensing

Zhongyuan Hau, Soteris Demetriou, Luis Muñoz-González et al.

LiDAR-driven 3D sensing allows new generations of vehicles to achieve advanced levels of situation awareness. However, recent works have demonstrated that physical adversaries can spoof LiDAR return signals and deceive 3D object detectors to erroneously detect "ghost" objects. Existing defenses are either impractical or focus only on vehicles. Unfortunately, it is easier to spoof smaller objects such as pedestrians and cyclists, but harder to defend against and can have worse safety implications. To address this gap, we introduce Shadow-Catcher, a set of new techniques embodied in an end-to-end prototype to detect both large and small ghost object attacks on 3D detectors. We characterize a new semantically meaningful physical invariant (3D shadows) which Shadow-Catcher leverages for validating objects. Our evaluation on the KITTI dataset shows that Shadow-Catcher consistently achieves more than 94% accuracy in identifying anomalous shadows for vehicles, pedestrians, and cyclists, while it remains robust to a novel class of strong "invalidation" attacks targeting the defense system. Shadow-Catcher can achieve real-time detection, requiring only between 0.003s-0.021s on average to process an object in a 3D point cloud on commodity hardware and achieves a 2.17x speedup compared to prior work

CRJul 3, 2020
Smartphone Security Behavioral Scale: A New Psychometric Measurement for Smartphone Security

Hsiao-Ying Huang, Soteris Demetriou, Rini Banerjee et al.

Despite widespread use of smartphones, there is no measurement standard targeted at smartphone security behaviors. In this paper we translate a well-known cybersecurity behavioral scale into the smartphone domain and show that we can improve on this translation by following an established psychometrics approach surveying 1011 participants. We design a new 14-item Smartphone Security Behavioral Scale (SSBS) exhibiting high reliability and good fit to a two-component behavioural model based on technical versus social protection strategies. We then demonstrate how SSBS can be applied to measure the influence of mental health issues on smartphone security behavior intentions. We found significant correlations that predict SSBS profiles from three types of MHIs. Conversely, we are able to predict presence of MHIs using SSBS profiles.We obtain prediction AUCs of 72.1% for Internet addiction,75.8% for depression and 66.2% for insomnia.

LGApr 12, 2020
DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments

Fan Mo, Ali Shahin Shamsabadi, Kleomenis Katevas et al.

We present DarkneTZ, a framework that uses an edge device's Trusted Execution Environment (TEE) in conjunction with model partitioning to limit the attack surface against Deep Neural Networks (DNNs). Increasingly, edge devices (smartphones and consumer IoT devices) are equipped with pre-trained DNNs for a variety of applications. This trend comes with privacy risks as models can leak information about their training data through effective membership inference attacks (MIAs). We evaluate the performance of DarkneTZ, including CPU execution time, memory usage, and accurate power consumption, using two small and six large image classification models. Due to the limited memory of the edge device's TEE, we partition model layers into more sensitive layers (to be executed inside the device TEE), and a set of layers to be executed in the untrusted part of the operating system. Our results show that even if a single layer is hidden, we can provide reliable model privacy and defend against state of the art MIAs, with only 3% performance overhead. When fully utilizing the TEE, DarkneTZ provides model protections with up to 10% overhead.

CRMar 28, 2017
Understanding IoT Security Through the Data Crystal Ball: Where We Are Now and Where We Are Going to Be

Nan Zhang, Soteris Demetriou, Xianghang Mi et al.

Inspired by the boom of the consumer IoT market, many device manufacturers, start-up companies and technology giants have jumped into the space. Unfortunately, the exciting utility and rapid marketization of IoT, come at the expense of privacy and security. Industry reports and academic work have revealed many attacks on IoT systems, resulting in privacy leakage, property loss and large-scale availability problems. To mitigate such threats, a few solutions have been proposed. However, it is still less clear what are the impacts they can have on the IoT ecosystem. In this work, we aim to perform a comprehensive study on reported attacks and defenses in the realm of IoT aiming to find out what we know, where the current studies fall short and how to move forward. To this end, we first build a toolkit that searches through massive amount of online data using semantic analysis to identify over 3000 IoT-related articles. Further, by clustering such collected data using machine learning technologies, we are able to compare academic views with the findings from industry and other sources, in an attempt to understand the gaps between them, the trend of the IoT security risks and new problems that need further attention. We systemize this process, by proposing a taxonomy for the IoT ecosystem and organizing IoT security into five problem areas. We use this taxonomy as a beacon to assess each IoT work across a number of properties we define. Our assessment reveals that relevant security and privacy problems are far from solved. We discuss how each proposed solution can be applied to a problem area and highlight their strengths, assumptions and constraints. We stress the need for a security framework for IoT vendors and discuss the trend of shifting security liability to external or centralized entities. We also identify open research problems and provide suggestions towards a secure IoT ecosystem.

CRMar 4, 2017
Guardian of the HAN: Thwarting Mobile Attacks on Smart-Home Devices Using OS-level Situation Awareness

Soteris Demetriou, Nan Zhang, Yeonjoon Lee et al.

A new development of smart-home systems is to use mobile apps to control IoT devices across a Home Area Network (HAN). Those systems tend to rely on the Wi-Fi router to authenticate other devices; as verified in our study, IoT vendors tend to trust all devices connected to the HAN. This treatment exposes them to the attack from malicious apps, particularly those running on authorized phones, which the router does not have information to control, as confirmed in our measurement study. Mitigating this threat cannot solely rely on IoT manufacturers, which may need to change the hardware on the devices to support encryption, increasing the cost of the device, or software developers who we need to trust to implement security correctly. In this work, we present a new technique to control the communication between the IoT devices and their apps in a unified, backward-compatible way. Our approach, called Hanguard, does not require any changes to the IoT devices themselves, the IoT apps or the OS of the participating phones. Hanguard achieves a fine-grained, per-app protection through bridging the OS-level situation awareness and the router-level per-flow control: each phone runs a non-system userspace Monitor app to identify the party that attempts to access the protected IoT device and inform the router through a control plane of its access decision; the router enforces the decision on the data plane after verifying whether the phone should be allowed to talk to the device. Hanguard uses a role-based access control (RBAC) schema which leverages type enforcement (TE) and multi-category security (MCS) primitives to define highly flexible access control rules. We implemented our design over both Android and iOS (>95% of mobile OS market share) and a popular router. Our study shows that Hanguard is both efficient and effective in practice.