Ruby B. Lee

h-index: 35

10 papers

299 citations

Novelty: 52%

AI Score: 26

Ranked #161,249 of 194,257 authors (top 83%); #4,535 in CR (top 67%)

10 Papers

3.8 CR Aug 20, 2021

CloudShield: Real-time Anomaly Detection in the Cloud

Zecheng He, Ruby B. Lee

In cloud computing, it is desirable if suspicious activities can be detected by automatic anomaly detection systems. Although anomaly detection has been investigated in the past, it remains unsolved in cloud computing. Challenges are: characterizing the normal behavior of a cloud server, distinguishing between benign and malicious anomalies (attacks), and preventing alert fatigue due to false alarms. We propose CloudShield, a practical and generalizable real-time anomaly and attack detection system for cloud computing. CloudShield uses a general, pretrained deep learning model with different cloud workloads, to predict the normal behavior and provide real-time and continuous detection by examining the model reconstruction error distributions. Once an anomaly is detected, to reduce alert fatigue, CloudShield automatically distinguishes between benign programs, known attacks, and zero-day attacks, by examining the prediction error distributions. We evaluate the proposed CloudShield on representative cloud benchmarks. Our evaluation shows that CloudShield, using model pretraining, can apply to a wide scope of cloud workloads. In particular, we observe that CloudShield can detect the recently proposed speculative execution attacks, e.g., Spectre and Meltdown attacks, in milliseconds. Furthermore, we show that CloudShield accurately differentiates and prioritizes known attacks, and potential zero-day attacks, from benign programs. Thus, it significantly reduces false alarms by up to 99.0%.

6.6 CR Mar 11, 2021

Smartphone Impostor Detection with Behavioral Data Privacy and Minimalist Hardware Support

Guangyuan Hu, Zecheng He, Ruby B. Lee

Impostors are attackers who take over a smartphone and gain access to the legitimate user's confidential and private information. This paper proposes a defense-in-depth mechanism to detect impostors quickly with simple Deep Learning algorithms, which can achieve better detection accuracy than the best prior work which used Machine Learning algorithms requiring computation of multiple features. Different from previous work, we then consider protecting the privacy of a user's behavioral (sensor) data by not exposing it outside the smartphone. For this scenario, we propose a Recurrent Neural Network (RNN) based Deep Learning algorithm that uses only the legitimate user's sensor data to learn his/her normal behavior. We propose to use Prediction Error Distribution (PED) to enhance the detection accuracy. We also show how a minimalist hardware module, dubbed SID for Smartphone Impostor Detector, can be designed and integrated into smartphones for self-contained impostor detection. Experimental results show that SID can support real-time impostor detection, at a very low hardware cost and energy consumption, compared to other RNN accelerators.

2.9 CR Feb 10, 2020

Smartphone Impostor Detection with Built-in Sensors and Deep Learning

Guangyuan Hu, Zecheng He, Ruby Lee

In this paper, we show that sensor-based impostor detection with deep learning can achieve excellent impostor detection accuracy at lower hardware cost compared to past work on sensor-based user authentication (the inverse problem) which used more conventional machine learning algorithms. While these methods use other smartphone users' sensor data to build the (user, non-user) classification models, we go further to show that using only the legitimate user's sensor data can still achieve very good accuracy while preserving the privacy of the user's sensor data (behavioral biometrics). For this use case, a key contribution is showing that the detection accuracy of a Recurrent Neural Network (RNN) deep learning model can be significantly improved by comparing prediction error distributions. This requires generating and comparing empirical probability distributions, which we show in an efficient hardware design. Another novel contribution is in the design of SID (Smartphone Impostor Detection), a minimalist hardware accelerator that can be integrated into future smartphones for efficient impostor detection for different scenarios. Our SID module can implement many common Machine Learning and Deep Learning algorithms. SID is also scalable in parallelism and performance and easy to program. We show an FPGA prototype of SID, which can provide more than enough performance for real-time impostor detection, with very low hardware complexity and power consumption (one to two orders of magnitude less than related performance-oriented FPGA accelerators). We also show that the FPGA implementation of SID consumes 64.41X less energy than an implementation using the CPU with a GPU.
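
The idea of comparing prediction error distributions mentioned in this abstract can be illustrated with a minimal sketch. It assumes a two-sample Kolmogorov-Smirnov statistic as the comparison and a fixed threshold of 0.5; the abstract does not state which comparison or threshold the authors use, and the names `ks_statistic` and `looks_like_impostor` are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], observed: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of two samples (two-sample KS statistic)."""
    if not reference or not observed:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    obs = sorted(observed)
    # The empirical CDFs only change at sample points, so checking those is enough.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(obs, x) / len(obs))
        for x in ref + obs
    )


def looks_like_impostor(
    reference_errors: Sequence[float],
    recent_errors: Sequence[float],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> bool:
    """Flag a window whose prediction errors are distributed unlike the reference."""
    return ks_statistic(reference_errors, recent_errors) > threshold
```

Here `reference_errors` stands for prediction errors collected while the legitimate user holds the phone, and `recent_errors` for errors from the most recent sensor window.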

14.7 CR Aug 9, 2018

VerIDeep: Verifying Integrity of Deep Neural Networks through Sensitive-Sample Fingerprinting

Zecheng He, Tianwei Zhang, Ruby B. Lee

Deep learning has become popular, and numerous cloud-based services are provided to help customers develop and deploy deep learning applications. Meanwhile, various attack techniques have also been discovered to stealthily compromise the model's integrity. When a cloud customer deploys a deep learning model in the cloud and serves it to end-users, it is important for him to be able to verify that the deployed model has not been tampered with, and the model's integrity is protected. We propose a new low-cost and self-served methodology for customers to verify that the model deployed in the cloud is intact, while having only black-box access (e.g., via APIs) to the deployed model. Customers can detect arbitrary changes to their deep learning models. Specifically, we define Sensitive-Sample fingerprints, which are a small set of transformed inputs that make the model outputs sensitive to the model's parameters. Even small weight changes can be clearly reflected in the model outputs, and observed by the customer. Our experiments on different types of model integrity attacks show that we can detect model integrity breaches with high accuracy (>99%) and low overhead (<10 black-box model accesses).

26.0 CR Jul 5, 2018

Privacy-preserving Machine Learning through Data Obfuscation

Tianwei Zhang, Zecheng He, Ruby B. Lee

As machine learning becomes a practice and commodity, numerous cloud-based services and frameworks are provided to help customers develop and deploy machine learning applications. While it is prevalent to outsource model training and serving tasks in the cloud, it is important to protect the privacy of sensitive samples in the training dataset and prevent information leakage to untrusted third parties. Past work has shown that a malicious machine learning service provider or end user can easily extract critical information about the training samples, from the model parameters or even just model outputs. In this paper, we propose a novel and generic methodology to preserve the privacy of training data in machine learning applications. Specifically, we introduce an obfuscate function and apply it to the training data before feeding them to the model training task. This function adds random noise to existing samples, or augments the dataset with new samples. By doing so, sensitive information about the properties of individual samples, or statistical properties of a group of samples, is hidden. Meanwhile, the model trained from the obfuscated dataset can still achieve high accuracy. With this approach, the customers can safely disclose the data or models to third-party providers or end users without the need to worry about data privacy. Our experiments show that this approach can effectively defeat four existing types of machine learning privacy attacks at negligible accuracy cost.
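
The two obfuscation modes named in this abstract (adding random noise to existing samples, and augmenting the dataset with new samples) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' obfuscate function: the noise is assumed to be zero-mean Gaussian, the new samples are assumed to be noisy copies of randomly chosen existing ones, and the names `add_noise` and `augment` are hypothetical.

```python
import random
from typing import List

Sample = List[float]


def add_noise(samples: List[Sample], sigma: float, rng: random.Random) -> List[Sample]:
    """Return a copy of the dataset with Gaussian noise added to every feature."""
    return [[value + rng.gauss(0.0, sigma) for value in sample] for sample in samples]


def augment(samples: List[Sample], extra: int, sigma: float, rng: random.Random) -> List[Sample]:
    """Append `extra` synthetic samples to the dataset.

    Assumption: each synthetic sample is a noisy copy of a randomly chosen
    original; the abstract does not say how new samples are generated.
    """
    synthetic = [
        [value + rng.gauss(0.0, sigma) for value in rng.choice(samples)]
        for _ in range(extra)
    ]
    return samples + synthetic
```

Passing an explicit `random.Random` instance keeps the obfuscation reproducible, which is convenient when comparing model accuracy before and after obfuscation.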

2.3 CR Jul 5, 2018

Practical and Scalable Security Verification of Secure Architectures

Jakub Szefer, Tianwei Zhang, Ruby B. Lee

We present a new and practical framework for security verification of secure architectures. Specifically, we break the verification task into external verification and internal verification. External verification considers the external protocols, i.e., interactions between users, compute servers, network entities, etc. Meanwhile, internal verification considers the interactions between hardware and software components within each server. This verification framework is general-purpose and can be applied to a stand-alone server, or a large-scale distributed system. We evaluate our verification method on the CloudMonatt and HyperWall architectures as examples.

12.2 LG Jan 16, 2018

Time Series Segmentation through Automatic Feature Learning

Wei-Han Lee, Jorge Ortiz, Bongjun Ko et al.

Internet of things (IoT) applications have become increasingly popular in recent years, with applications ranging from building energy monitoring to personal health tracking and activity recognition. In order to leverage these data, automatic knowledge extraction - whereby we map from observations to interpretable states and transitions - must be done at scale. As such, we have seen many recent IoT data sets include annotations with a human expert specifying states, recorded as a set of boundaries and associated labels in a data sequence. These data can be used to build automatic labeling algorithms that produce labels as an expert would. Here, we refer to human-specified boundaries as breakpoints. Traditional changepoint detection methods only look for statistically-detectable boundaries that are defined as abrupt variations in the generative parameters of a data sequence. However, we observe that breakpoints occur on more subtle boundaries that are non-trivial to detect with these statistical methods. In this work, we propose a new unsupervised approach, based on deep learning, that outperforms existing techniques and learns the more subtle, breakpoint boundaries with a high accuracy. Through extensive experiments on various real-world data sets - including human-activity sensing data, speech signals, and electroencephalogram (EEG) activity traces - we demonstrate the effectiveness of our algorithm for practical applications. Furthermore, we show that our approach achieves significantly better performance than previous methods.

20.5 CR Aug 30, 2017

Implicit Smartphone User Authentication with Sensors and Contextual Machine Learning

Wei-Han Lee, Ruby B. Lee

Authentication of smartphone users is important because a lot of sensitive data is stored in the smartphone and the smartphone is also used to access various cloud data and services. However, smartphones are easily stolen or co-opted by an attacker. Beyond the initial login, it is highly desirable to re-authenticate end-users who are continuing to access security-critical services and data. Hence, this paper proposes a novel authentication system for implicit, continuous authentication of the smartphone user based on behavioral characteristics, by leveraging the sensors already ubiquitously built into smartphones. We propose novel context-based authentication models to differentiate the legitimate smartphone owner from other users. We systematically show how to achieve high authentication accuracy with different design alternatives in sensor and feature selection, machine learning techniques, context detection and multiple devices. Our system can achieve excellent authentication performance with 98.1% accuracy with negligible system overhead and less than 2.4% battery consumption.

9.1 CR Aug 30, 2017

Secure Pick Up: Implicit Authentication When You Start Using the Smartphone

Wei-Han Lee, Xiaochen Liu, Yilin Shen et al.

We propose Secure Pick Up (SPU), a convenient, lightweight, in-device, non-intrusive and automatic-learning system for smartphone user authentication. Operating in the background, our system implicitly observes users' phone pick-up movements, the way they bend their arms when they pick up a smartphone to interact with the device, to authenticate the users. Our SPU outperforms the state-of-the-art implicit authentication mechanisms in three main aspects: 1) SPU automatically learns the user's behavioral pattern without requiring a large amount of training data (especially those of other users) as previous methods did, making it more deployable. Towards this end, we propose a weighted multi-dimensional Dynamic Time Warping (DTW) algorithm to effectively quantify similarities between users' pick-up movements; 2) SPU does not rely on a remote server for providing further computational power, making SPU efficient and usable even without network access; and 3) our system can adaptively update a user's authentication model to accommodate the user's behavioral drift over time with negligible overhead. Through extensive experiments on real-world datasets, we demonstrate that SPU can achieve authentication accuracy up to 96.3% with a very low latency of 2.4 milliseconds. It reduces the number of times a user has to do explicit authentication by 32.9%, while effectively defending against various attacks.
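
As a rough illustration of the weighted multi-dimensional DTW mentioned in this abstract, the sketch below runs classic DTW independently on each sensor dimension and combines the per-dimension distances with a weighted sum. This combination rule, the absolute-difference local cost, and the names `dtw_1d` and `weighted_dtw` are assumptions made for the example; the abstract does not describe how SPU weights or combines dimensions.

```python
from typing import List, Sequence


def dtw_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW distance between two 1-D series."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def weighted_dtw(
    x: List[Sequence[float]],
    y: List[Sequence[float]],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-dimension DTW distances.

    x and y each hold one series per sensor dimension (for example, one per
    accelerometer axis); weights holds one weight per dimension.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have one entry per dimension")
    return sum(w * dtw_1d(xs, ys) for w, xs, ys in zip(weights, x, y))
```

A smaller distance between a new pick-up movement and a stored template would indicate a closer match; any acceptance threshold would have to be chosen separately.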

3.3 DC Mar 10, 2016

Memory DoS Attacks in Multi-tenant Clouds: Severity and Mitigation

Tianwei Zhang, Yinqian Zhang, Ruby B. Lee

In cloud computing, network Denial of Service (DoS) attacks are well studied and defenses have been implemented, but severe DoS attacks on a victim's working memory by a single hostile VM are not well understood. Memory DoS attacks are Denial of Service (or Degradation of Service) attacks caused by contention for hardware memory resources on a cloud server. Despite the strong memory isolation techniques for virtual machines (VMs) enforced by the software virtualization layer in cloud servers, the underlying hardware memory layers are still shared by the VMs and can be exploited by a clever attacker in a hostile VM co-located on the same server as the victim VM, denying the victim the working memory he needs. We first show quantitatively the severity of contention on different memory resources. We then show that a malicious cloud customer can mount low-cost attacks to cause severe performance degradation for a Hadoop distributed application, and 38X delay in response time for an E-commerce website in the Amazon EC2 cloud. Then, we design an effective, new defense against these memory DoS attacks, using a statistical metric to detect their existence and execution throttling to mitigate the attack damage. We achieve this by a novel re-purposing of existing hardware performance counters and duty cycle modulation for security, rather than for improving performance or power consumption. We implement a full prototype on the OpenStack cloud system. Our evaluations show that this defense system can effectively defeat memory DoS attacks with negligible performance overhead.