CL Dec 22, 2020
ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces
Zecheng He, Srinivas Sunkara, Xiaoxue Zang et al.
As mobile devices become ubiquitous, regularly interacting with a variety of user interfaces (UIs) is a common aspect of daily life for many people. To improve the accessibility of these devices and to enable their usage in a variety of settings, building models that can assist users and accomplish tasks through the UI is vitally important. However, achieving this poses several challenges. First, UI components of similar appearance can have different functionalities, making understanding their function more important than just analyzing their appearance. Second, domain-specific features like the Document Object Model (DOM) in web pages and the View Hierarchy (VH) in mobile applications provide important signals about the semantics of UI elements, but these features are not in a natural language format. Third, owing to the large diversity of UIs and the absence of standard DOM or VH representations, building a UI understanding model with high coverage requires large amounts of training data. Inspired by the success of pre-training based approaches in NLP for tackling a variety of problems in a data-efficient way, we introduce a new pre-trained UI representation model called ActionBert. Our methodology is designed to leverage visual, linguistic and domain-specific features in user interaction traces to pre-train generic feature representations of UIs and their components. Our key intuition is that user actions, e.g., a sequence of clicks on different UI components, reveal important information about their functionality. We evaluate the proposed model on a wide variety of downstream tasks, ranging from icon classification to UI component retrieval based on its natural language description. Experiments show that the proposed ActionBert model outperforms multi-modal baselines across all downstream tasks by up to 15.5%.
CR Sep 17, 2020
New Models for Understanding and Reasoning about Speculative Execution Attacks
Zecheng He, Guangyuan Hu, Ruby Lee
Spectre and Meltdown attacks and their variants exploit hardware performance optimization features to cause security breaches. Secret information is accessed and leaked through covert or side channels. New attack variants keep appearing and we do not have a systematic way to capture the critical characteristics of these attacks and evaluate why they succeed or fail. In this paper, we provide a new attack-graph model for reasoning about speculative execution attacks. We model attacks as ordered dependency graphs, and prove that a race condition between two nodes can occur if there is a missing dependency edge between them. We define a new concept, "security dependency", between a resource access and its prior authorization operation. We show that a missing security dependency is equivalent to a race condition between authorization and access, which is a root cause of speculative execution attacks. We show detailed examples of how our attack graph models the Spectre and Meltdown attacks, and is generalizable to all the attack variants published so far. This attack model is also very useful for identifying new attacks and for generalizing defense strategies. We identify several defense strategies with different performance-security tradeoffs. We show that the defenses proposed so far all fit under one of our defense strategies. We also explain how attack graphs can be constructed and point to this as promising future work for tool designers.
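The core claim of the abstract above, that two operations can race when no dependency orders them, can be illustrated with a small reachability check. This is only a minimal sketch under stated assumptions, not the paper's attack-graph model: the node names (`branch_predict`, `resolve_bounds_check`, `load_secret`, `transmit_via_cache`) and the helper functions `reachable` and `can_race` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def reachable(edges, src, dst):
    """Return True if dst can be reached from src along dependency edges."""
    graph = defaultdict(list)
    for before, after in edges:
        graph[before].append(after)
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


def can_race(edges, u, v):
    """Two operations may race when neither is ordered before the other."""
    return not reachable(edges, u, v) and not reachable(edges, v, u)


# Hypothetical Spectre-like graph: the secret load depends only on the
# branch prediction, not on the authorization (the bounds check resolving).
spectre_like = [
    ("branch_predict", "load_secret"),
    ("load_secret", "transmit_via_cache"),
    ("branch_predict", "resolve_bounds_check"),
]
print(can_race(spectre_like, "resolve_bounds_check", "load_secret"))  # True

# Adding the missing security dependency orders authorization before access.
defended = spectre_like + [("resolve_bounds_check", "load_secret")]
print(can_race(defended, "resolve_bounds_check", "load_secret"))  # False
```

In this toy graph, the added edge plays the role of the "security dependency" between a resource access and its prior authorization; a defense in this sketch amounts to enforcing that edge.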
CR Feb 10, 2020
Smartphone Impostor Detection with Built-in Sensors and Deep Learning
Guangyuan Hu, Zecheng He, Ruby Lee
In this paper, we show that sensor-based impostor detection with deep learning can achieve excellent impostor detection accuracy at lower hardware cost compared to past work on sensor-based user authentication (the inverse problem), which used more conventional machine learning algorithms. While these methods use other smartphone users' sensor data to build the (user, non-user) classification models, we go further to show that using only the legitimate user's sensor data can still achieve very good accuracy while preserving the privacy of the user's sensor data (behavioral biometrics). For this use case, a key contribution is showing that the detection accuracy of a Recurrent Neural Network (RNN) deep learning model can be significantly improved by comparing prediction error distributions. This requires generating and comparing empirical probability distributions, which we demonstrate in an efficient hardware design. Another novel contribution is the design of SID (Smartphone Impostor Detection), a minimalist hardware accelerator that can be integrated into future smartphones for efficient impostor detection in different scenarios. Our SID module can implement many common Machine Learning and Deep Learning algorithms. SID is also scalable in parallelism and performance and easy to program. We show an FPGA prototype of SID, which can provide more than enough performance for real-time impostor detection, with very low hardware complexity and power consumption (one to two orders of magnitude less than related performance-oriented FPGA accelerators). We also show that the FPGA implementation of SID consumes 64.41X less energy than an implementation using the CPU with a GPU.
CR Jun 18, 2018
Power-Grid Controller Anomaly Detection with Enhanced Temporal Deep Learning
Zecheng He, Aswin Raghavan, Guangyuan Hu et al.
Controllers of security-critical cyber-physical systems, like the power grid, are a very important class of computer systems. Attacks against the control code of a power-grid system, especially zero-day attacks, can be catastrophic. Earlier detection of the anomalies can prevent further damage. However, detecting zero-day attacks is extremely challenging because they have no known code and have unknown behavior. Furthermore, if data collected from the controller is transferred to a server through networks for analysis and detection of anomalous behavior, this creates a very large attack surface and also delays detection. In order to address this problem, we propose Reconstruction Error Distribution (RED) of Hardware Performance Counters (HPCs), and a data-driven defense system based on it. Specifically, we first train a temporal deep learning model, using only normal HPC readings from legitimate processes that run daily in these power-grid systems, to model the normal behavior of the power-grid controller. Then, we run this model using real-time data from commonly available HPCs. We use the proposed RED to enhance the temporal deep learning detection of anomalous behavior, by estimating distribution deviations from the normal behavior with an effective statistical test. Experimental results on a real power-grid controller show that we can detect anomalous behavior with high accuracy (>99.9%), nearly zero false positives and short (<360ms) latency.
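The idea of comparing the distribution of reconstruction errors against a reference built from normal behavior can be sketched as follows. The abstract does not name the statistical test, so this sketch assumes a two-sample Kolmogorov-Smirnov test with the standard asymptotic critical value; the function names, the significance level, and the synthetic error samples are all hypothetical stand-ins for errors produced by a trained temporal model on HPC readings.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def is_anomalous(reference_errors, window_errors, alpha=0.001):
    """Flag a window whose error distribution deviates from the reference.

    Uses the asymptotic KS critical value; alpha is an assumed setting.
    """
    n, m = len(reference_errors), len(window_errors)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference_errors, window_errors) > critical


# Synthetic stand-ins for reconstruction errors (assumption: in practice
# these would come from the model's outputs on real counter readings).
rng = random.Random(0)
reference = [abs(rng.gauss(0.0, 0.1)) for _ in range(500)]
normal_window = [abs(rng.gauss(0.0, 0.1)) for _ in range(500)]
attack_window = [abs(rng.gauss(0.0, 0.3)) for _ in range(500)]

print("normal window flagged:", is_anomalous(reference, normal_window))
print("attack window flagged:", is_anomalous(reference, attack_window))
```

Testing a whole window of errors against a reference distribution, rather than thresholding each error individually, is the design choice this sketch is meant to convey; the specific test and threshold used in the paper may differ.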
LG Jan 16, 2018
Time Series Segmentation through Automatic Feature Learning
Wei-Han Lee, Jorge Ortiz, Bongjun Ko et al.
Internet of Things (IoT) applications have become increasingly popular in recent years, with uses ranging from building energy monitoring to personal health tracking and activity recognition. In order to leverage these data, automatic knowledge extraction - whereby we map from observations to interpretable states and transitions - must be done at scale. As such, many recent IoT data sets include annotations in which a human expert specifies states, recorded as a set of boundaries and associated labels in a data sequence. These data can be used to build automatic labeling algorithms that produce labels as an expert would. Here, we refer to human-specified boundaries as breakpoints. Traditional changepoint detection methods only look for statistically-detectable boundaries that are defined as abrupt variations in the generative parameters of a data sequence. However, we observe that breakpoints occur on more subtle boundaries that are non-trivial to detect with these statistical methods. In this work, we propose a new unsupervised approach, based on deep learning, that outperforms existing techniques and learns the more subtle breakpoint boundaries with high accuracy. Through extensive experiments on various real-world data sets - including human-activity sensing data, speech signals, and electroencephalogram (EEG) activity traces - we demonstrate the effectiveness of our algorithm for practical applications. Furthermore, we show that our approach achieves significantly better performance than previous methods.
CR Mar 10, 2017
Implicit Sensor-based Authentication of Smartphone Users with Smartwatch
Wei-Han Lee, Ruby Lee
Smartphones are now frequently used by end-users as the portals to cloud-based services, and smartphones are easily stolen or co-opted by an attacker. Beyond the initial log-in mechanism, it is highly desirable to re-authenticate end-users who are continuing to access security-critical services and data, whether in the cloud or in the smartphone. But attackers who have gained access to a logged-in smartphone have no incentive to re-authenticate, so this must be done in an automatic, non-bypassable way. Hence, this paper proposes a novel authentication system, iAuth, for implicit, continuous authentication of the end-user based on his or her behavioral characteristics, by leveraging the sensors already ubiquitously built into smartphones. We design a system that gives accurate authentication using machine learning and sensor data from multiple mobile devices. Our system can achieve 92.1% authentication accuracy with negligible system overhead and less than 2% battery consumption.
CR Mar 9, 2017
Multi-sensor authentication to improve smartphone security
Wei-Han Lee, Ruby Lee
The widespread use of smartphones gives rise to new security and privacy concerns. Smartphone thefts account for the largest percentage of thefts in recent crime statistics. Using a victim's smartphone, the attacker can launch impersonation attacks, which threaten the security of the victim and other users in the network. Our threat model includes the attacker taking over the phone after the user has logged on with a password or PIN. Our goal is to design a mechanism for smartphones to better authenticate the current user, continuously and implicitly, and raise alerts when necessary. In this paper, we propose a multi-sensor-based system to achieve continuous and implicit authentication for smartphone users. The system continuously learns the owner's behavior patterns and environment characteristics, and then authenticates the current user without interrupting user-smartphone interactions. Our method can adaptively update a user's model to account for temporal changes in the user's patterns. Experimental results show that our method is efficient, requiring less than 10 seconds to train the model and 20 seconds to detect the abnormal user, while achieving high accuracy (more than 90%). Combining more sensors also provides better accuracy. Furthermore, our method enables adjusting the security level by changing the sampling rate.