Xia Yin

h-index: 5

Papers: 5

Citations: 438

Novelty: 54%

AI Score: 46

Ranked #63,642 of 201,326 authors (top 32%); #1,417 in CR (top 19%)

5 Papers

NI · Mar 31

TORCH: Characterizing Invalid Route Filtering via Tunnelled Observation

Renrui Tian, Yahui Li, Xia Yin et al.

To mitigate BGP prefix hijacking, the Resource Public Key Infrastructure (RPKI) provides prefix origin authentication via Route Origin Validation (ROV). Despite extensive measurement efforts in IPv4, the protective impact of ROV in IPv6 has yet to be systematically assessed. Existing approaches suffer from limited observability into invalid route propagation: they often rely on a small set of controlled prefixes or cannot fully profile the filtering of in-the-wild RPKI-invalid routes, which undermines the accuracy of assessment. Furthermore, the inherent opacity of the IPv6 data plane exacerbates the difficulty of performing scalable and reliable active measurements. In this paper, we present TORCH, a novel framework for measuring invalid route filtering in IPv6. It repurposes open 6in4 tunnel endpoints as widely distributed vantage points for global measurement. At its core, we develop a cross-plane inference technique that determines reachability without requiring responsive targets. This method allows us to characterize whether and how traffic is steered to invalid origins across diverse routing scenarios, leading to an in-depth evaluation of the real-world impact of ROV. Our measurements reveal that about 27% of ASes have achieved nearly full ROV protection. However, several permissive Tier-1 ASes still transit traffic towards invalid origins, maintaining a substantial attack surface. Through a prefix-centric analysis, we provide the first empirical evidence that the collateral damage of same-length prefix filtering can affect a significant fraction of the global Internet. Our findings pinpoint fundamental vulnerabilities in ROV deployment and underscore the urgent necessity for network operators to accelerate RPKI adoption. We make our datasets publicly available.
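For readers unfamiliar with the "RPKI-invalid" classification the abstract relies on, the following is a minimal sketch of route origin validation in the style of RFC 6811, not TORCH's measurement code. The `VRP` type, the `validate_origin` function, and the sample prefixes and AS numbers (documentation ranges) are hypothetical names chosen for illustration.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """Validated ROA Payload: a prefix, its maximum length, and an origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate_origin(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # A VRP covers the route if the route lies inside the VRP prefix.
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # A covering VRP matches if the length and origin AS both agree.
        if route.prefixlen <= vrp.max_length and vrp.asn == origin_asn:
            return "valid"
    # Covered but never matched means invalid; no covering VRP means not-found.
    return "invalid" if covered else "not-found"


# Hypothetical sample data using documentation prefixes and AS numbers.
sample_vrps = [VRP("2001:db8::/32", 48, 64500)]
print(validate_origin("2001:db8:1::/48", 64500, sample_vrps))
print(validate_origin("2001:db8:1::/48", 64501, sample_vrps))
```

An AS that enforces ROV drops routes classified as invalid, which is the filtering behaviour the paper measures.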

LG · Jan 28

Minimum-Cost Network Flow with Dual Predictions

Zhiyang Chen, Hailong Yao, Xia Yin

Recent work has shown that machine-learned predictions can provably improve the performance of classic algorithms. In this work, we propose the first minimum-cost network flow algorithm augmented with a dual prediction. Our method is based on a classic minimum-cost flow algorithm, namely $\varepsilon$-relaxation. We provide time complexity bounds in terms of the infinity-norm prediction error, and the resulting algorithm is both consistent and robust. We also prove sample complexity bounds for PAC-learning the prediction. We empirically validate our theoretical results on two applications of minimum-cost flow, namely traffic networks and chip escape routing, in which we learn a fixed prediction and a feature-based neural network model that infers the prediction, respectively. Experimental results show $12.74\times$ and $1.64\times$ average speedups on the two applications.
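To make the two quantities in this abstract concrete, here is a minimal sketch of the textbook $\varepsilon$-complementary slackness condition that $\varepsilon$-relaxation maintains, together with the infinity-norm error of a predicted dual. It follows the standard formulation with prices $p$ and tension $p_i - p_j$; whether the paper uses exactly this convention is an assumption, and the function names and the one-arc example are hypothetical.

```python
from typing import Dict, Tuple

Arc = Tuple[str, str]


def satisfies_eps_cs(
    cost: Dict[Arc, float],
    lower: Dict[Arc, float],
    upper: Dict[Arc, float],
    flow: Dict[Arc, float],
    price: Dict[str, float],
    eps: float,
) -> bool:
    """Check epsilon-complementary slackness for a flow-price pair."""
    for (i, j), c in cost.items():
        x = flow[(i, j)]
        # The flow must respect the arc's capacity bounds.
        if not lower[(i, j)] <= x <= upper[(i, j)]:
            return False
        tension = price[i] - price[j]
        # An arc with spare capacity may not be too attractive.
        if x < upper[(i, j)] and tension > c + eps:
            return False
        # An arc carrying flow above its lower bound may not be too costly.
        if x > lower[(i, j)] and tension < c - eps:
            return False
    return True


def linf_error(predicted: Dict[str, float], optimal: Dict[str, float]) -> float:
    """Infinity-norm distance between a predicted and an optimal dual."""
    return max(abs(predicted[v] - optimal[v]) for v in optimal)


# Hypothetical one-arc network used only for illustration.
cost = {("s", "t"): 3.0}
lower = {("s", "t"): 0.0}
upper = {("s", "t"): 2.0}
flow = {("s", "t"): 1.0}
print(satisfies_eps_cs(cost, lower, upper, flow, {"s": 3.0, "t": 0.0}, 0.5))
print(linf_error({"s": 5.0, "t": 0.0}, {"s": 3.0, "t": 0.0}))
```

Intuitively, a predicted dual that is close to optimal in this norm gives the algorithm a starting point that needs fewer price adjustments, which is the sense in which the abstract's bounds depend on the prediction error.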

CR · Nov 8, 2021

threaTrace: Detecting and Tracing Host-based Threats in Node Level Through Provenance Graph Learning

Su Wang, Zhiliang Wang, Tao Zhou et al.

Host-based threats, such as Program Attack, Malware Implantation, and Advanced Persistent Threats (APT), are commonly adopted by modern attackers. Recent studies propose leveraging the rich contextual information in data provenance to detect threats in a host. Data provenance is a directed acyclic graph constructed from system audit data. Nodes in a provenance graph represent system entities (e.g., processes and files) and edges represent system calls in the direction of information flow. However, previous studies, which extract features of the whole provenance graph, are not sensitive to the small number of threat-related entities and thus perform poorly when hunting stealthy threats. We present threaTrace, an anomaly-based detector that detects host-based threats at the system entity level without prior knowledge of attack patterns. We tailor GraphSAGE, an inductive graph neural network, to learn every benign entity's role in a provenance graph. threaTrace is a real-time system that scales to monitoring a long-running host and is capable of detecting host-based intrusions in their early phase. We evaluate threaTrace on three public datasets. The results show that threaTrace outperforms three state-of-the-art host intrusion detection systems.

CR · Sep 23, 2021

DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications

Dongqi Han, Zhiliang Wang, Wenqi Chen et al.

Unsupervised Deep Learning (DL) techniques have been widely used in various security-related anomaly detection applications, owing to their promise of detecting unforeseen threats and the superior performance provided by Deep Neural Networks (DNN). However, the lack of interpretability creates key barriers to the adoption of DL models in practice. Unfortunately, existing interpretation approaches are designed for supervised learning models and/or non-security domains; they cannot be adapted to unsupervised DL models and fail to satisfy the special requirements of security domains. In this paper, we propose DeepAID, a general framework aiming to (1) interpret DL-based anomaly detection systems in security domains, and (2) improve the practicality of these systems based on the interpretations. We first propose a novel interpretation method for unsupervised DNNs by formulating and solving well-designed optimization problems with special constraints for security domains. Then, we provide several applications based on our Interpreter, as well as a model-based extension, Distiller, to improve security systems by solving domain-specific problems. We apply DeepAID to three types of security-related anomaly detection systems and extensively evaluate our Interpreter against representative prior work. Experimental results show that DeepAID can provide high-quality interpretations for unsupervised DL models while meeting the special requirements of security domains. We also provide several use cases to show that DeepAID can help security operators understand model decisions, diagnose system mistakes, give feedback to models, and reduce false positives.

CR · May 15, 2020

Evaluating and Improving Adversarial Robustness of Machine Learning-Based Network Intrusion Detectors

Dongqi Han, Zhiliang Wang, Ying Zhong et al.

Machine learning (ML), especially deep learning (DL), techniques have been increasingly used in anomaly-based network intrusion detection systems (NIDS). However, ML/DL has been shown to be extremely vulnerable to adversarial attacks, especially in such security-sensitive systems. Many adversarial attacks have been proposed to evaluate the robustness of ML-based NIDSs. Unfortunately, existing attacks mostly focus on feature-space and/or white-box attacks, which make assumptions that are impractical in real-world scenarios, leaving the study of practical gray/black-box attacks largely unexplored. To bridge this gap, we conduct the first systematic study of gray/black-box traffic-space adversarial attacks to evaluate the robustness of ML-based NIDSs. Our work outperforms previous ones in the following aspects: (i) practical: the proposed attack can automatically mutate original traffic with extremely limited knowledge and affordable overhead while preserving its functionality; (ii) generic: the proposed attack is effective for evaluating the robustness of various NIDSs using diverse ML/DL models and non-payload-based features; (iii) explainable: we propose an explanation method for the fragile robustness of ML-based NIDSs. Based on this, we also propose a defense scheme against adversarial attacks to improve system robustness. We extensively evaluate the robustness of various NIDSs using diverse feature sets and ML/DL models. Experimental results show that our attack is effective (e.g., >97% evasion rate in half of the cases for Kitsune, a state-of-the-art NIDS) with affordable execution cost, and that the proposed defense method can effectively mitigate such attacks (the evasion rate is reduced by >50% in most cases).